Skip to content

Commit

Permalink
Add v1.1.47.dev (#117)
Browse files Browse the repository at this point in the history
* Add v1.1.47.dev

* Add multi-response extension

* Add reshaping of y to C-order

* Add reshaping of C-order

* Add convex_relu matrix with full tests

* Simplify prereq omp instruction

* Add gcc fixes

* Add a message about performance for constraint.linear

* Add correct coercing function that doesn't copy when not necessary

* Add default nnls_tol to 1e-14 based on simulation study, add solve_zero to constraints, save all dual variables (including non-active), parallelize abs_grad computation.

* Add solve_zero to trampoline class

* Refactor dual saving to outside pin methods so that diagnostic works properly, separate nnls from hinge, add solve_zero optimized and corrected for linear constraint

* Add gcc fixes

* Add fixes to pin states to not pass dual_groups, add paranthesis in ternary, update zero_constraint with solve_zero,

* Move clearing of constraint objects to python level since state creation will modify them so clearing them afterwards in the C++ solver will undo some of the work.

* Add it to cov method

* Add glm's _coerce_dtype again to constraint since enforcement of "C" ordering makes sense, add comment about this ordering for linear constraint.

* Add gcc fixes

* Remove bcd constrained proximal, change NNLS API so that it removes the need for KKT check, refactor constraint linear, warm-start min_mu_resid with SVD solution, prune within tolerance 1e-16, dynamic omp for abs_grad update

* Add max batch size as default, add sparse option for convex_relu

* Add gcc fixes

* Add gcc fixes

* Remove svd warm-start and solve NNLS mostly for active groups

* Change sparse matrices cov method to use dynamic omp scheduler

* Add current work

* Change sp_btmul to sp_tmul for consistency

* Add constraint base and dense matrices, generalize ad.constraint.linear (missing sparse), WIP sparse constraint matrix, update NNLS and HingeLowRank to support general constraint-like matrices, update test.

* Change default hinge_max_iters and hinge_tol values

* Add sparse constraint matrix, remove svd comments, add sparse constraint matrix test

* Add optimization to omit dual saving if 0 nnz

* Add static abs_grad update if no non-trivial constraint exists

* Add test code for nnls path solver, and constraint linear with no debug

* Resolve tolerance argument to GLM fit function for multi and single response as well as Gaussian naive (finally!)

* Add current BVLS solution

* Add gcc fix

* Add see also page for bvls solver

* Change default tol for bvls, add tests

* Remove svd, cs_tol; integrate new NNLS; add sorted violations in HingeLowRank; add more rigorous testing for constraints loose, tight bounds

* Add default nnls max iters to 1e6

* Remove unnecessary nnqp_low_rank, make constraint matrices single threaded, and remove TODO for ADMM

* Add hinge variance scaler for better convergence metric.

* Change hinge_tol to 1e-7 to be safe

* Add numerical stability to variance prox newton

* Add thread safe exception handling, clean up solve_zero, abstract out 1e-16 pruning constant, active_vars is capped below at 0 and solver checks vk now

* Integrate pinball into linear constraint solver

* Remove all kkt_tol since it makes the fit pretty bad, remove unnecessary data for each solver accordingly, make bvls and pinball use hard thresholding of gradient

* Remove A_vars and make pinball completely general

* Add docs fixes

* Add fixes to constraint.py

* Add fixes to solver.py

* Add fixes

* Add fix to comments in nnls

* Add fix to constraint matrix docs

* Scikit-style learner  (#123)

* scikit-style class

* move kwargs to fit method for sklearn compat

* add logit example

* predict method updates for classifier glms

* Add sklearn API docs

* Update sklearn api docs, implementation detail, and revamp the notebook

* Add v1.1.47, remove 1 line

* Add gcc fixes

* Add missing gcc fix

* Add a possible msvc fix

* Add msvc fix to capture as arguments

* Add parallel compilation only for non-windows

* Remove parallel build check to prep for refactoring templates

* Remove unnecessary bcd tools and correctly name matrix_utils_blas.cpp

* Add thread restriction

* Try again with msvc flags

* Add back windows parallel switch

* Try removing GL

* Try manually setting MP to 1

* Add ccache to environment.yml, add src directory for core instantiations, add windows parallel build, refactor io and matrix with source files

* Change decl.hpp to py_decl.hpp

* Add py_io, py_matrix, refactor matrix completely so clean omg

* Refactor io completely

* Refactor glm completely

* Refactor constraint completely

* Refactor (trivial) bcd, configs, optimization; everything is inlined that there is no need to split into hpp/ipp and do explicit instantiations

* Reformat template parameters

* Remove macros.hpp wherever not needed, reformat TPs, refactor state and solver (core) completely

* Add a fix to use of Eigen::Map<SparseVector>, integrate solve() as a member fn for each state, completely refactor state

* Refactor iniitalize() to ipp and fix explicit instantiations for missing pin methods (internal solvers use safe_bool_t!)

* Remove test code in setup.py

* Export bvls and pinball, and add gaussian_cov tests

* Add cmath to glm_binomial for msvc

* Add explicit use math defines macro flag

* Disable error on 4996

* Dumb microsoft warning hider

---------

Co-authored-by: Apoorva Lal <lal.apoorva@gmail.com>
  • Loading branch information
JamesYang007 and apoorvalal authored Oct 19, 2024
1 parent 094144e commit 3f692e9
Show file tree
Hide file tree
Showing 213 changed files with 19,419 additions and 11,076 deletions.
9 changes: 7 additions & 2 deletions adelie/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,7 @@
# - Leda Guin
# - Ginnie Guin

__version__ = "1.1.46"
__version__ = "1.1.47"

# Set environment flags before loading adelie_core
import os
Expand Down Expand Up @@ -41,7 +41,12 @@
from .cv import (
cv_grpnet,
)
from .sklearn import (
GroupElasticNet
)
from .solver import (
gaussian_cov,
bvls,
gaussian_cov,
grpnet,
pinball,
)
141 changes: 70 additions & 71 deletions adelie/constraint.py
Original file line number Diff line number Diff line change
@@ -1,10 +1,16 @@
from . import adelie_core as core
from . import matrix
from .adelie_core.constraint import (
ConstraintBase32,
ConstraintBase64,
)
from .configs import Configs
from .glm import _coerce_dtype
from . import adelie_core as core
from .matrix import (
MatrixConstraintBase32,
MatrixConstraintBase64,
)
from scipy.sparse import csr_matrix
from typing import Union
import numpy as np

Expand Down Expand Up @@ -47,15 +53,12 @@ def box(
tol : float, optional
Convergence tolerance for proximal Newton.
Default is ``1e-9``.
nnls_max_iters : int, optional
Maximum number of non-negative least squares iterations.
pinball_max_iters : int, optional
Maximum number of coordinate descent iterations for the pinball least squares solver.
Default is ``int(1e5)``.
nnls_tol : float, optional
Maximum number of non-negative least squares iterations.
Default is ``1e-9``.
cs_tol : float, optional
Complementary slackness tolerance.
Default is ``1e-9``.
pinball_tol : float, optional
Convergence tolerance for the pinball least squares solver.
Default is ``1e-7``.
slack : float, optional
Slackness for backtracking when proximal Newton overshoots
the boundary where primal is zero.
Expand Down Expand Up @@ -85,8 +88,8 @@ def box(
See Also
--------
adelie.adelie_core.constraint.ConstraintBoxProximalNewton32
adelie.adelie_core.constraint.ConstraintBoxProximalNewton64
adelie.adelie_core.constraint.ConstraintBox32
adelie.adelie_core.constraint.ConstraintBox64
"""
lower, l_dtype = _coerce_dtype(lower, dtype)
upper, u_dtype = _coerce_dtype(upper, dtype)
Expand All @@ -98,8 +101,8 @@ def box(

core_base = {
"proximal-newton": {
np.float32: core.constraint.ConstraintBoxProximalNewton32,
np.float64: core.constraint.ConstraintBoxProximalNewton64,
np.float32: core.constraint.ConstraintBox32,
np.float64: core.constraint.ConstraintBox64,
},
}[method][dtype]

Expand All @@ -108,9 +111,8 @@ def box(
"proximal-newton": {
"max_iters": 100,
"tol": 1e-9,
"nnls_max_iters": int(1e5),
"nnls_tol": 1e-9,
"cs_tol": 1e-9,
"pinball_max_iters": int(1e5),
"pinball_tol": 1e-7,
"slack": 1e-4,
},
}[method]
Expand All @@ -133,11 +135,10 @@ def __init__(self):


def linear(
A: np.ndarray,
A: Union[np.ndarray, csr_matrix, MatrixConstraintBase32, MatrixConstraintBase64],
lower: np.ndarray,
upper: np.ndarray,
*,
svd: tuple =None,
vars: np.ndarray =None,
copy: bool =False,
method: str ="proximal-newton",
Expand All @@ -148,23 +149,22 @@ def linear(
The linear constraint is given by :math:`\\ell \\leq A x \\leq u`
where :math:`\\ell \\leq 0 \\leq u`.
This constraint object is intended to support general linear constraints.
If :math:`A` is structured (e.g. box constraint),
it is more efficient to use specialized constraint types.
Parameters
----------
A : (m, d) ndarray
A : (m, d) Union[ndarray, csr_matrix, MatrixConstraintBase32, MatrixConstraintBase64]
Constraint matrix :math:`A`.
lower : (m,) ndarray
Lower bound :math:`\\ell`.
upper : (m,) ndarray
Upper bound :math:`u`.
svd : tuple, optional
A tuple ``(u, d, vh)`` as outputted by :func:`numpy.linalg.svd`.
However, ``u`` must have ``"F"``-ordering.
If ``None``, it is computed internally.
Default is ``None``.
vars : ndarray, optional
Equivalent to ``np.sum(A ** 2, axis=1)``.
If ``None``, it is computed internally.
Equivalent to :math:`\\mathrm{diag}(AA^\\top)`.
If ``None`` and ``A`` is ``ndarray`` or ``csr_matrix``, it is computed internally.
Otherwise, it must be explicitly provided by the user.
Default is ``None``.
copy : bool, optional
If ``True``, a copy of the inputs are stored internally.
Expand All @@ -188,18 +188,18 @@ def linear(
tol : float, optional
Convergence tolerance for proximal Newton.
Default is ``1e-9``.
nnls_batch_size : int, optional
Batch size of active dual variables to include at a time.
Default is ``10``.
nnls_max_iters : int, optional
Maximum number of non-negative least squares iterations.
Maximum number of coordinate descent iterations for the non-negative least squares solver.
Default is ``int(1e5)``.
nnls_tol : float, optional
Maximum number of non-negative least squares iterations.
Default is ``1e-9``.
cs_tol : float, optional
Complementary slackness tolerance.
Default is ``1e-9``.
Convergence tolerance for the non-negative least squares solver.
Default is ``1e-7``.
pinball_max_iters : int, optional
Maximum number of coordinate descent iterations for the pinball least squares solver.
Default is ``int(1e5)``.
pinball_tol : float, optional
Convergence tolerance for the pinball least squares solver.
Default is ``1e-7``.
slack : float, optional
Slackness for backtracking when proximal Newton overshoots
the boundary where primal is zero.
Expand Down Expand Up @@ -232,31 +232,38 @@ def linear(
See Also
--------
adelie.adelie_core.constraint.ConstraintLinearProximalNewton32
adelie.adelie_core.constraint.ConstraintLinearProximalNewton64
adelie.adelie_core.constraint.ConstraintLinear32
adelie.adelie_core.constraint.ConstraintLinear64
"""
A, A_dtype = _coerce_dtype(A, dtype)
if isinstance(A, np.ndarray):
if vars is None:
vars = np.sum(A ** 2, axis=1)

A, _ = _coerce_dtype(A, dtype)
A = matrix.dense(A, method="constraint", copy=copy)
elif isinstance(A, csr_matrix):
if vars is None:
vars = (A ** 2).sum(axis=1)

A = matrix.sparse(A, method="constraint", copy=copy)
else:
assert not (vars is None)

A_dtype = np.float32 if "32" in A.__class__.__name__ else np.float64

lower, l_dtype = _coerce_dtype(lower, dtype)
upper, u_dtype = _coerce_dtype(upper, dtype)
assert A_dtype == l_dtype
assert A_dtype == u_dtype
dtype = A_dtype

if svd is None:
A_u, A_d, A_vh = np.linalg.svd(A, full_matrices=False, compute_uv=True)
A_u = np.asfortranarray(A_u)
svd = (A_u, A_d, A_vh)

if vars is None:
vars = np.sum(A ** 2, axis=1)

lower = np.minimum(-lower, Configs.max_solver_value)
upper = np.minimum(upper, Configs.max_solver_value)

core_base = {
"proximal-newton": {
np.float32: core.constraint.ConstraintLinearProximalNewton32,
np.float64: core.constraint.ConstraintLinearProximalNewton64,
np.float32: core.constraint.ConstraintLinear32,
np.float64: core.constraint.ConstraintLinear64,
},
}[method][dtype]

Expand All @@ -265,10 +272,10 @@ def linear(
"proximal-newton": {
"max_iters": 100,
"tol": 1e-9,
"nnls_batch_size": 10,
"nnls_max_iters": int(1e5),
"nnls_tol": 1e-9,
"cs_tol": 1e-9,
"nnls_tol": 1e-7,
"pinball_max_iters": int(1e5),
"pinball_tol": 1e-7,
"slack": 1e-4,
"n_threads": 1,
},
Expand All @@ -279,19 +286,15 @@ def linear(

class _linear(core_base):
def __init__(self):
self._A = np.array(A, copy=copy, dtype=dtype, order="C")
self._A = A
self._lower = np.array(lower, dtype=dtype)
self._upper = np.array(upper, dtype=dtype)
self._svd = tuple(np.array(x, copy=copy, dtype=dtype) for x in svd)
self._vars = np.array(vars, copy=copy, dtype=dtype)
core_base.__init__(
self,
A=self._A,
lower=self._lower,
upper=self._upper,
A_u=self._svd[0],
A_d=self._svd[1],
A_vh=self._svd[2],
A_vars=self._vars,
**configs,
)
Expand Down Expand Up @@ -371,15 +374,12 @@ def one_sided(
tol : float, optional
Convergence tolerance for proximal Newton.
Default is ``1e-9``.
nnls_max_iters : int, optional
Maximum number of non-negative least squares iterations.
pinball_max_iters : int, optional
Maximum number of coordinate descent iterations for the pinball least squares solver.
Default is ``int(1e5)``.
nnls_tol : float, optional
Maximum number of non-negative least squares iterations.
Default is ``1e-9``.
cs_tol : float, optional
Complementary slackness tolerance.
Default is ``1e-9``.
pinball_tol : float, optional
Convergence tolerance for the pinball least squares solver.
Default is ``1e-7``.
slack : float, optional
Slackness for backtracking when proximal Newton overshoots
the boundary where primal is zero.
Expand Down Expand Up @@ -425,16 +425,16 @@ def one_sided(
--------
adelie.adelie_core.constraint.ConstraintOneSidedADMM32
adelie.adelie_core.constraint.ConstraintOneSidedADMM64
adelie.adelie_core.constraint.ConstraintOneSidedProximalNewton32
adelie.adelie_core.constraint.ConstraintOneSidedProximalNewton64
adelie.adelie_core.constraint.ConstraintOneSided32
adelie.adelie_core.constraint.ConstraintOneSided64
"""
b, dtype = _coerce_dtype(b, dtype)
b = np.minimum(b, Configs.max_solver_value)

core_base = {
"proximal-newton": {
np.float32: core.constraint.ConstraintOneSidedProximalNewton32,
np.float64: core.constraint.ConstraintOneSidedProximalNewton64,
np.float32: core.constraint.ConstraintOneSided32,
np.float64: core.constraint.ConstraintOneSided64,
},
"admm": {
np.float32: core.constraint.ConstraintOneSidedADMM32,
Expand All @@ -447,9 +447,8 @@ def one_sided(
"proximal-newton": {
"max_iters": 100,
"tol": 1e-9,
"nnls_max_iters": int(1e5),
"nnls_tol": 1e-9,
"cs_tol": 1e-9,
"pinball_max_iters": int(1e5),
"pinball_tol": 1e-7,
"slack": 1e-4,
},
"admm": {
Expand Down
22 changes: 15 additions & 7 deletions adelie/cv.py
Original file line number Diff line number Diff line change
Expand Up @@ -75,17 +75,20 @@ def plot_loss(self):

def fit(
self,
*,
lmda_path_size: int =100,
X: Union[np.ndarray, MatrixNaiveBase32, MatrixNaiveBase64],
glm: Union[GlmBase32, GlmBase64, GlmMultiBase32, GlmMultiBase64],
**grpnet_params,
):
"""Fits group elastic net until the best CV :math:`\\lambda`.
Parameters
----------
lmda_path_size : int, optional
Number of regularizations in the path.
Default is ``100``.
X : (n, p) Union[ndarray, MatrixNaiveBase32, MatrixNaiveBase64]
Feature matrix.
It is typically one of the matrices defined in :mod:`adelie.matrix` submodule or :class:`numpy.ndarray`.
glm : Union[GlmBase32, GlmBase64, GlmMultiBase32, GlmMultiBase64]
GLM object.
It is typically one of the GLM classes defined in :mod:`adelie.glm` submodule.
**grpnet_params : optional
Parameters to :func:`adelie.solver.grpnet`.
Expand All @@ -101,18 +104,23 @@ def fit(
logger_level = logger.logger.level
logger.logger.setLevel(logger.logging.ERROR)
state = grpnet(
X=grpnet_params["X"],
glm=grpnet_params["glm"],
X=X,
glm=glm,
lmda_path_size=0,
progress_bar=False,
)
logger.logger.setLevel(logger_level)

lmda_path_size = 100
if "lmda_path_size" in grpnet_params:
lmda_path_size = grpnet_params["lmda_path_size"]
lmda_star = self.lmdas[self.best_idx]
full_lmdas = state.lmda_max * np.logspace(
0, np.log10(lmda_star / state.lmda_max), lmda_path_size
)
return grpnet(
X=X,
glm=glm,
lmda_path=full_lmdas,
early_exit=False,
**grpnet_params,
Expand Down
Loading

0 comments on commit 3f692e9

Please sign in to comment.