Twice differentiable LASSO #36
More on SCAD and what the penalty looks like:
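The figure originally attached here doesn't survive in text form. As a stand-in, here is a minimal sketch of the standard SCAD penalty (the Fan & Li 2001 form; the function name, `a=3.7` default, and the grid are my choices for illustration, not something specified in this thread):

```python
import numpy as np


def scad_penalty(beta, lam, a=3.7):
    """SCAD penalty of Fan & Li (2001) evaluated at beta.

    Matches the LASSO penalty lam*|beta| near zero, tapers quadratically,
    and is flat beyond a*lam, so large coefficients are not shrunk.
    """
    b = np.abs(beta)
    return np.where(
        b <= lam,
        lam * b,                                         # LASSO-like near zero
        np.where(
            b <= a * lam,
            (2*a*lam*b - b**2 - lam**2) / (2*(a - 1)),   # quadratic taper
            lam**2 * (a + 1) / 2,                        # constant tail
        ),
    )


# Evaluate over a grid to see the shape of the penalty
grid = np.linspace(-4, 4, 9)
print(np.round(scad_penalty(grid, lam=1.0), 3))
```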
Below is an experimental example of dLASSO:

```python
import numpy as np
import pandas as pd
from delicatessen import MEstimator
from delicatessen.estimating_equations import ee_ridge_regression
from delicatessen.utilities import standard_normal_cdf, standard_normal_pdf

np.random.seed(8509353)

# Simulate 9 standard normal covariates; only X1 and X2 affect the outcome
n = 50
d = pd.DataFrame()
for j in range(1, 10):
    d['X' + str(j)] = np.random.normal(size=n)
d['Y'] = 5 + d['X1'] - 0.5*d['X2'] + np.random.normal(size=n)
d['I'] = 1  # intercept column
x_cols = ['I', 'X1', 'X2', 'X3', 'X4', 'X5', 'X6', 'X7', 'X8', 'X9']


def psi(theta):
    # Estimating equations for linear regression with the dLASSO penalty
    beta = np.asarray(theta)[:, None]
    y = np.asarray(d['Y'])[:, None]
    X = np.asarray(d[x_cols])
    pred_y = np.dot(X, beta)

    # Smoothing parameter and penalty strengths (intercept unpenalized),
    # with the penalty scaled by n
    s = 1e-5
    penalty = np.asarray([0., ] + [5., ]*9) / n

    # Derivative of the smooth approximation to |beta|
    penalty_terms = penalty[:, None] * (2*standard_normal_cdf(beta/s)
                                        + 2*(beta/s)*standard_normal_pdf(beta/s)
                                        - 1)
    return ((y - pred_y) * X).T - penalty_terms


estr = MEstimator(psi, init=[0., 1., -0.5, ] + [0., ]*7)
estr.estimate(deriv_method='exact')
```

For comparison, the same problem with the built-in ridge penalty:

```python
def psi(theta):
    return ee_ridge_regression(theta=theta, y=d['Y'], X=d[x_cols],
                               model='linear', penalty=[0., ] + [5., ]*9)


estr = MEstimator(psi, init=[0., ]*len(x_cols))
estr.estimate(deriv_method='exact')
```

I still need to understand the operating characteristics better. There is also a dSCAD that can be implemented; a rough sketch is below.
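To make the dSCAD idea concrete, here is a hypothetical sketch (my own construction, not taken from any paper): take the SCAD derivative and swap the `sign(beta)` factor for the same normal-CDF smoothing the dLASSO uses. The function name and arguments are made up for illustration.

```python
import numpy as np
from delicatessen.utilities import standard_normal_cdf


def dscad_penalty_deriv(beta, lam, a=3.7, s=1e-5):
    """Hypothetical dSCAD-style penalty derivative (a sketch only).

    Replaces sign(beta) in the SCAD derivative with the smooth
    2*Phi(beta/s) - 1 device used by the dLASSO above. Note the
    indicator at |beta| = lam would also need smoothing for the
    result to be twice differentiable everywhere.
    """
    b = np.abs(beta)
    # SCAD derivative in |beta|: flat at lam near zero, tapering to 0 by a*lam
    scad_deriv = lam * np.where(b <= lam, 1.0,
                                np.maximum(a*lam - b, 0.0) / ((a - 1)*lam))
    # Smooth stand-in for sign(beta)
    return scad_deriv * (2*standard_normal_cdf(beta/s) - 1)
```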
Okay, I was not too excited about the dLASSO, but I am coming around a bit. I think it solves some of the issues of the standard LASSO in this context. My continuing hesitation is that there is a single paper on this, and I'm not sure how well vetted the theory is. It is probably fine, but I am going to have to study it more myself...
Is your feature request related to a problem? Please describe.
The current LASSO is technically invalid with the sandwich variance estimator, because the score is not differentiable at zero. There are alternative penalties that mimic the LASSO but whose score is differentiable everywhere. One of these might be better to include as a sort of default option (the existing LASSO will be maintained, but it would be good to have a valid version as the default).
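To spell out the problem (notation mine, not from the issue text): the penalized estimating equation for a LASSO coefficient contains a sign term,

$$\sum_{i=1}^{n} (Y_i - X_i^\top \beta) X_{ij} - \lambda_j \operatorname{sign}(\beta_j) = 0,$$

and $\operatorname{sign}(\beta_j)$ is not differentiable at $\beta_j = 0$, so the derivative (bread) piece of the sandwich is undefined exactly where the LASSO sets coefficients.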
Describe the solution you'd like
There are various penalties, besides the bridge penalty, that could be added. For the LASSO specifically, the dLASSO is one option:
https://arxiv.org/pdf/1609.04985.pdf
The only issue is that this penalty will be biased. Some SCAD-dLASSO hybrid would be ideal, as it would give a penalty that is both twice differentiable and unbiased. I haven't seen this proposed, so I would either need to find such a paper or put out a methods paper on it first.
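Reading the penalty term off the experimental code above, the smooth approximation to $|\beta|$ appears to be (my reconstruction; worth checking against the paper):

$$|\beta| \approx \beta\left(2\Phi(\beta/s) - 1\right), \qquad \frac{d}{d\beta}\,\beta\left(2\Phi(\beta/s) - 1\right) = 2\Phi(\beta/s) + \frac{2\beta}{s}\,\phi(\beta/s) - 1,$$

which is smooth everywhere and whose derivative tends to $\operatorname{sign}(\beta)$ as $s \to 0$ for $\beta \neq 0$.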
Describe alternatives you've considered
Leave as-is and don't add any more penalized regression options.
Additional context
None