Issue when including continuous covariate in PyTwoWay #57

Open
jakobbrounstein opened this issue Apr 1, 2024 · 3 comments

Comments

@jakobbrounstein

Hello,

Thank you for developing this package! I’m encountering an issue when running pytwoway while including a continuous covariate. My goal is to get the KSS unbiased estimator of the variance components of the AKM fixed effects. The procedure (FEControlEstimator) runs fine when I exclude the continuous covariate. However, I get the following error when I include a continuous covariate:

Traceback (most recent call last):
  File "/usr/local/linux/anaconda3.8/lib/python3.8/site-packages/IPython/core/interactiveshell.py", line 3331, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "", line 46, in
    fe_estimator.fit()
  File "/accounts/grad/jakobbrounstein/.local/lib/python3.8/site-packages/pytwoway/fecontrol.py", line 354, in fit
    self._prep_matrices()
  File "/accounts/grad/jakobbrounstein/.local/lib/python3.8/site-packages/pytwoway/fecontrol.py", line 623, in _prep_matrices
    pcd_operator = pcd.ichol(AtDpA.tocsc(), **pcd_options)
  File "/accounts/grad/jakobbrounstein/.local/lib/python3.8/site-packages/pytwoway/preconditioners/ichol.py", line 297, in ichol
    raise ValueError(
ValueError: Thresholded incomplete Cholesky decomposition failed due to insufficient positive-definiteness of matrix A and diagonal shifts did not help.

I can run vanilla AKM with the continuous covariate in Stata without problems, and as far as I know there are no multicollinearity issues. Since the same TWFE model with the covariate runs cleanly in Stata, I don’t understand why the matrix would be singular when I run this in Python.

Does FEControlEstimator normally see any issues in accommodating continuous covariates?

Sincerely,

Jakob

@adamoppenheimer
Collaborator

Hi Jakob,

Thank you for using PyTwoWay and for reaching out!

The issue you reference is with the preconditioner, not the solver. The solver is actually iterative, so it doesn't check if the matrix is full rank. I'm planning to add an option to check if the matrix is full rank over the summer.

I recommend trying out the different preconditioner options to find one that works. You should specify 'preconditioner': [preconditioner] in your FE control parameters.

The preconditioner options are:

  • None
  • 'jacobi'
  • 'vcycle'
  • 'ichol'
  • 'ilu'

The default is 'ichol', the incomplete Cholesky decomposition, which does not seem to work for your data. I would recommend trying 'jacobi' next. You could also disable the preconditioner with None, although this could be slow. I'm not sure how well 'ilu' and 'vcycle' will work, but feel free to try them.

In addition, you can always try out different solvers to see which is the fastest. In case you want to avoid preconditioners entirely, setting 'solver': 'amg' in your FE control parameters will use the AMG solver which doesn't use a preconditioner. This is also Professor Lamadon's preferred solver for very large datasets.
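The trial-and-error described above could be automated roughly as follows. This is only a sketch: `candidate_overrides`, `fit_with_fallback`, and `fake_fit` are hypothetical helpers, not part of PyTwoWay, and the only option names used are the ones quoted in this thread ('preconditioner', 'solver', 'ncore'). The commented-out wrapper assumes the estimator is built with `tw.fecontrol_params` and `tw.FEControlEstimator`, which should be checked against the installed version.

```python
# Preconditioner options from this thread, in the suggested order of trial:
# the default first, then 'jacobi', the two untested ones, and None last (slow).
PRECONDITIONERS = ['ichol', 'jacobi', 'ilu', 'vcycle', None]


def candidate_overrides(preconditioners=PRECONDITIONERS, include_amg=True):
    """Build one parameter-override dict per configuration to try."""
    candidates = [{'preconditioner': p} for p in preconditioners]
    if include_amg:
        # Per the thread, the AMG solver does not use a preconditioner.
        candidates.append({'solver': 'amg'})
    return candidates


def fit_with_fallback(fit_fn, base_params, candidates):
    """Call fit_fn(params) for each candidate until one does not raise ValueError.

    fit_fn is supplied by the caller (hypothetical); it should build and fit
    the estimator from a plain dict of parameters.
    """
    errors = []
    for overrides in candidates:
        params = {**base_params, **overrides}
        try:
            return overrides, fit_fn(params), errors
        except ValueError as exc:
            errors.append((overrides, str(exc)))
    raise RuntimeError(f"all {len(candidates)} configurations failed: {errors}")


# Assumed real-world wrapper (not run here; names should be verified):
# import pytwoway as tw
# def fit_fn(params):
#     estimator = tw.FEControlEstimator(data, tw.fecontrol_params(params))
#     estimator.fit()
#     return estimator


def fake_fit(params):
    """Stand-in for the real fit: fails only for the default 'ichol' setup."""
    uses_ichol = params.get('preconditioner', 'ichol') == 'ichol'
    if uses_ichol and params.get('solver') != 'amg':
        raise ValueError("Thresholded incomplete Cholesky decomposition failed")
    return "fitted"


chosen, result, errors = fit_with_fallback(
    fake_fit, {'ncore': 1}, candidate_overrides()
)
```

With the stand-in `fake_fit`, the loop records the 'ichol' failure and stops at 'jacobi'. On real data each attempt may be expensive, so it may be preferable to test the candidates on a subsample first.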

Please let me know if this doesn't resolve your issue, and also please feel free to reach out again if you have any more questions or issues with the code!

Best,
Adam

@jakobbrounstein
Author

Thank you for your response!

I tried to use the solver 'amg', but I received this error:
_pickle.PicklingError: Can't pickle <class 'pyamg.multilevel.coarse_grid_solver.<locals>.GenericSolver'>:

However, this object is not found as “pyamg.multilevel.coarse_grid_solver.<locals>.GenericSolver”.

I’m very sorry to bother, but I am just not sure which pickle is missing. Do you have an idea what causes this issue?

Best,

Jakob

@adamoppenheimer
Collaborator

Hi Jakob,

That sounds like an issue with multiprocessing. I think there are two good solutions.

First, you could install the multiprocess package so that it uses dill instead of pickle.

Alternatively, you can run the code with 'ncore': 1. If speed is an issue then you probably want multiprocessing though.
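For context, this kind of error can be reproduced with the standard library alone. The sketch below assumes the cause is that the solver class is defined inside a function, which standard `pickle` cannot serialize by name; `make_solver` and `can_pickle` are illustrative names, not pyamg or PyTwoWay code. The final dict shows the single-core workaround using only the option names mentioned in this thread.

```python
import pickle


def make_solver():
    # A class defined inside a function has "<locals>" in its qualified name,
    # so the standard pickle module cannot look it up again by name.
    class GenericSolver:
        def __call__(self, x):
            return x

    return GenericSolver()


def can_pickle(obj):
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        # The exact exception type varies across Python versions.
        return False


solver = make_solver()
local_ok = can_pickle(solver)       # locally defined class instance
plain_ok = can_pickle([1, 2, 3])    # ordinary built-in data

# Workaround from the thread: avoid multiprocessing (and therefore pickling)
# by running on one core. The keys are the option names quoted above.
single_core_overrides = {'solver': 'amg', 'ncore': 1}
```

Multiprocessing has to serialize objects to send them to worker processes, which is why the error only appears with more than one core. The `multiprocess` package sidesteps this by serializing with dill, which can handle locally defined classes.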

Please let me know if this resolves the issue!

Best,
Adam
