Consider parallelizing xval #523

Open
mulhod opened this issue Sep 25, 2019 · 0 comments

mulhod commented Sep 25, 2019

Cross-validation runs serially (grid search cross-validation, however, does make use of threads). This is a considerable bottleneck for large datasets and large feature spaces. For example, in recent experiments with 15k samples and perhaps up to 100k features, 10-fold cross-validation can take upwards of two weeks. It would be a good idea to consider parallelizing at the cross-validation fold level, if possible. For example, each fold could be gridmapped individually, or the folds could be run in threads (although, as mentioned, grid search cross-validation already spawns 3 threads, so that would have to be kept in mind).
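
To make the fold-level idea concrete, here is a rough sketch using only the standard library. It is not the project's code: `make_folds`, `evaluate_fold`, `fold_workers`, and `parallel_cross_validate` are hypothetical names, and the "learner" is a toy majority-label baseline standing in for real training and grid search. The one point it tries to illustrate is the thread budget: if each fold's grid search already uses 3 threads, the number of folds run concurrently probably needs to be capped accordingly.

```python
from concurrent.futures import ThreadPoolExecutor


def make_folds(n_samples, n_folds):
    """Split range(n_samples) into n_folds contiguous (train, test) index pairs."""
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most 1.
    sizes = [n_samples // n_folds + (1 if i < n_samples % n_folds else 0)
             for i in range(n_folds)]
    folds, start = [], 0
    for size in sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        folds.append((train, test))
        start += size
    return folds


def evaluate_fold(fold, labels):
    """Hypothetical stand-in for 'train on the fold, score the held-out part'.

    Predicts the majority training label and returns held-out accuracy.
    """
    train, test = fold
    train_labels = [labels[i] for i in train]
    majority = max(set(train_labels), key=train_labels.count)
    return sum(labels[i] == majority for i in test) / len(test)


def fold_workers(total_cores, grid_search_threads=3):
    """Cap concurrent folds so folds * grid-search threads fits the core budget."""
    return max(1, total_cores // grid_search_threads)


def parallel_cross_validate(labels, n_folds, total_cores):
    """Evaluate all folds concurrently; scores come back in fold order."""
    folds = make_folds(len(labels), n_folds)
    with ThreadPoolExecutor(max_workers=fold_workers(total_cores)) as pool:
        return list(pool.map(lambda fold: evaluate_fold(fold, labels), folds))


if __name__ == "__main__":
    toy_labels = [1] * 8 + [0] * 2
    print(parallel_cross_validate(toy_labels, n_folds=5, total_cores=12))
```

Threads are used here only to keep the sketch self-contained. For CPU-bound learners, a process pool or per-fold gridmap jobs may be the better fit, and the same budget calculation would still apply.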

@aoifecahill aoifecahill self-assigned this Sep 27, 2019
@aoifecahill aoifecahill added this to the v2.1 milestone Sep 27, 2019
@desilinguist desilinguist removed this from the v2.5 milestone Feb 1, 2021