[DRAFT] Allowing trees to bin data #25

Closed

adam2392
Collaborator

Reference Issues/PRs

Fixes: #23

What does this implement/fix? Explain your changes.

This PR aims to implement feature binning, which can substantially speed up the training of decision trees. For now, the goal is to add binning in a way that plays well with the existing codebase.

Unfortunately, the code from #24 is far from complete and does not do the job.

Right now, what is missing is:

  • how to implement binning so that it is consistent between the fitting and predicting (i.e. apply) APIs
  • can we simplify the API?
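
For the first open question, one way to keep fitting and predicting consistent is to learn the bin edges once at fit time and reuse exactly those edges whenever new data arrives. The sketch below is a minimal, dependency-free illustration of that idea and is not the code in this PR: `FeatureBinner`, `n_bins`, and `edges_` are hypothetical names, and the quantile-style choice of edges is an assumption.

```python
from bisect import bisect_right


class FeatureBinner:
    """Hypothetical helper: learns per-feature bin edges once and reuses them."""

    def __init__(self, n_bins=4):
        self.n_bins = n_bins
        self.edges_ = None  # one sorted list of interior edges per feature

    def fit(self, X):
        n_features = len(X[0])
        self.edges_ = []
        for j in range(n_features):
            column = sorted(row[j] for row in X)
            n = len(column)
            # Interior edges at (approximately) evenly spaced quantiles.
            edges = [column[(k * n) // self.n_bins] for k in range(1, self.n_bins)]
            # Drop duplicate edges so low-cardinality features still work.
            self.edges_.append(sorted(set(edges)))
        return self

    def transform(self, X):
        if self.edges_ is None:
            raise ValueError("call fit before transform")
        # The same stored edges are used for training and unseen data.
        return [
            [bisect_right(self.edges_[j], value) for j, value in enumerate(row)]
            for row in X
        ]


X_train = [[0.1], [0.4], [0.5], [0.9], [1.3], [2.0], [2.2], [3.1]]
X_test = [[-5.0], [0.45], [2.1], [100.0]]

binner = FeatureBinner(n_bins=4).fit(X_train)
train_bins = binner.transform(X_train)  # bin indices seen by the tree at fit time
test_bins = binner.transform(X_test)    # same edges, so indices stay comparable
```

Because `transform` never recomputes the edges, values outside the training range simply fall into the first or last bin, and a split learned on bin indices means the same thing at predict time as it did at fit time.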

Any other comments?

Jofleming and others added 23 commits June 17, 2022 08:47
Fixed link to Jake Vanderplas website.
* Doc fix link for random projections

* Doc fix link for random projections
Co-authored-by: Loïc Estève <loic.esteve@ymail.com>
Update reference link to "A survey of Partial Least Squares (PLS) methods, with emphasis on the two-block case" by JA Wegelin
Co-authored-by: Loïc Estève <loic.esteve@ymail.com>
…c functions (scikit-learn#23514)

Co-authored-by: Julien Jerphanion <git@jjerphan.xyz>
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Co-authored-by: Julien Jerphanion <git@jjerphan.xyz>
adam2392 mentioned this pull request on Mar 28, 2023
adam2392 added a commit that referenced this pull request Mar 28, 2023
#### Reference Issues/PRs
Closes: #25 
Closes: #23 


#### What does this implement/fix? Explain your changes.
Adds preliminary capability for binning features. The drawback is that we need to
bin the data again at predict time.

I've documented how we can get around this in the README, but it will involve some
heavy-duty Cython coding.

- Removed oblique tree models, which have been migrated to scikit-tree
- Fixed the CI so that unnecessary workflows are not run on the fork
- Updated the documentation with the current limitation of binning

---------

Signed-off-by: Adam Li <adam2392@gmail.com>
adam2392 closed this on Mar 29, 2023
adam2392 deleted the bin branch on March 29, 2023 16:41
Development

Successfully merging this pull request may close these issues.

[ENH] Adding binning capabilities to decision trees