ENH add partial_fit for DecisionTreeClassifier #50

Closed
147 commits
1542765
Start implementing the update function for trees
PSSF23 Nov 7, 2020
8ded0f7
Update _tree.pxd
PSSF23 Nov 7, 2020
d6d5879
Remove unused attribute
PSSF23 Nov 7, 2020
0ed0819
Remove duplicate operations
PSSF23 Nov 7, 2020
bebe2bc
Keep whole function for reference
PSSF23 Nov 8, 2020
6ca6725
Catch AttributeError
PSSF23 Nov 11, 2020
a403f5b
Evaluate tree building logic
PSSF23 Nov 15, 2020
cb4cf43
Follow node addition logic
PSSF23 Nov 15, 2020
eb7af31
Work with counting issues and overflowing trees
PSSF23 Nov 15, 2020
c24c87a
Work with high variability
PSSF23 Nov 16, 2020
5e6685c
Fix y coordinates
PSSF23 Nov 16, 2020
5f6c373
Duplicate sample organization
PSSF23 Nov 18, 2020
7ac15f2
Add _update_split_node function for BestFirstTree
PSSF23 Nov 18, 2020
2a94fa2
Work without max_leaf_nodes limit
PSSF23 Nov 18, 2020
d6c03a7
Update .gitignore
PSSF23 Nov 18, 2020
7a3985a
Remove capacity resetting
PSSF23 Nov 29, 2020
4f8605e
Resolve 1 node tree problem
PSSF23 Dec 7, 2020
11764a1
Optimize node order
PSSF23 Dec 20, 2020
02ca737
Update _tree.pyx
PSSF23 Jan 18, 2021
92f7e18
Optimize partial_fit api
PSSF23 Jan 21, 2021
ab51a53
Update from main branch to stream branch
PSSF23 Feb 2, 2021
f05a3b2
Fix linting
PSSF23 Feb 2, 2021
e1b6658
FIX add __reduce__ functions
PSSF23 Sep 14, 2021
f1a4174
Merge branch 'main' into stream
PSSF23 Sep 14, 2021
0a5420c
FIX black format the code
PSSF23 Sep 14, 2021
19893c3
FIX remove min_impurity_split
PSSF23 Sep 14, 2021
fdd1dfd
FIX update deprecated attribute
PSSF23 Sep 14, 2021
b4cbfa4
FIX optimize api & correct __cinit__
PSSF23 Sep 14, 2021
8f4b664
FIX optimize first partial_fit test
PSSF23 Sep 14, 2021
3562219
FIX remove FutureWarning filter
PSSF23 Sep 14, 2021
93ead2d
FIX modify partial_fit parameter
PSSF23 Sep 14, 2021
bfaa18c
FIX correct partial_fit parameter
PSSF23 Sep 14, 2021
73779c2
Revert "FIX remove FutureWarning filter"
PSSF23 Sep 15, 2021
7d724c1
FIX prevent feature number reset
PSSF23 Sep 15, 2021
d3f15ad
MAINT remove duplicate category
PSSF23 Sep 15, 2021
992e34a
FIX correct regressor partial_fit checks
PSSF23 Sep 15, 2021
2a72c8f
Revert "MAINT remove duplicate category"
PSSF23 Sep 15, 2021
68d2d7b
FIX change parameter order
PSSF23 Sep 15, 2021
631a953
DOC add classes parameter docstring
PSSF23 Sep 15, 2021
9ae93b8
EHN pass classes into first fit
PSSF23 Sep 15, 2021
23fd392
FIX add class indices
PSSF23 Sep 15, 2021
665ceef
FIX revert class changes
PSSF23 Sep 15, 2021
85689d2
EHN pass classes into first fit
PSSF23 Sep 15, 2021
71265db
FIX restrict partial_fit to classifiers
PSSF23 Sep 16, 2021
005e5fe
Merge branch 'scikit-learn:main' into stream
PSSF23 Oct 13, 2021
cd8864e
Merge branch 'scikit-learn:main' into stream
PSSF23 Oct 18, 2021
30a6237
Merge branch 'scikit-learn:main' into stream
PSSF23 Oct 21, 2021
814e67e
DOC add changelog
PSSF23 Oct 21, 2021
46a9ccc
Merge branch 'main' of https://github.com/scikit-learn/scikit-learn i…
PSSF23 Nov 5, 2021
f0d0eb0
DOC optimize log format
PSSF23 Nov 17, 2021
e9e62e4
Merge branch 'main' of https://github.com/scikit-learn/scikit-learn i…
PSSF23 Nov 17, 2021
aef4f84
FIX remove deprecated parameter
PSSF23 Nov 17, 2021
7d9ff8b
Merge branch 'main' into stream
PSSF23 Nov 26, 2021
9ba887f
Merge branch 'scikit-learn:main' into stream
PSSF23 Nov 29, 2021
55a6b4b
FIX optimize n_classes format
PSSF23 Nov 30, 2021
8d3f5c7
FIX add internal function
PSSF23 Nov 30, 2021
c47bbb2
MNT remove unnecessary checks
PSSF23 Dec 1, 2021
a2aab5f
Merge branch 'scikit-learn:main' into stream
PSSF23 Dec 2, 2021
0ac8a71
[EXAMPLE DIFF] (Tree featuresv2) Fork of sklearn that maintains all n…
adam2392 Mar 29, 2023
927d2a9
Merge branch 'scikit-learn:main' into fork
adam2392 Mar 29, 2023
475bd05
Docs (#39)
adam2392 Mar 29, 2023
73a2176
Merge branch 'scikit-learn:main' into fork
adam2392 Mar 29, 2023
f8d4697
Merge branch 'scikit-learn:main' into fork
adam2392 Mar 30, 2023
706a742
Release v1.2.2
adam2392 Apr 4, 2023
db58884
Merge branch 'scikit-learn:main' into fork
adam2392 Apr 4, 2023
1efc7a8
Merge branch 'scikit-learn:main' into fork
adam2392 Apr 6, 2023
a22db03
Update README
adam2392 Apr 11, 2023
d3b6175
Merge branch 'scikit-learn:main' into fork
adam2392 Apr 11, 2023
be859f3
Merge branch 'scikit-learn:main' into fork
adam2392 Apr 12, 2023
e6853ba
Merge branch 'scikit-learn:main' into fork
adam2392 Apr 13, 2023
13534a6
Merge branch 'scikit-learn:main' into fork
adam2392 Apr 17, 2023
bc1e387
ENH start modularizing root initialization
PSSF23 Apr 18, 2023
c41c983
Merging
adam2392 Apr 19, 2023
fb5b69a
Merge branch 'fork' of https://github.com/neurodata/scikit-learn into…
adam2392 Apr 19, 2023
31723a6
Merge branch 'scikit-learn:main' into fork
adam2392 Apr 20, 2023
12d45d3
Merging
adam2392 Apr 28, 2023
1837ae8
Merge branch 'main' into submodule
adam2392 Apr 28, 2023
9c5321d
Adding working submodule
adam2392 Jun 8, 2023
b2544c5
Merge branch 'main' into submodulev2
adam2392 Jun 8, 2023
f82f258
Merged main
adam2392 Jun 8, 2023
7e38502
Successful merge with the missing value support
adam2392 Jun 8, 2023
34a5621
Add cyton headers
adam2392 Jun 8, 2023
f35c758
Fix imports to be absolute
adam2392 Jun 8, 2023
45320b4
Fix forest import
adam2392 Jun 8, 2023
6b4d0e7
Merge branch 'main' into submodulev2
adam2392 Jun 12, 2023
5a2ac9a
Merge branch 'scikit-learn:main' into submodulev2
adam2392 Jun 13, 2023
49526f0
Fix classes and criterion
adam2392 Jun 13, 2023
2105949
Working..
adam2392 Jun 13, 2023
9b07f2a
Add leaf storage ability
adam2392 Jun 13, 2023
21ccb30
[ENH] Adding leaf node samples to be stored when "quantile" tree is t…
adam2392 Jun 15, 2023
545e2a2
Merge branch 'main' into submodulev2
adam2392 Jun 15, 2023
855ee19
Add quantile
adam2392 Jun 16, 2023
3b7b450
Merge branch 'submodulev2' of https://github.com/neurodata/scikit-lea…
adam2392 Jun 16, 2023
3f5cb65
Add check input
adam2392 Jun 16, 2023
7401ddc
Try to fix docstring
adam2392 Jun 16, 2023
13e2913
Try to fix docstring
adam2392 Jun 16, 2023
43aa3ef
Fix docstring
adam2392 Jun 17, 2023
fe3072f
Fix docstring
adam2392 Jun 17, 2023
2d4de9a
Fix the predict quantiles docstring
adam2392 Jun 20, 2023
1c1ec8c
Fix the predict quantiles docstring
adam2392 Jun 20, 2023
1994f15
Merging main, but with two test failures
adam2392 Jun 23, 2023
4bc651d
Remove some diff
adam2392 Jun 23, 2023
cc035d0
Fix regression error
adam2392 Jun 23, 2023
4840d4e
Fix boolean
adam2392 Jun 23, 2023
512f34c
Merge branch 'scikit-learn:main' into submodulev2
adam2392 Jun 27, 2023
a6a6b0e
Merge branch 'scikit-learn:main' into submodulev2
adam2392 Jun 30, 2023
fdf2e2d
Added doc to store_leaf_values
adam2392 Jun 30, 2023
be902cc
Merge branch 'submodulev2' of https://github.com/neurodata/scikit-lea…
adam2392 Jun 30, 2023
5b7ce7e
Merging main
adam2392 Jun 30, 2023
9655d01
Fix now
adam2392 Jun 30, 2023
6b57c58
Bring in monotonicity (#47)
adam2392 Jul 5, 2023
df0fae2
Complete merge
adam2392 Jul 5, 2023
34e540a
Fix splitter
adam2392 Jul 5, 2023
0130bb3
ENH draft separate root initialization
PSSF23 Jul 6, 2023
a927669
Merging main
adam2392 Jul 19, 2023
e9d702b
Fix linter
adam2392 Jul 19, 2023
ce6a727
Fix linting
adam2392 Jul 20, 2023
00a3595
Fix docstring
adam2392 Jul 20, 2023
329cbc8
Fix lint
adam2392 Jul 20, 2023
8b5d0f9
Fix unit test
adam2392 Jul 20, 2023
38bade7
Fix lint
adam2392 Jul 20, 2023
feffdeb
Adding fix
adam2392 Jul 20, 2023
2bb5f1c
Fixed
adam2392 Jul 21, 2023
5f68744
ENH attempt on object initialization
PSSF23 Aug 8, 2023
1d1e20d
ENH optimize partial fit for depth builder
PSSF23 Aug 8, 2023
fbf49a5
FIX remove update function in best builder
PSSF23 Aug 9, 2023
87760b6
Merge branch 'submodulev2' into reed
PSSF23 Aug 9, 2023
d83cc1b
ENH update pxd function param
PSSF23 Aug 9, 2023
6c0a3e3
FIX update numpy usage
PSSF23 Aug 9, 2023
db66c7b
Revert "FIX update numpy usage"
PSSF23 Aug 9, 2023
c868e9e
FIX resolve conflicts
PSSF23 Aug 9, 2023
c42a29c
FIX correct nogil location
PSSF23 Aug 9, 2023
af2918f
FIX remove deprecated import
PSSF23 Aug 9, 2023
876285c
FIX correct styles & variable ref
PSSF23 Aug 9, 2023
1599018
FIX remove deprecated import & warnings
PSSF23 Aug 9, 2023
f20a4fa
FIX optimize conditional statement
PSSF23 Aug 9, 2023
f556e0b
FIX optimize cython method
PSSF23 Aug 9, 2023
8b226e7
FIX optimize refitting conditions
PSSF23 Aug 9, 2023
d88dd28
FIX correct type comparison methods
PSSF23 Aug 9, 2023
c894f60
DOC update cython variable
PSSF23 Aug 9, 2023
6907a5f
ENH optimize efficiency & FIX correct param order
PSSF23 Aug 10, 2023
35b2def
FIX correct conditional statement for splitter
PSSF23 Aug 10, 2023
0133ee1
FIX correct splitting start sample position
PSSF23 Aug 10, 2023
d4d677e
FIX remove duplicate method
PSSF23 Aug 10, 2023
6ec023b
[MERGE] Merge changes from sklearn main (#52)
adam2392 Aug 11, 2023
423fa49
Merge branch 'submodulev2' into reed
adam2392 Aug 11, 2023
fcc0758
Merge branch 'submodulev3' into reed
adam2392 Aug 11, 2023
33 changes: 17 additions & 16 deletions .circleci/config.yml
@@ -89,22 +89,23 @@ jobs:
root: doc/_build/html
paths: .

- deploy:
- docker:
- - image: cimg/python:3.8.12
- steps:
- - checkout
- - run: ./build_tools/circle/checkout_merge_commit.sh
- # Attach documentation generated in the 'doc' step so that it can be
- # deployed.
- - attach_workspace:
- at: doc/_build/html
- - run: ls -ltrh doc/_build/html/stable
- - deploy:
- command: |
- if [[ "${CIRCLE_BRANCH}" =~ ^main$|^[0-9]+\.[0-9]+\.X$ ]]; then
- bash build_tools/circle/push_doc.sh doc/_build/html/stable
- fi
+ # XXX: in order to make sure our fork passes all the CIs and not remove too many LOC, we don't want to deploy
+ # deploy:
+ # docker:
+ # - image: cimg/python:3.8.12
+ # steps:
+ # - checkout
+ # - run: ./build_tools/circle/checkout_merge_commit.sh
+ # # Attach documentation generated in the 'doc' step so that it can be
+ # # deployed.
+ # - attach_workspace:
+ # at: doc/_build/html
+ # - run: ls -ltrh doc/_build/html/stable
+ # - deploy:
+ # command: |
+ # if [[ "${CIRCLE_BRANCH}" =~ ^main$|^[0-9]+\.[0-9]+\.X$ ]]; then
+ # bash build_tools/circle/push_doc.sh doc/_build/html/stable
+ # fi

workflows:
version: 2
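
Note on the branch guard in the `deploy` job (commented out by this change): it only pushed docs when `CIRCLE_BRANCH` was exactly `main` or a release branch such as `1.2.X`. A minimal Python sketch of the same pattern, assuming Python's `re` treats this particular expression the same way Bash's `=~` does; the helper name `is_deploy_branch` is hypothetical and not part of the repository:

```python
import re

# Same pattern as the Bash test in the deploy step: either exactly "main"
# or a release branch of the form "<major>.<minor>.X".
DEPLOY_BRANCH = re.compile(r"^main$|^[0-9]+\.[0-9]+\.X$")


def is_deploy_branch(branch: str) -> bool:
    """Return True if the guard would have allowed a docs push for this branch."""
    return DEPLOY_BRANCH.search(branch) is not None


if __name__ == "__main__":
    # Print the verdict for a few illustrative branch names.
    for name in ["main", "1.2.X", "fork", "1.2.2", "main-backup"]:
        print(name, is_deploy_branch(name))
```

Under this pattern the `fork` branch does not match, so the guard alone would already have skipped the push on that branch.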
4 changes: 2 additions & 2 deletions .cirrus.star
@@ -4,9 +4,9 @@
load("cirrus", "env", "fs", "http")

def main(ctx):
- # Only run for scikit-learn/scikit-learn. For debugging on a fork, you can
+ # Only run for neurodata/scikit-learn. For debugging on a fork, you can
# comment out the following condition.
- if env.get("CIRRUS_REPO_FULL_NAME") != "scikit-learn/scikit-learn":
+ if env.get("CIRRUS_REPO_FULL_NAME") != "neurodata/scikit-learn":
return []

arm_wheel_yaml = "build_tools/cirrus/arm_wheel.yml"
3 changes: 2 additions & 1 deletion .github/workflows/check-changelog.yml
@@ -10,12 +10,13 @@ jobs:
check:
name: A reviewer will let you know if it is required or can be bypassed
runs-on: ubuntu-latest
- if: ${{ contains(github.event.pull_request.labels.*.name, 'No Changelog Needed') == 0 }}
+ if: ${{ contains(github.event.pull_request.labels.*.name, 'No Changelog Needed') == 0 && github.repository == 'scikit-learn/scikit-learn' }}
steps:
- name: Get PR number and milestone
run: |
echo "PR_NUMBER=${{ github.event.pull_request.number }}" >> $GITHUB_ENV
echo "TAGGED_MILESTONE=${{ github.event.pull_request.milestone.title }}" >> $GITHUB_ENV
+ echo "${{ github.repository }}"
- uses: actions/checkout@v3
with:
fetch-depth: '0'
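
Note on the updated `if:` condition in this workflow: the changelog check runs only when the PR lacks the `No Changelog Needed` label and the workflow is running in `scikit-learn/scikit-learn`, so it is skipped on the fork. A rough Python sketch of that logic, assuming `contains(...) == 0` holds exactly when the label is absent; the function name `should_check_changelog` is hypothetical, and lowercasing the strings is only an approximation of how GitHub compares them:

```python
UPSTREAM_REPO = "scikit-learn/scikit-learn"
SKIP_LABEL = "No Changelog Needed"


def should_check_changelog(labels, repository):
    """Approximate the workflow's `if:` expression in plain Python."""
    # True when any PR label equals the skip label (compared case-insensitively).
    has_skip_label = any(label.lower() == SKIP_LABEL.lower() for label in labels)
    # Run only when the skip label is absent AND we are on the upstream repo.
    return (not has_skip_label) and repository.lower() == UPSTREAM_REPO


if __name__ == "__main__":
    print(should_check_changelog(["module:tree"], "scikit-learn/scikit-learn"))
    print(should_check_changelog(["module:tree"], "neurodata/scikit-learn"))
```

With the added repository clause, PRs opened against `neurodata/scikit-learn` skip the job whatever their labels.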
2 changes: 1 addition & 1 deletion .github/workflows/check-manifest.yml
@@ -7,7 +7,7 @@ on:
jobs:
check-manifest:
# Don't run on forks
- if: github.repository == 'scikit-learn/scikit-learn'
+ if: github.repository == 'neurodata/scikit-learn'

runs-on: ubuntu-latest
steps:
27 changes: 27 additions & 0 deletions .github/workflows/check-upstream.yml
@@ -0,0 +1,27 @@
# Create Github Actions workflow that checks upstream scikit-learn 'main' branch and
# creates or updates
# an existing pull request to https://github.com/neurodata/scikit-learn:fork.
# Runs the check weekly.
# Creates a pull request if there are changes.

# name: Check upstream scikit-learn

# on:
# schedule:
# - cron: '0 0 * * 0'

# jobs:
# check-upstream:
# runs-on: ubuntu-latest
# steps:
# - uses: actions/checkout@v2
# - name: Check upstream scikit-learn
# uses: neurodata/check-upstream@main
# with:
# upstream: scikit-learn/scikit-learn
# fork: neurodata/scikit-learn
# branch: fork
# token: ${{ secrets.GITHUB_TOKEN }}

# # Creates a pull request if there are changes.

4 changes: 2 additions & 2 deletions .github/workflows/labeler-module.yml
@@ -16,7 +16,7 @@ jobs:
steps:
- uses: thomasjpfan/labeler@v2.5.1
continue-on-error: true
- if: github.repository == 'scikit-learn/scikit-learn'
+ if: github.repository == 'neurodata/scikit-learn'
with:
repo-token: "${{ secrets.GITHUB_TOKEN }}"
max-labels: "3"
@@ -27,7 +27,7 @@ jobs:
steps:
- uses: thomasjpfan/labeler@v2.5.1
continue-on-error: true
- if: github.repository == 'scikit-learn/scikit-learn'
+ if: github.repository == 'neurodata/scikit-learn'
with:
repo-token: "${{ secrets.GITHUB_TOKEN }}"
configuration-path: ".github/labeler-file-extensions.yml"
2 changes: 1 addition & 1 deletion .github/workflows/update_tracking_issue.yml
@@ -24,7 +24,7 @@ on:
jobs:
update_tracking_issue:
runs-on: ubuntu-latest
- if: github.repository == 'scikit-learn/scikit-learn' && github.event_name == 'schedule'
+ if: github.repository == 'neurodata/scikit-learn' && github.event_name == 'schedule'
steps:
- uses: actions/checkout@v3
- uses: actions/setup-python@v4
33 changes: 5 additions & 28 deletions .github/workflows/wheels.yml
@@ -7,12 +7,12 @@ on:
- cron: "42 3 */1 * *"
push:
branches:
- - main
+ - fork
# Release branches
- "[0-9]+.[0-9]+.X"
pull_request:
branches:
- - main
+ - fork
- "[0-9]+.[0-9]+.X"
# Manual run
workflow_dispatch:
@@ -26,7 +26,7 @@ jobs:
check_build_trigger:
name: Check build trigger
runs-on: ubuntu-latest
- if: github.repository == 'scikit-learn/scikit-learn'
+ if: github.repository == 'neurodata/scikit-learn'
outputs:
build: ${{ steps.check_build_trigger.outputs.build }}

@@ -190,31 +190,8 @@ jobs:
with:
path: dist/*.tar.gz

# Upload the wheels and the source distribution
upload_anaconda:
name: Upload to Anaconda
runs-on: ubuntu-latest
needs: [build_wheels, build_sdist]
# The artifacts cannot be uploaded on PRs
if: github.event_name != 'pull_request'

steps:
- name: Checkout scikit-learn
uses: actions/checkout@v3

- name: Download artifacts
uses: actions/download-artifact@v3
- uses: actions/upload-artifact@v3
with:
path: dist
name: ${{ matrix.python[0] }}-${{ matrix.os[1] }}

- name: Setup Python
uses: actions/setup-python@v4

- name: Upload artifacts
env:
# Secret variables need to be mapped to environment variables explicitly
SCIKIT_LEARN_NIGHTLY_UPLOAD_TOKEN: ${{ secrets.SCIKIT_LEARN_NIGHTLY_UPLOAD_TOKEN }}
SCIKIT_LEARN_STAGING_UPLOAD_TOKEN: ${{ secrets.SCIKIT_LEARN_STAGING_UPLOAD_TOKEN }}
ARTIFACTS_PATH: dist/artifact
# Force a replacement if the remote file already exists
run: bash build_tools/github/upload_anaconda.sh
1 change: 1 addition & 0 deletions .gitignore
@@ -10,6 +10,7 @@
build
sklearn/datasets/__config__.py
sklearn/**/*.html
+ scikit_learn_tree.egg-info/*

dist/
MANIFEST
2 changes: 1 addition & 1 deletion Makefile
@@ -64,4 +64,4 @@ code-analysis:
build_tools/linting.sh

build-dev:
- pip install --verbose --no-build-isolation --editable .
+ pip install --verbose --no-build-isolation --editable .