IFU master 2024 01 24 #4

Merged
merged 171 commits into from
Jan 30, 2024

@pnunna93 (Collaborator) commented Jan 27, 2024

This PR pulls in the upstream changes for version 0.42.0.

  • Resolved merge conflicts - conflicts_diff.txt
  • Updated hipified files for new kernels and ops from upstream
  • Fixed build errors and ran the unit tests; a summary is shown below
  • Skipped failing unit tests for future review. Around 200 additional tests were added to the functional module, and they account for the major share of the new failures.

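Unit test summary:

PreIFU:

| Module | Passed | Failed | Skipped |
| --- | --- | --- | --- |
| autograd | 1616 | 624 | 0 |
| cuda_setup_evaluator | 0 | 1 | 0 |
| functional | 270 | 23 | 43 |
| linear8bitlt | 9 | 9 | 0 |
| modules | 10 | 4 | 0 |
| optim | 125 | 26 | 26 |
| triton | 0 | 2 | 0 |
| Total | 2030 | 689 | 69 |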

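PostIFU:

| Module | Passed | Failed | Skipped |
| --- | --- | --- | --- |
| autograd | 1592 | 648 | 0 |
| cuda_setup_evaluator | 0 | 1 | 0 |
| functional | 313 | 233 | 54 |
| linear8bitlt | 9 | 9 | 0 |
| modules | 14 | 4 | 0 |
| optim | 124 | 27 | 26 |
| triton | 0 | 2 | 0 |
| generation | 8 | 8 | 0 |
| linear4bit | 32 | 0 | 0 |
| Total | 2092 | 932 | 80 |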
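
To make the comparison between the two summaries easier to check, here is a small sketch that recomputes the totals and the per-module change in failures. The counts are copied from the tables above; the helper names (`totals`, `failure_deltas`) are illustrative and not part of the repository.

```python
# Counts copied from the PreIFU and PostIFU tables: (passed, failed, skipped).
pre_ifu = {
    "autograd": (1616, 624, 0),
    "cuda_setup_evaluator": (0, 1, 0),
    "functional": (270, 23, 43),
    "linear8bitlt": (9, 9, 0),
    "modules": (10, 4, 0),
    "optim": (125, 26, 26),
    "triton": (0, 2, 0),
}
post_ifu = {
    "autograd": (1592, 648, 0),
    "cuda_setup_evaluator": (0, 1, 0),
    "functional": (313, 233, 54),
    "linear8bitlt": (9, 9, 0),
    "modules": (14, 4, 0),
    "optim": (124, 27, 26),
    "triton": (0, 2, 0),
    "generation": (8, 8, 0),
    "linear4bit": (32, 0, 0),
}


def totals(summary):
    """Sum each column (passed, failed, skipped) across all modules."""
    return tuple(sum(column) for column in zip(*summary.values()))


def failure_deltas(before, after):
    """Change in failed-test count per module; modules new in `after` start from 0."""
    return {
        module: after[module][1] - before.get(module, (0, 0, 0))[1]
        for module in after
    }


if __name__ == "__main__":
    print("PreIFU totals: ", totals(pre_ifu))
    print("PostIFU totals:", totals(post_ifu))
    for module, delta in failure_deltas(pre_ifu, post_ifu).items():
        if delta:
            print(f"{module}: {delta:+d} failures")
```

By these numbers, the functional module accounts for 210 of the 243 additional failures, which matches the note in the description.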

rapsealk and others added 30 commits April 25, 2023 17:00
Changed misleading Hardware requirements from "2018 or older" to "2018 or newer"
Added scipy to requirements.txt as it is used but not added to requirements
…yer-device

Add `device` parameter to `Linear` subclasses and `Embedding`
TimDettmers and others added 24 commits January 1, 2024 18:07
Add version attribute as per Python convention
…_permissionerror_order

Make sure bitsandbytes handles permission errors in the right order
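
The commit title does not show the change itself. As a general illustration of why handler order matters in Python: `PermissionError` is a subclass of `OSError`, so a broader `except OSError` clause placed first would catch it. The sketch below uses hypothetical helper names and is not the bitsandbytes code.

```python
import errno


def probe(check):
    """Run check() and report which handler caught any error.

    PermissionError is a subclass of OSError, so its handler must come
    first; otherwise the broader OSError clause would swallow it.
    """
    try:
        check()
    except PermissionError:
        return "permission denied"  # specific case handled first
    except OSError:
        return "other OS error"  # fallback for the remaining OS errors
    return "ok"


def raise_permission():
    # Hypothetical failure: a directory that cannot be read.
    raise PermissionError("cannot read directory")


def raise_name_too_long():
    # Hypothetical failure: a path that exceeds the filesystem limit.
    raise OSError(errno.ENAMETOOLONG, "File name too long")


if __name__ == "__main__":
    print(probe(raise_permission))
    print(probe(raise_name_too_long))
    print(probe(lambda: None))
```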
fix array index out of bounds in kgetColRowStats
…sbelkada-delete-workflow

Delete .github/workflows/delete_doc_commment.yml
This PR adds initial FSDP support for training QLoRA models. It enables basic FSDP and CPU offload support; low-memory training via the FSDP.sync_module_states option is not yet supported.

This PR builds off of bitsandbytes-foundation#840 commit 8278fca and BNB FSDP by @TimDettmers and @Titus-von-Koeller.

An example of using this PR to finetune QLoRA models with FSDP can be found in the demo repo: AnswerDotAi/fsdp_qlora.

* Minimal changes for fp32 4bit storage from BNB commit 8278fca

* Params4bit with selectable storage dtype

* possible fix for double quantizing linear weight & quant storage dtype

* minor fixes in Params4bit for peft tests

* remove redundant

* add float16

* update test

* Remove float16 quant cast as there are fp32, bf16, & fp16 quant kernels

---------

Co-authored-by: Kerem Turgutlu <keremturgutlu@gmail.com>
…ytes-foundation#703), Sort compute capabilities sets to select max

* Add support for CUDA 12.1

* Update README to include CUDA 12.1 version

* Add support for >= 12x

Co-authored-by: Jeongseok Kang <jskang@lablup.com>

* Temporary version of bitsandbytes PR 527: Sort compute capabilities sets to select max

* Modify PR 506 to support C++20

* Add Cuda 12.2

---------

Co-authored-by: PriNova <info@prinova.de>
Co-authored-by: PriNova <31413214+PriNova@users.noreply.github.com>
Co-authored-by: Jeongseok Kang <jskang@lablup.com>
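
The idea of sorting compute capabilities to select the maximum can be sketched as follows. This assumes capabilities are represented as "major.minor" strings; the function names are hypothetical and the upstream implementation may differ. The point is that the comparison should be numeric, because a plain string comparison ranks "8.6" above "10.0".

```python
def parse_capability(cc):
    """Turn a 'major.minor' string such as '8.6' into a comparable (8, 6) tuple."""
    major, minor = cc.split(".")
    return int(major), int(minor)


def highest_capability(capabilities):
    """Return the numerically highest compute capability from an unordered set."""
    return max(capabilities, key=parse_capability)


if __name__ == "__main__":
    found = {"7.5", "10.0", "8.6"}  # sets have no defined order
    print(sorted(found, key=parse_capability))
    print(highest_capability(found))
```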
* Added install requirements to setup

* Update setup.py

Co-authored-by: Aarni Koskela <akx@iki.fi>

---------

Co-authored-by: Aarni Koskela <akx@iki.fi>
* implicitly skip any test that implicitly uses CUDA on a non-CUDA box
* add a `requires_cuda` fixture
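
The commit above adds a pytest fixture, which is not shown on this page. As a rough standard-library analogue of the same idea (skip CUDA-dependent tests on a machine without a usable device), the sketch below uses `unittest.skipUnless` instead of a pytest fixture. The `cuda_available` helper and the test class are hypothetical.

```python
import importlib.util
import unittest


def cuda_available():
    """Return True only if PyTorch is installed and reports a usable GPU device."""
    if importlib.util.find_spec("torch") is None:
        return False
    import torch

    return torch.cuda.is_available()


# Decorator that skips a test when no device is available (assumed name).
requires_cuda = unittest.skipUnless(cuda_available(), "CUDA device required")


class DeviceTest(unittest.TestCase):
    @requires_cuda
    def test_tensor_on_gpu(self):
        import torch

        x = torch.ones(4, device="cuda")
        self.assertEqual(x.sum().item(), 4.0)

    def test_runs_everywhere(self):
        # A test with no GPU dependency still runs on a non-CUDA box.
        self.assertEqual(sum([1, 1, 1, 1]), 4)


if __name__ == "__main__":
    unittest.main()
```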

@amathews-amd (Collaborator) left a comment

LGTM

@Lzy17, please review.

@amathews-amd amathews-amd merged commit 48b7fa9 into rocm_enabled Jan 30, 2024
1 check passed