Welcome to the sdgp-ml repository!
This repository contains the notebooks and resources for the machine learning component of the Software Development Group Project (SDGP). It includes two notebooks, one for creating a dataset and one for fine-tuning the Mistral-7B-Instruct-v0.1 model, along with the dataset used for fine-tuning.
To run the fine-tuning notebook successfully, ensure that your machine meets the following requirements:
- At least 24 GB VRAM
- Latest NVIDIA drivers installed
- CUDA version 12.1 or higher
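The requirements above can be checked before opening the notebook. The following is a minimal sketch, assuming PyTorch is installed in the notebook environment; the helper names `check_requirements` and `detect_gpu` are hypothetical and not part of this repository. Note that `torch.version.cuda` reports the CUDA version PyTorch was built against, which may differ from the toolkit installed system-wide.

```python
from typing import List, Optional, Tuple

MIN_VRAM_GB = 24       # from the requirements list above
MIN_CUDA = (12, 1)     # CUDA 12.1 or higher


def check_requirements(vram_gb: Optional[float], cuda_version: Optional[str]) -> List[str]:
    """Return a list of unmet requirements (empty if everything looks fine)."""
    problems = []
    if vram_gb is None:
        problems.append("no CUDA-capable GPU detected")
    elif vram_gb < MIN_VRAM_GB:
        problems.append(f"only {vram_gb:.1f} GB VRAM, need at least {MIN_VRAM_GB} GB")
    if cuda_version is None:
        problems.append("CUDA version could not be determined")
    else:
        # Compare (major, minor) as integers, e.g. "12.1" -> (12, 1).
        found = tuple(int(part) for part in cuda_version.split(".")[:2])
        if found < MIN_CUDA:
            problems.append(f"CUDA {cuda_version} found, need 12.1 or higher")
    return problems


def detect_gpu() -> Tuple[Optional[float], Optional[str]]:
    """Read VRAM (in decimal GB) and the CUDA build version via PyTorch, if available."""
    try:
        import torch  # assumption: PyTorch is the framework used by the notebook
    except ImportError:
        return None, None
    if not torch.cuda.is_available():
        return None, torch.version.cuda
    total_bytes = torch.cuda.get_device_properties(0).total_memory
    # Decimal GB is used because a nominal 24 GB card reports slightly
    # less than 24 GiB of usable memory.
    return total_bytes / 10**9, torch.version.cuda


if __name__ == "__main__":
    issues = check_requirements(*detect_gpu())
    if issues:
        for issue in issues:
            print("Requirement not met:", issue)
    else:
        print("GPU and CUDA requirements look satisfied.")
```

The check does not verify the driver version; running `nvidia-smi` is the usual way to confirm that the installed NVIDIA driver is recent.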
We fine-tuned the Mistral-7B-Instruct-v0.1 model on our dataset. You can access the fine-tuned model through this link.
Disclaimer: This project is solely for educational and research purposes within the Software Development Group Project (SDGP).
The following links cover related topics and resources:
- Tuning-the-Finetuning
- Mistral Mastery: Fine-Tuning & Fast Inference Guide
- 4-bit Transformers with Hugging Face
- Transformers for Legal Language
- AutoAWQ
Feel free to explore these references and the code.