PoliTune: Analyzing the Impact of Data Selection and Fine-Tuning on Economic and Political Biases in Large Language Models
This repository provides the training scripts for fine-tuning LLMs on our preference datasets, as described in the paper.
The datasets are hosted on the Hugging Face Hub. There are two preference datasets, one per political leaning, exposed in this codebase through the dataset wrappers `data.datasets.politune_left` and `data.datasets.politune_right`.
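To get a feel for the data before training, you can inspect a preference pair directly. The snippet below is a minimal sketch, assuming the datasets follow the usual DPO layout of prompt/chosen/rejected fields; the Hugging Face repository ID shown is a placeholder, not the real dataset ID.

```python
from datasets import load_dataset  # pip install datasets

# Placeholder repository ID -- substitute the actual PoliTune dataset ID
# from the Hugging Face Hub.
ds = load_dataset("<hf-org>/<politune-dataset>", split="train")

# DPO-style preference data typically pairs one prompt with a preferred
# ("chosen") and a dispreferred ("rejected") response.
example = ds[0]
print(example.keys())
```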
- `configs/`: Contains the training recipes for LLMs.
- `data/`: Contains the dataset wrappers (a schematic sketch of one follows below).
- `finetune/`: Contains the fine-tuning script.
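For orientation, a dataset wrapper in `data/` can be thought of as a builder that loads a Hugging Face dataset and maps each row into the prompt/chosen/rejected triple that DPO training consumes. The sketch below is illustrative only: the function body, the placeholder repository ID, and the column names are assumptions, not the repository's actual code.

```python
from datasets import load_dataset

def politune_left(split: str = "train"):
    """Illustrative builder for the left-leaning preference dataset.

    The repository ID and column names below are placeholders; the real
    wrapper in data/ follows torchtune's dataset interface.
    """
    ds = load_dataset("<hf-org>/<politune-left>", split=split)

    def to_preference(row):
        # Map a raw row into the standard DPO triple.
        return {
            "prompt": row["prompt"],
            "chosen": row["chosen"],
            "rejected": row["rejected"],
        }

    return ds.map(to_preference)
```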
The codebase depends on torchtune and the Hugging Face libraries.
To fine-tune a model, follow these steps:

- Download the model weights using torchtune's `tune download`.
- Ensure the configuration file under `configs/` correctly points to the downloaded model.
- Run the fine-tuning process using `torchtune`:

```bash
tune run finetune/dpo_finetune.py --config configs/<config file> \
  checkpointer.output_dir=<path to save the fine-tuned model> \
  output_dir=<path to save the outputs and logs> \
  dataset._component_=<data.datasets.politune_left|data.datasets.politune_right>
```

For example:

```bash
tune run finetune/dpo_finetune.py --config configs/llama8b_lora_dpo_single_device.yaml \
  checkpointer.output_dir=checkpoints/ \
  output_dir=output/ \
  dataset._component_=data.datasets.politune_left
```
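Once fine-tuning finishes, it can be useful to sanity-check the saved weights before running any evaluation. The snippet below is a minimal sketch, assuming the checkpointer in your config writes Hugging Face-format weights to `checkpointer.output_dir`; the checkpoint path and prompt are placeholders.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder path -- use the directory passed as checkpointer.output_dir,
# assuming the config's checkpointer saves in Hugging Face format.
ckpt_dir = "checkpoints/"

tokenizer = AutoTokenizer.from_pretrained(ckpt_dir)
model = AutoModelForCausalLM.from_pretrained(ckpt_dir)

# Generate a short completion as a smoke test.
inputs = tokenizer("The role of government in the economy should be", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```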
If you use this codebase or the datasets in your work, please cite our paper:
@inproceedings{agiza2024politune,
title={PoliTune: Analyzing the Impact of Data Selection and Fine-Tuning on Economic and Political Biases in Large Language Models},
author={Agiza, Ahmed and Mostagir, Mohamed and Reda, Sherief},
booktitle={Proceedings of the 2024 AAAI/ACM Conference on AI, Ethics, and Society},
pages={},
year={2024}
}
MIT License. See the LICENSE file for details.