This is the final project for the Multiple Classifier Systems class.
The goal of this project is to implement the method proposed in the paper "A two-stage ensemble method for the detection of class-label noise" and to reproduce its experiments on some of the datasets used in the paper, which can be found in the UCI Machine Learning Repository. The datasets used were: Blood, Breast, Chess, Heart, Ionosphere, Liver, Parkinsons, Sonar and Spambase.
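For readers new to class-label noise detection, the sketch below illustrates the generic majority-vote filtering rule: an instance is flagged as suspect when more than half of the classifiers in an ensemble disagree with its given label. It is not the paper's two-stage method and not the code in this repository. The function name `majority_vote_noise` and its arguments are hypothetical, and it assumes the per-classifier predictions have already been computed (in practice, typically out-of-fold predictions from cross-validation).

```python
from typing import List, Sequence


def majority_vote_noise(
    labels: Sequence[int],
    predictions: Sequence[Sequence[int]],
) -> List[int]:
    """Return indices of instances that a strict majority of classifiers misclassify.

    labels: the given (possibly noisy) label of each instance.
    predictions: one sequence per classifier, aligned with labels.
    Both names are illustrative, not part of this project.
    """
    n_classifiers = len(predictions)
    suspects = []
    for i, label in enumerate(labels):
        # Count how many classifiers disagree with the given label.
        wrong_votes = sum(1 for preds in predictions if preds[i] != label)
        # Strict majority: more than half of the ensemble must disagree.
        if wrong_votes > n_classifiers / 2:
            suspects.append(i)
    return suspects


# Toy example with three classifiers and four instances.
given_labels = [0, 1, 1, 0]
ensemble_predictions = [
    [0, 1, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 0],
]
suspect_indices = majority_vote_noise(given_labels, ensemble_predictions)
print(suspect_indices)  # [2]
```

Only the third instance is flagged, because all three classifiers predict 0 while its given label is 1. The other disagreements involve a single classifier each and do not reach a majority.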
- Python >= 3.7.1
- NumPy >= 1.15.4
- SciPy >= 1.1.0
- pandas >= 0.23.4
- scikit-learn >= 0.20.1
- matplotlib >= 3.0.1
- Clone this repository to your machine
- Download and install all the requirements listed above in the given order
- Download the listed datasets in .txt format
- Place all these files inside the data/ folder
- Change their file extensions to .csv
- Rename the files to match the names used in the ConfigHelper function get_datasets
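The renaming steps above can be scripted. The following is a minimal sketch, assuming that only the file extension and the file name need to change (it does not touch file contents or delimiters). The helper `rename_txt_to_csv` and its `name_map` argument are hypothetical; the target names must be taken from `get_datasets`, which is not reproduced here.

```python
from pathlib import Path
from typing import Dict, List, Optional


def rename_txt_to_csv(
    data_dir: str,
    name_map: Optional[Dict[str, str]] = None,
) -> List[Path]:
    """Rename every .txt file in data_dir to .csv and return the new paths.

    name_map is a hypothetical {original stem: target stem} mapping; fill it
    with the names that get_datasets expects. Files without an entry keep
    their stem and only change extension.
    """
    name_map = name_map or {}
    renamed = []
    for txt_path in sorted(Path(data_dir).glob("*.txt")):
        stem = name_map.get(txt_path.stem, txt_path.stem)
        csv_path = txt_path.with_name(stem + ".csv")
        if csv_path.exists():
            # Never overwrite an existing file; leave the .txt in place.
            continue
        txt_path.rename(csv_path)
        renamed.append(csv_path)
    return renamed
```

A call such as `rename_txt_to_csv("data", {"transfusion": "blood"})` would rename `data/transfusion.txt` to `data/blood.csv`; both names in this example are illustrative only.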
- Go to the project's main folder in your local repository
- Run this first command to generate all metrics
python main.py
- Run this second command to aggregate these metrics and generate the final error table and graphics
python aggregate.py
.
├── data # Dataset files
├── results # Result files
├── src # Source code files
│   ├── aggregate.py
│   ├── main.py
│   ├── majority_filtering.py
│   ├── noise_detection_ensemble.py
│   ├── config_helper.py
│   ├── data_helper.py
│   ├── io_helper.py
│   └── metrics_helper.py
├── LICENSE.md
└── README.md
This project is licensed under the MIT License - see the LICENSE.md file for details.