These are our exercises in implementing classification algorithms in Python using scikit-learn.
The code is written for Python 3.6.7. Dependencies are listed in the requirements.txt
file. You may also have to install tkinter, as follows:
$ # on a Debian-based OS such as Ubuntu
$ sudo apt-get install python3-tk
Image is from developers.google.com
See DOCUMENT.md for more information.
To run this program without installing Python 3 and the other dependencies, you can use our Docker image.
$ docker pull ahmdrz/spam-classifier:latest
$ docker run ahmdrz/spam-classifier:latest
We used the standard dataset named spambase. You can find it in the dataset directory of our repository. This program supports any arff dataset whose class label is in the last column.
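As a rough illustration of the "class label in the last column" convention, the sketch below reads a tiny ARFF document and splits it into features and labels. It assumes `scipy.io.arff.loadarff` as the reader, which may differ from what this repository actually uses; the `load_arff` helper, the `ARFF_TEXT` sample, and its attribute names are hypothetical and not taken from spambase.

```python
import io

import numpy as np
from scipy.io import arff

# A tiny made-up ARFF document standing in for a real file. The attribute
# names and values are illustrative only (assumption, not spambase data).
ARFF_TEXT = """@relation toy
@attribute word_freq numeric
@attribute char_freq numeric
@attribute class {0,1}
@data
0.1,0.5,0
0.9,0.2,1
0.4,0.4,0
"""


def load_arff(handle):
    """Return (X, y) from an ARFF file-like object whose label is the last column."""
    data, meta = arff.loadarff(handle)
    names = meta.names()
    # Every attribute except the last one is treated as a numeric feature.
    X = np.column_stack([data[name].astype(float) for name in names[:-1]])
    # scipy returns nominal values as bytes, so decode them to str.
    y = np.array([value.decode("utf-8") for value in data[names[-1]]])
    return X, y


X, y = load_arff(io.StringIO(ARFF_TEXT))
print(X.shape, y)
```

For a file on disk, the same helper could be called with an open text-mode file handle instead of the `io.StringIO` object.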
- kNN
- Naive bayes
- Decision tree
- SVM
- Random forest
TODO: add neural networks
The results contain the confusion matrix and the accuracy of each algorithm, and are written to the results directory.
Accuracy graph | Confusion matrix for kNN with k=6 |
---|---|
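The two reported metrics can be computed with scikit-learn roughly as follows. This is a minimal sketch, not the repository's own evaluation code: synthetic data from `make_classification` stands in for spambase, and the 70/30 split and `random_state` values are assumptions chosen only to make the example reproducible.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic two-class data stands in for spambase (assumption).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# kNN with k=6, matching the configuration used for the confusion matrix figure.
model = KNeighborsClassifier(n_neighbors=6).fit(X_train, y_train)
y_pred = model.predict(X_test)

# Rows of the matrix are true classes, columns are predicted classes.
matrix = confusion_matrix(y_test, y_pred)
accuracy = accuracy_score(y_test, y_pred)

print(matrix)
print("accuracy: %.3f" % accuracy)
```

The accuracy equals the sum of the diagonal of the confusion matrix divided by the number of test samples, so the two outputs can be checked against each other.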
The configuration of each classifier is listed below:
- n_neighbors in kNN: 6
- C in SVC: 2.0
- n_estimators in RandomForest: 6
- all other parameters were left at their default values.
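Expressed in scikit-learn, the configuration above might look like the sketch below. The `classifiers` dictionary and its keys are illustrative names, and `GaussianNB` is an assumption, since the Naive Bayes variant is not specified here.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Only the parameters listed above are set; everything else keeps the
# scikit-learn defaults of the installed version.
classifiers = {
    "kNN": KNeighborsClassifier(n_neighbors=6),
    "SVM": SVC(C=2.0),
    "Random-Forest": RandomForestClassifier(n_estimators=6),
    "Naive-Bayes": GaussianNB(),  # assumption: the variant is not stated
    "Decision-Tree": DecisionTreeClassifier(),
}

for name, clf in classifiers.items():
    print(name, type(clf).__name__)
```

Each estimator exposes the same `fit` and `predict` methods, so a single loop over the dictionary is enough to train and evaluate all of them.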
We used confusion_matrix_pretty_print.py to generate this figure.
kNN | SVM | Naive-Bayes | Random-Forest | Decision-Tree |
---|---|---|---|---|
- Nastaran Kiani (@Nastarankiani)
- Ahmadreza Zibaei (@ahmdrz)