Repository containing the final project for the Methods of Data Mining course of the EIT Digital Data Science master's programme at Aalto University 📚.
The objective of this project is to test different clustering algorithms applied to two datasets:
- genedata: contains the encoding of a genetic sequence.
- msdata: contains the results of a mass spectrometry analysis.
The quality of each clustering is evaluated using the normalized mutual information (NMI) score.
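As an illustration of the metric, here is a minimal pure-Python sketch of NMI. It is not taken from the project scripts: the function name `nmi` is hypothetical, and the normalization by the arithmetic mean of the two entropies is an assumption (other normalizations exist, and the scripts may use a library implementation instead).

```python
from collections import Counter
from math import log


def nmi(labels_true, labels_pred):
    """Normalized mutual information between two labelings.

    Assumption: the mutual information is normalized by the arithmetic
    mean of the two label entropies.
    """
    n = len(labels_true)
    if n == 0 or n != len(labels_pred):
        raise ValueError("labelings must be non-empty and of equal length")

    # Cluster sizes in each labeling and sizes of their intersections.
    count_true = Counter(labels_true)
    count_pred = Counter(labels_pred)
    joint = Counter(zip(labels_true, labels_pred))

    def entropy(counts):
        # Shannon entropy of the cluster-size distribution.
        return -sum((c / n) * log(c / n) for c in counts.values())

    h_true = entropy(count_true)
    h_pred = entropy(count_pred)

    # Mutual information computed from the contingency counts.
    mi = sum(
        (c / n) * log(c * n / (count_true[a] * count_pred[b]))
        for (a, b), c in joint.items()
    )

    mean_entropy = (h_true + h_pred) / 2
    if mean_entropy == 0:
        # Both labelings put every point in a single cluster.
        return 1.0
    return mi / mean_entropy
```

The score does not depend on how clusters are named: `nmi([0, 0, 1, 1], [1, 1, 0, 0])` gives a perfect score because the two partitions are identical up to relabeling, while `nmi([0, 0, 1, 1], [0, 1, 0, 1])` gives zero because the partitions are independent.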
The clustering experiments are implemented in separate Python scripts:
- src/genedata: analysis done for the first dataset.
- src/msdata: analysis done for the second dataset.
- Cristian Abrante - CristianAbrante
- José González - jgl2000