Data Quality Analyzer

This program aims to show how to analyze the quality of a provided dataset and run preliminary analyses and comparisons on chunks of data, in order to prepare the data for data mining tasks.

We were provided with an almost perfect dataset (no missing values, no duplicated rows, etc.), so we had to dirty it a little before feeding it to the web application for analysis.

See:

Usage

To see how we dirtied our datasets and how we created the quality attributes, run this notebook:

jupyter notebook Orginal-Data/File\ Conversion.ipynb 
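The notebook is the reference for what was actually done. As a rough illustration only, the idea of dirtying a clean dataset and then measuring its quality could be sketched as below. The function names (`dirty_rows`, `completeness`, `uniqueness`), the rates, and the sample rows are all hypothetical. Completeness and uniqueness are common data quality dimensions, chosen here as an assumption; they are not necessarily the quality attributes the notebook computes.

```python
import random


def dirty_rows(rows, missing_rate=0.1, duplicate_rate=0.05, seed=0):
    """Return a dirtied copy of rows: some cells set to None, some rows duplicated.

    Hypothetical helper for illustration; not taken from the project notebook.
    """
    rng = random.Random(seed)  # seeded so the dirtying is reproducible
    dirty = []
    for row in rows:
        # Blank each cell independently with probability missing_rate.
        new_row = {
            key: (None if rng.random() < missing_rate else value)
            for key, value in row.items()
        }
        dirty.append(new_row)
        # Append an exact copy of the row with probability duplicate_rate.
        if rng.random() < duplicate_rate:
            dirty.append(dict(new_row))
    return dirty


def completeness(rows):
    """Fraction of cells that are not missing (None)."""
    cells = [value for row in rows for value in row.values()]
    if not cells:
        return 1.0
    return sum(value is not None for value in cells) / len(cells)


def uniqueness(rows):
    """Fraction of rows that are distinct."""
    if not rows:
        return 1.0
    distinct = {tuple(sorted(row.items(), key=lambda kv: kv[0])) for row in rows}
    return len(distinct) / len(rows)


# Made-up sample data: 100 clean rows with distinct ids.
clean = [{"id": i, "city": "Milano", "value": i * 1.5} for i in range(100)]
dirty = dirty_rows(clean, missing_rate=0.1, duplicate_rate=0.05, seed=42)

print("clean:", completeness(clean), uniqueness(clean))
print("dirty:", completeness(dirty), uniqueness(dirty))
```

Seeding the random generator keeps the dirtied dataset identical across runs, which makes quality comparisons between chunks repeatable.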

To start the webapp that contains the Data Quality Analyzer, simply run:

python webapp.py

Note: This command also starts a Flask server on port 5000. To access it, open the following page in your browser:

http://localhost:5000/query

Built with


Students: Giacomo Astolfi, Leonardo Febbo. Project for the course 'Data and Information Quality' held at Politecnico di Milano by Prof. Cinzia Cappiello, A.Y. 2018/2019.