Data Profiler is a Streamlit app designed to provide insightful data analysis and visualization. Users can upload their datasets in '.csv' or '.xlsx' format, and the app generates a comprehensive profiling report using the YData Profiling library.

srinibas-masanta/Streamlit-Dataprofile

Data Profiler

Data Profiler is a web application built with Streamlit for analyzing and visualizing datasets. Upload your data in .csv or .xlsx format, and the app generates a profiling report with the YData Profiling library to help you detect anomalies, patterns, and trends in your data.

Features

  • Automated Data Analysis: Quickly generate detailed profiling reports by uploading your dataset.
  • Customizable Reports: Choose between different display modes, including Primary, Dark, and Orange.
  • Support for Multiple Formats: Upload .csv or .xlsx files (up to 10 MB) for analysis.
  • Interactive UI: Easy-to-use interface with options to select specific sheets for .xlsx files.
  • Downloadable Reports: Save the profiling report as an HTML file for offline analysis.
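The display modes and the minimal/full choice can be illustrated with a short sketch. This is not the code in app.py; it assumes the three modes map onto the `dark_mode` and `orange_mode` flags of `ProfileReport` in ydata-profiling, with Primary as the default theme. The helper names `report_options` and `build_report_html` and the report title are hypothetical.

```python
def report_options(mode="Primary", minimal=False):
    """Translate the UI choices into keyword arguments for ProfileReport."""
    options = {"minimal": minimal}
    if mode == "Dark":
        options["dark_mode"] = True
    elif mode == "Orange":
        options["orange_mode"] = True
    elif mode != "Primary":
        # Primary is assumed to be the library's default theme, so it adds no flag.
        raise ValueError(f"Unknown report mode: {mode!r}")
    return options


def build_report_html(df, mode="Primary", minimal=False):
    """Profile a pandas DataFrame and return the report as an HTML string."""
    # Imported here so report_options() stays usable without the library installed.
    from ydata_profiling import ProfileReport

    report = ProfileReport(df, title="Data Profile", **report_options(mode, minimal))
    return report.to_html()
```

The returned HTML string could then be written to disk or offered as a download, which is one plausible way to support the downloadable-report feature.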

Installation

To run the Data Profiler application on your local machine, follow the steps below:

1. Clone the Repository

git clone https://github.com/srinibas-masanta/data-profiler.git
cd data-profiler

2. Set Up a Virtual Environment

Create and activate a virtual environment to manage dependencies.

python -m venv dataprofile
# Activate the environment with the one command that matches your platform:
.\dataprofile\Scripts\activate   # Windows
source dataprofile/bin/activate  # macOS/Linux

3. Install Dependencies

Install the required Python packages listed in the requirements.txt file.

pip install -r requirements.txt

Alternatively, manually install the necessary packages:

pip install numpy pandas scipy matplotlib streamlit ydata-profiling streamlit-pandas-profiling openpyxl xlrd
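After installing, a quick check that each dependency is importable can save a failed launch. The sketch below is not part of the repository. It assumes the import names these packages commonly expose (for example `ydata_profiling` for `ydata-profiling`), which may differ between versions, and the helper name `missing_modules` is hypothetical.

```python
from importlib.util import find_spec

# Import names assumed to correspond to the packages in the pip command above.
REQUIRED_MODULES = [
    "numpy", "pandas", "scipy", "matplotlib", "streamlit",
    "ydata_profiling", "streamlit_pandas_profiling", "openpyxl", "xlrd",
]


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    print("All dependencies found." if not missing else f"Missing: {', '.join(missing)}")
```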

4. Run the Application

Start the Streamlit application by running the following command:

streamlit run app.py

Usage

Once the application is running, follow these steps:

  1. Upload Your Data: Use the sidebar to upload a .csv or .xlsx file (up to 10 MB).
  2. Select Options: Choose the report mode (Primary, Dark, or Orange) and whether to generate a minimal or a full report.
  3. Generate Report: Click the generate button; the report is displayed within the app.
  4. Download Report (Optional): If desired, save the report as an HTML file using the download button.
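The upload step can be sketched as a validation helper followed by a loader. This is an illustration under stated assumptions, not the logic in app.py: it treats "10 MB" as 10 MiB, and the names `validate_upload`, `load_dataframe`, `MAX_UPLOAD_BYTES`, and `ALLOWED_SUFFIXES` are hypothetical.

```python
from pathlib import Path

MAX_UPLOAD_BYTES = 10 * 1024 * 1024  # assumption: "10 MB" is read as 10 MiB
ALLOWED_SUFFIXES = {".csv", ".xlsx"}


def validate_upload(filename, size_bytes):
    """Return the lowercase file suffix, or raise ValueError if the upload is rejected."""
    suffix = Path(filename).suffix.lower()
    if suffix not in ALLOWED_SUFFIXES:
        raise ValueError(f"Unsupported file type: {suffix or 'none'}")
    if size_bytes > MAX_UPLOAD_BYTES:
        raise ValueError("File exceeds the 10 MB limit")
    return suffix


def load_dataframe(file, suffix, sheet_name=0):
    """Read the uploaded file into a DataFrame; sheet_name applies to .xlsx only."""
    import pandas as pd  # deferred so validate_upload() needs only the standard library

    if suffix == ".csv":
        return pd.read_csv(file)
    return pd.read_excel(file, sheet_name=sheet_name)
```

In a Streamlit script, the object returned by `st.file_uploader` exposes `name` and `size` attributes that could be passed to `validate_upload`, and `pd.ExcelFile(file).sheet_names` is one way to list the sheets offered for selection.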

Project Structure

  • app.py: Main script containing the Streamlit application logic.
  • media/DP Logo.jpg: Logo shown on the application's welcome page.
  • requirements.txt: List of all the Python dependencies required to run the application.

License

This project is licensed under the MIT License - see the LICENSE.txt file for details.
