
🔍✨ A machine learning project that predicts income based on various demographic factors using Random Forest and Gradient Boosting algorithms. Includes data preprocessing, hyperparameter tuning, and model evaluation with detailed performance metrics. 📊🤖


Armanx200/Income-Predictor


🌟 Income Predictor 🌟

Welcome to the Income Predictor repository! This project uses machine learning to predict income from the demographic features in the dataset. It relies on two algorithms: Random Forest and Gradient Boosting.

🚀 Overview

This project preprocesses the dataset (adult.csv), converting categorical columns to numerical values. After splitting the data into training and testing sets, we train a Random Forest model and a Gradient Boosting model. The Random Forest model reaches an accuracy of 0.86.
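The workflow described above can be sketched roughly as follows. This is a minimal illustration, not the code in Income_Predictor.py: it builds a small synthetic table instead of reading adult.csv. The column names, the "?" missing-value marker, and the split and model parameters are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for adult.csv; these column names are hypothetical.
rng = np.random.default_rng(0)
n = 400
df = pd.DataFrame({
    "age": rng.integers(18, 70, size=n),
    "hours_per_week": rng.integers(10, 60, size=n),
    "workclass": rng.choice(["Private", "Self-emp", "Gov", "?"], size=n),
})
# Toy target that depends on age and hours, so the models have signal to learn.
df["income"] = np.where(df["age"] + df["hours_per_week"] > 85, ">50K", "<=50K")

# Assumption: "?" marks a missing value; drop incomplete rows (one simple strategy).
df = df.replace("?", np.nan).dropna()

# Convert the categorical column to numeric indicator columns.
X = pd.get_dummies(df.drop(columns="income"), columns=["workclass"], dtype=int)
y = (df["income"] == ">50K").astype(int)

# Hold out 20% of the rows for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

models = {
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
    "Gradient Boosting": GradientBoostingClassifier(random_state=42),
}
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))
    print(f"{name} accuracy: {scores[name]:.2f}")
```

The accuracies printed here come from toy data and say nothing about the 0.86 reported for the real dataset.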

📊 Results

Random Forest Accuracy: 0.86

Feature Importances (plotted in Figure.png)
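As a rough illustration of where such a plot comes from, a fitted scikit-learn Random Forest exposes a `feature_importances_` array that can be ranked. The sketch below uses synthetic data and made-up feature names, and it is only an assumption that Figure.png was produced in a similar way.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data; the feature names are placeholders, not the adult.csv columns.
X, y = make_classification(
    n_samples=300, n_features=5, n_informative=2, n_redundant=0, random_state=0
)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Importances are normalized to sum to 1; sort them from most to least important.
importances = pd.Series(model.feature_importances_, index=feature_names)
importances = importances.sort_values(ascending=False)
print(importances)
```

Calling `importances.plot.barh()` (with matplotlib installed) would turn the ranked series into a bar chart.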

🛠️ How to Use

  1. Clone the repository:
    git clone https://github.com/Armanx200/Income-Predictor.git
  2. Navigate to the project directory:
    cd Income-Predictor
  3. Install the required libraries:
    pip install -r requirements.txt
  4. Run the predictor:
    python Income_Predictor.py

📁 File Structure

  • Income_Predictor.py: Main script for training and evaluating the models.
  • adult.csv: The dataset used for training.
  • requirements.txt: List of required libraries for the project.
  • Figure.png: Plot showing feature importances of the model.

💡 Features

  • Data Preprocessing: Handles missing values and encodes categorical variables.
  • Model Training: Trains both Random Forest and Gradient Boosting models.
  • Hyperparameter Tuning: Uses GridSearchCV for finding the best hyperparameters.
  • Model Evaluation: Provides accuracy, classification report, and confusion matrix.
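For readers unfamiliar with GridSearchCV, the tuning step listed above might look something like this. It is a sketch on synthetic data: the parameter grid, the number of folds, and the scoring choice are illustrative assumptions, not the values used in Income_Predictor.py.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic data standing in for the preprocessed dataset.
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=4, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Hypothetical, deliberately small grid so the search runs quickly.
param_grid = {"n_estimators": [25, 50], "max_depth": [3, None]}

# Try every combination with 3-fold cross-validation on the training set.
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=3,
    scoring="accuracy",
)
search.fit(X_train, y_train)

print("Best parameters:", search.best_params_)
print("Best cross-validated accuracy:", round(search.best_score_, 3))

# By default the search refits the best model on all training data,
# so it can score the held-out test set directly.
test_accuracy = search.score(X_test, y_test)
print("Test accuracy:", round(test_accuracy, 3))
```

Keeping the test set out of the search means the final accuracy is measured on data the tuning never saw.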

🤖 Models Used

  • Random Forest Classifier
  • Gradient Boosting Classifier

📈 Performance Metrics

  • Random Forest:
    • Accuracy: 0.86
    • Detailed classification report and confusion matrix available in the output.
  • Gradient Boosting:
    • No accuracy is reported here; run Income_Predictor.py to see the performance metrics.
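To show what accuracy, the classification report, and the confusion matrix look like, here is a small sketch using scikit-learn's metric functions on hand-written labels. The labels and class names are invented for the example and are not output from the project's models.

```python
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

# Invented labels: 0 stands for "<=50K", 1 for ">50K".
y_true = [0, 0, 1, 1, 0, 1, 0, 0]
y_pred = [0, 0, 1, 0, 0, 1, 1, 0]

# Fraction of predictions that match the true label: 6 of 8.
acc = accuracy_score(y_true, y_pred)
print("Accuracy:", acc)  # 0.75

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_true, y_pred)
print(cm)  # [[4 1]
           #  [1 2]]

# Per-class precision, recall, F1 score, and support.
report = classification_report(y_true, y_pred, target_names=["<=50K", ">50K"])
print(report)
```

Reading the matrix: four low-income rows and two high-income rows were classified correctly, with one error in each direction.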

🔧 Future Enhancements

  • Add more models to compare.
  • Perform more extensive hyperparameter tuning.
  • Implement advanced feature engineering techniques.

📬 Contact

For any questions or suggestions, feel free to reach out:


Made with ❤️ by Armanx200
