Email-Spam-Detection

This repository shows how to classify an email as "spam" or "ham" (not spam) using machine learning algorithms.

It is a simple email spam detection project built with Scikit-Learn, Python's machine learning library. The project uses the Logistic Regression algorithm to classify emails as spam or not spam.

Dataset

This project uses the spam.csv file, which contains 5,572 emails in two columns, Category and Message. The Category column denotes whether an email is spam or not, and the Message column contains the text of the email.
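As a rough sketch of how a file with this layout could be loaded, the snippet below reads a two-column CSV with Pandas and adds a numeric label. The four sample rows are made up so the example runs without the real file, the `load_messages` helper and the `label` column are hypothetical names, and the assumption that Category holds the strings "spam" and "ham" comes from the project description rather than from inspecting spam.csv.

```python
import io

import pandas as pd

# Tiny in-memory stand-in for spam.csv (made-up rows, same two columns).
SAMPLE_CSV = """Category,Message
ham,"Are we still meeting for lunch today?"
spam,"WINNER! Claim your free prize now, reply YES"
ham,"Can you send me the report by Friday?"
spam,"Free entry: win cash, text WIN to claim"
"""


def load_messages(source):
    """Read the two-column CSV and add a numeric label (1 = spam, 0 = ham)."""
    df = pd.read_csv(source)
    # Drop rows where either column is missing.
    df = df.dropna(subset=["Category", "Message"])
    # Assumption: Category contains the strings "spam" and "ham".
    df["label"] = (df["Category"].str.strip().str.lower() == "spam").astype(int)
    return df


df = load_messages(io.StringIO(SAMPLE_CSV))
print(df["Category"].value_counts())
```

With the real dataset, `load_messages("spam.csv")` would take the place of the in-memory sample, assuming the file sits in the working directory.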

Dependencies

  • NumPy
  • Pandas
  • Scikit-Learn

Installation

To run this project, you need Python installed on your machine along with the dependencies listed above.

Usage

  1. Clone the repository
  2. Install the dependencies
  3. Run the script email_spam_detection.py

Algorithm

The algorithm used in this project is Logistic Regression. It is a classification algorithm that estimates the probability that an observation belongs to each of a discrete set of classes and assigns the most likely one. In this case, the algorithm classifies emails as spam or not spam based on the text of the email.
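The idea described above can be sketched with Scikit-Learn as follows. This is a minimal illustration, not the contents of email_spam_detection.py: the training messages are made up, and the choice of `CountVectorizer` (plain word counts) to turn text into numbers is an assumption, since the description does not say how the script extracts features.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Made-up training messages; 1 = spam, 0 = ham.
messages = [
    "Win a free prize now, claim your cash reward",
    "Free entry in our weekly cash draw, text WIN",
    "Urgent: your account won a free gift card",
    "Claim your free ringtone now, reply YES",
    "Are we still meeting for lunch today?",
    "Please send me the report by Friday",
    "I will call you after the meeting",
    "Thanks for the notes, see you tomorrow",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

# Assumption: word counts as features; the project may use something else.
model = make_pipeline(CountVectorizer(), LogisticRegression(max_iter=1000))
model.fit(messages, labels)

new_messages = ["Claim your free cash prize", "See you at the meeting tomorrow"]
predictions = model.predict(new_messages)
# One row per message; columns follow model.classes_, i.e. [ham, spam].
probabilities = model.predict_proba(new_messages)

for text, label, proba in zip(new_messages, predictions, probabilities):
    print(f"{text!r}: label={label}, P(spam)={proba[1]:.2f}")
```

Wrapping both steps in one pipeline means the vectorizer is fitted only on the training text and then reused unchanged for new messages.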

How to improve the project

  • Try different machine learning algorithms like Naive Bayes, SVM, or Decision Tree
  • Preprocess the data by removing stop words, stemming or lemmatizing the text
  • Use different feature extraction techniques like Bag of Words, TF-IDF, or Word2Vec
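Some of these suggestions can be tried with very little code. The sketch below compares Naive Bayes, a linear SVM, and a Decision Tree on TF-IDF features with English stop words removed. It covers only part of the list: stemming, lemmatizing, and Word2Vec need libraries beyond the listed dependencies and are left out. The messages are made up, and scores from eight examples say nothing about real performance; the point is only the shape of the comparison.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC
from sklearn.tree import DecisionTreeClassifier

# Made-up messages; 1 = spam, 0 = ham.
messages = [
    "Win a free prize now, claim your cash reward",
    "Free entry in our weekly cash draw, text WIN",
    "Urgent: your account won a free gift card",
    "Claim your free ringtone now, reply YES",
    "Are we still meeting for lunch today?",
    "Please send me the report by Friday",
    "I will call you after the meeting",
    "Thanks for the notes, see you tomorrow",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

candidates = {
    "naive_bayes": MultinomialNB(),
    "svm": LinearSVC(),
    "decision_tree": DecisionTreeClassifier(random_state=0),
}

scores = {}
for name, classifier in candidates.items():
    # TF-IDF weighting with built-in English stop-word removal.
    pipeline = make_pipeline(TfidfVectorizer(stop_words="english"), classifier)
    # Mean accuracy over 2 stratified folds (tiny data, illustration only).
    scores[name] = cross_val_score(pipeline, messages, labels, cv=2).mean()

for name, score in scores.items():
    print(f"{name}: mean accuracy {score:.2f}")
```

On the full dataset, a larger `cv` value and a metric such as precision or recall for the spam class would give a more useful comparison than accuracy alone.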