Add Password Strength Checker script with ML and NN (#349)
* Add password strength checker source code

* Update README.md

* Fix README.md

* Fix typo in model README.md
iHaz32 authored Oct 21, 2024
1 parent 88ee0ad commit 477f102
Showing 11 changed files with 262 additions and 0 deletions.
27 changes: 27 additions & 0 deletions Password Strength Checker/README.md
@@ -0,0 +1,27 @@
# Password Strength Checker

## Description
A password strength checker that utilizes machine learning to classify the strength of passwords. This project provides a simple interface for users to input their passwords and receive feedback on their strength based on various criteria.
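
The "various criteria" are the numeric features the scripts in this commit derive from each password. As a rough, dependency-free sketch of that idea (the function names `shannon_entropy`, `count_sequential_runs`, and `extract_features` are illustrative and not part of the project; the helpers in `model/utils/functions.py` remain the reference):

```python
import math
import string
from collections import Counter


def shannon_entropy(password):
    """Shannon entropy (bits per character) of the character distribution."""
    if not password:
        return 0.0
    total = len(password)
    counts = Counter(password)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def count_sequential_runs(password):
    """Count how many distinct 3-character runs ('abc', '123', ...) appear."""
    runs = [string.ascii_lowercase[i:i + 3] for i in range(24)]  # 'abc' .. 'xyz'
    runs += [string.digits[i:i + 3] for i in range(8)]           # '012' .. '789'
    return sum(run in password for run in runs)


def extract_features(password):
    """Build a feature dictionary similar in spirit to the project's features."""
    return {
        "length": len(password),
        "lowercase_count": sum(c.islower() for c in password),
        "uppercase_count": sum(c.isupper() for c in password),
        "digit_count": sum(c.isdigit() for c in password),
        "special_count": sum(not c.isalnum() for c in password),
        "entropy": shannon_entropy(password),
        "repetitive_count": sum(a == b for a, b in zip(password, password[1:])),
        "sequential_count": count_sequential_runs(password),
    }


features = extract_features("abc123")
```

For example, `shannon_entropy("aabb")` is 1 bit per character, because the two characters are equally likely.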

## Features
- Classifies password strength into three categories: weak (0), moderate (1), and strong (2).
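
`main.py` prints the class as a bare number (0, 1, or 2). A caller that wants a readable label could map it as in this small sketch; `STRENGTH_LABELS` and `describe_strength` are hypothetical names, not part of the commit:

```python
# Hypothetical label table based on the comment in main.py:
# 0 - weak, 1 - moderate, 2 - strong.
STRENGTH_LABELS = {0: "weak", 1: "moderate", 2: "strong"}


def describe_strength(predicted_class):
    """Return a human-readable label for a predicted class index."""
    return STRENGTH_LABELS.get(int(predicted_class), "unknown")


label = describe_strength(2)
```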

## Installation
1. Clone the repository and change into the project directory:
   ```bash
   git clone https://github.com/DhanushNehru/Python-Scripts
   cd "Python-Scripts/Password Strength Checker"
   ```

2. Create and activate a virtual environment:
   ```bash
   python3 -m venv venv
   source venv/bin/activate  # On Windows use `venv\Scripts\activate`
   ```

3. Install the required packages:
   ```bash
   pip install -r requirements.txt
   ```

## Usage
To run the password strength checker from the project directory:
```bash
python main.py
```
8 changes: 8 additions & 0 deletions Password Strength Checker/main.py
@@ -0,0 +1,8 @@
from model.model import predict # import model

def main():
    password_to_test = input("Enter a password to check its strength: ")  # get password from terminal
    predicted_class = int(predict(password_to_test)[0])  # predict() returns a one-element array; take its only item
    print(f"Password strength classification: {predicted_class} / 2")  # output 0 - weak, 1 - moderate, or 2 - strong


if __name__ == "__main__":
    main()
15 changes: 15 additions & 0 deletions Password Strength Checker/model/README.md
@@ -0,0 +1,15 @@
# Password Strength Classification Model

## Overview
This model is designed to evaluate the strength of passwords using machine learning techniques. It analyzes input passwords and classifies them based on their strength, providing feedback for users to create stronger passwords.

## Model Architecture
- **Input Layer**: The model accepts eight numeric features derived from the password (length, character-class counts, entropy, and counts of repeated and sequential characters).
- **Dense Layers**: Two dense layers (128 and 64 units) with ReLU activation process the input features.
- **Output Layer**: The final softmax layer outputs one score per class, and the highest score gives the password strength (weak - 0, medium - 1, strong - 2).
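
To illustrate how dense layers with ReLU feed a softmax output and how the class is read off, here is a toy forward pass in plain Python. The weights, the layer sizes (2 features, 2 hidden units, 3 classes), and all function names are made up for illustration; the real model is built with Keras in `model.py` and uses trained weights.

```python
import math


def dense(inputs, weights, biases):
    """Fully connected layer: weights[j] holds the input weights of unit j."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    """ReLU activation: negative values become zero."""
    return [max(0.0, v) for v in values]


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]  # subtract the max for stability
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical toy parameters (not the trained weights).
hidden_w = [[1.0, -1.0], [0.5, 0.5]]
hidden_b = [0.0, 0.0]
out_w = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
out_b = [0.0, 0.0, 0.0]


def forward(features):
    hidden = relu(dense(features, hidden_w, hidden_b))
    return softmax(dense(hidden, out_w, out_b))


probs = forward([2.0, 1.0])
predicted_class = max(range(len(probs)), key=probs.__getitem__)  # argmax
```

With these toy weights the second class (index 1) gets the highest probability, which mirrors the `argmax` step in `predict`.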

## Training
- The model is trained on a labeled dataset of passwords classified by strength.
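
Before training, the scripts in this commit standardize the feature columns (scikit-learn's `StandardScaler`) and one-hot encode the labels (Keras's `to_categorical`). The sketch below shows those two transformations in plain Python on made-up values; `standardize` and `one_hot` are illustrative names and the sample data is hypothetical.

```python
import statistics


def standardize(column):
    """Scale values to zero mean and unit variance (population std deviation)."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)
    if std == 0:
        return [0.0 for _ in column]  # a constant column carries no signal
    return [(value - mean) / std for value in column]


def one_hot(labels, num_classes):
    """Encode integer class labels as one-hot rows."""
    return [[1.0 if index == label else 0.0 for index in range(num_classes)]
            for label in labels]


lengths = [6, 8, 10]  # hypothetical 'length' feature values
labels = [0, 1, 2]    # hypothetical strength labels

scaled = standardize(lengths)
targets = one_hot(labels, 3)
```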

## Future improvements
- In feature engineering, add columns that capture whether a password is a commonly used password (e.g. 'password') or contains commonly used words, and take them into account during model training.
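
One possible shape for such features, as a sketch only: the word lists, the feature names, and the function `common_password_features` are hypothetical, and a real implementation would load a much larger list from a file.

```python
# Hypothetical, deliberately tiny lists used only for illustration.
COMMON_PASSWORDS = {"password", "123456", "qwerty", "letmein"}
COMMON_WORDS = ("password", "admin", "love", "dragon")


def common_password_features(password):
    """Return two extra feature columns for a single password."""
    lowered = password.lower()
    return {
        # 1 if the whole password is a known common password, else 0
        "is_common_password": int(lowered in COMMON_PASSWORDS),
        # number of common words that appear anywhere in the password
        "common_word_count": sum(1 for word in COMMON_WORDS if word in lowered),
    }


example = common_password_features("MyAdminLove99")
```

Adding columns like these would also require refitting the scaler and retraining the model, since the input width would no longer be eight.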
Binary file not shown.
68 changes: 68 additions & 0 deletions Password Strength Checker/model/model.py
@@ -0,0 +1,68 @@
# disable debugging messages
def warn(*args, **kwargs):
    pass
import warnings
warnings.warn = warn
warnings.filterwarnings("ignore", category=DeprecationWarning)
import os
os.environ['TF_ENABLE_ONEDNN_OPTS'] = '0'
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '1'
from silence_tensorflow import silence_tensorflow
silence_tensorflow("WARNING")

import pandas as pd
import pickle

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from model.utils.functions import calculate_entropy, count_repeats, count_sequential
from model.utils.preprocessing import run_preprocessing
from model.utils.training import run_training

# run preprocessing and training
# run_preprocessing() # uncomment to run preprocessing
# run_training() # uncomment to train the model

def prepare_input(password):  # function to prepare input features from password
    # create a dataframe for a single input
    data = {
        'length': [len(password)],  # calculate password length
        'lowercase_count': [sum(c.islower() for c in password)],  # count lowercase characters
        'uppercase_count': [sum(c.isupper() for c in password)],  # count uppercase characters
        'digit_count': [sum(c.isdigit() for c in password)],  # count digits
        'special_count': [sum(not c.isalnum() for c in password)],  # count special characters
        'entropy': [calculate_entropy(password)],  # calculate entropy
        'repetitive_count': [count_repeats(password)],  # count repetitive characters
        'sequential_count': [count_sequential(password)]  # count sequential characters
    }

    with open('model/scaler.pkl', 'rb') as file:  # load the fitted scaler from file
        scaler = pickle.load(file)

    # convert to dataframe
    input_df = pd.DataFrame(data)

    # normalize using the previously fitted scaler
    normalized_input = scaler.transform(input_df)

    return pd.DataFrame(normalized_input, columns=input_df.columns)  # return normalized input as dataframe

def predict(password):  # function to predict password strength
    # rebuild the architecture used in training
    model = Sequential()  # create a sequential model
    model.add(Dense(128, activation='relu', input_shape=(8,)))  # add input layer with 128 neurons
    model.add(Dense(64, activation='relu'))  # add hidden layer with 64 neurons
    model.add(Dense(3, activation='softmax'))  # add output layer with softmax activation

    # load trained weights
    model.load_weights('model/deep_learning_model.h5')  # load weights from the trained model file

    # prepare the input
    password_to_test = password  # assign password to test
    input_features = prepare_input(password_to_test)  # prepare input features

    # make the prediction
    prediction = model.predict(input_features, verbose=0)  # predict using the model
    predicted_class = prediction.argmax(axis=-1)  # get the predicted class index (array of shape (1,))

    return predicted_class  # return the predicted class
Binary file added Password Strength Checker/model/scaler.pkl
Binary file not shown.
18 changes: 18 additions & 0 deletions Password Strength Checker/model/utils/functions.py
@@ -0,0 +1,18 @@
import numpy as np

def calculate_entropy(password):  # function to calculate the entropy of a password
    if len(password) == 0:  # check if the password is empty
        return 0  # return 0 for empty passwords
    char_counts = np.array(list(password))  # convert password to a numpy array of characters
    unique, counts = np.unique(char_counts, return_counts=True)  # get unique characters and their counts
    probabilities = counts / len(password)  # calculate the probability of each character
    entropy = -np.sum(probabilities * np.log2(probabilities))  # compute the Shannon entropy in bits per character
    return entropy  # return the calculated entropy


def count_repeats(password):  # function to count consecutive repeated characters in the password
    return sum(password[i] == password[i + 1] for i in range(len(password) - 1))  # count adjacent equal pairs


def count_sequential(password):  # function to count sequential characters in the password
    sequences = [''.join(chr(i) for i in range(start, start + 3)) for start in range(ord('a'), ord('z') - 1)]  # 'abc' .. 'xyz'
    # note: range(10) also yields '8910' and '91011' in addition to '012' .. '789';
    # the saved scaler and weights were produced with this behavior
    sequences += [''.join(str(i) for i in range(start, start + 3)) for start in range(10)]  # generate digit sequences
    return sum(1 for seq in sequences if seq in password)  # count how many of the sequences are in the password
32 changes: 32 additions & 0 deletions Password Strength Checker/model/utils/preprocessing.py
@@ -0,0 +1,32 @@
import pandas as pd
import pickle

from model.utils.functions import calculate_entropy, count_repeats, count_sequential
from sklearn.preprocessing import StandardScaler

def run_preprocessing():
    # import data
    dataframe = pd.read_csv('model/passwords.csv', on_bad_lines='skip')  # read csv data file
    dataframe = dataframe.dropna()  # remove rows with empty values
    dataframe = dataframe.drop_duplicates(subset='password')  # remove duplicates

    # add new columns
    dataframe['length'] = dataframe['password'].str.len()  # column for password length
    dataframe['lowercase_count'] = dataframe['password'].apply(lambda x: sum(c.islower() for c in x))  # column for amount of lowercase characters
    dataframe['uppercase_count'] = dataframe['password'].apply(lambda x: sum(c.isupper() for c in x))  # column for amount of uppercase characters
    dataframe['digit_count'] = dataframe['password'].apply(lambda x: sum(c.isdigit() for c in x))  # column for amount of digits
    dataframe['special_count'] = dataframe['password'].apply(lambda x: sum(not c.isalnum() for c in x))  # column for amount of special characters
    dataframe['entropy'] = dataframe['password'].apply(calculate_entropy)  # column for entropy
    dataframe['repetitive_count'] = dataframe['password'].apply(count_repeats)  # column for amount of repetitive characters
    dataframe['sequential_count'] = dataframe['password'].apply(count_sequential)  # column for amount of sequential characters

    scaler = StandardScaler()  # use standard scaler because there is a gaussian distribution in passwords.csv
    numerical_features = ['length', 'lowercase_count', 'uppercase_count', 'digit_count', 'special_count', 'entropy', 'repetitive_count', 'sequential_count']
    dataframe[numerical_features] = scaler.fit_transform(dataframe[numerical_features])

    # save scaler model for future use
    with open('model/scaler.pkl', 'wb') as file:
        pickle.dump(scaler, file)

    # save preprocessed data
    dataframe.to_csv('model/output.csv', index=False, header=True)
47 changes: 47 additions & 0 deletions Password Strength Checker/model/utils/training.py
@@ -0,0 +1,47 @@
# disable debugging messages
def warn(*args, **kwargs):
    pass
import warnings
warnings.warn = warn
warnings.filterwarnings("ignore", category=DeprecationWarning)
import os
os.environ['TF_ENABLE_ONEDNN_OPTS'] = '0'
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '1'
from silence_tensorflow import silence_tensorflow
silence_tensorflow("WARNING")

import pandas as pd

from sklearn.model_selection import train_test_split
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.utils import to_categorical


def run_training():  # function to run the training process
    dataframe = pd.read_csv('model/output.csv')  # load the processed data from output.csv

    # split the data into features and target variable
    X = dataframe[['length', 'lowercase_count', 'uppercase_count', 'digit_count', 'special_count', 'entropy', 'repetitive_count', 'sequential_count']]  # feature columns
    y = dataframe['strength']  # target variable

    # convert target variable to categorical
    y = to_categorical(y)  # convert labels to one-hot format for multi-class classification

    # split into training and test sets
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)  # 80-20 split

    # initialize the model
    model = Sequential()  # create a sequential model
    model.add(Dense(128, activation='relu', input_shape=(X_train.shape[1],)))  # add input layer with 128 neurons
    model.add(Dense(64, activation='relu'))  # add hidden layer with 64 neurons
    model.add(Dense(y.shape[1], activation='softmax'))  # add output layer with softmax activation

    # compile the model
    model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])  # compile the model with adam optimizer

    # train the model
    model.fit(X_train, y_train, epochs=50, batch_size=32, validation_split=0.2)  # fit the model on training data

    # save the model to a file
    model.save('model/deep_learning_model.h5')  # save the trained model
46 changes: 46 additions & 0 deletions Password Strength Checker/requirements.txt
@@ -0,0 +1,46 @@
absl-py==2.1.0
astunparse==1.6.3
certifi==2024.8.30
charset-normalizer==3.4.0
flatbuffers==24.3.25
gast==0.6.0
google-pasta==0.2.0
grpcio==1.67.0
h5py==3.12.1
idna==3.10
joblib==1.4.2
keras==3.6.0
libclang==18.1.1
Markdown==3.7
markdown-it-py==3.0.0
MarkupSafe==3.0.1
mdurl==0.1.2
ml-dtypes==0.4.1
namex==0.0.8
numpy==1.26.4
opt_einsum==3.4.0
optree==0.13.0
packaging==24.1
pandas==2.2.3
protobuf==4.25.5
Pygments==2.18.0
python-dateutil==2.9.0.post0
pytz==2024.2
requests==2.32.3
rich==13.9.2
scikit-learn==1.5.2
scipy==1.14.1
setuptools==75.2.0
silence_tensorflow==1.2.2
six==1.16.0
tensorboard==2.17.1
tensorboard-data-server==0.7.2
tensorflow-cpu==2.17.0
termcolor==2.5.0
threadpoolctl==3.5.0
typing_extensions==4.12.2
tzdata==2024.2
urllib3==2.2.3
Werkzeug==3.0.4
wheel==0.44.0
wrapt==1.16.0
1 change: 1 addition & 0 deletions README.md
@@ -101,6 +101,7 @@ More information on contributing and the general code of conduct for discussion
| OTP Verification | [OTP Verification](https://github.com/DhanushNehru/Python-Scripts/tree/master/OTP%20%20Verify) | An OTP Verification Checker. |
| Password Generator | [Password Generator](https://github.com/DhanushNehru/Python-Scripts/tree/master/Password%20Generator) | Generates a random password. |
| Password Manager | [Password Manager](https://github.com/nem5345/Python-Scripts/tree/master/Password%20Manager) | Generate and interact with a password manager. |
| Password Strength Checker | [Password Strength Checker](https://github.com/nem5345/Python-Scripts/tree/master/Password%20Strength%20Checker) | Evaluates how strong a given password is. |
| PDF Merger | [PDF Merger](https://github.com/DhanushNehru/Python-Scripts/tree/master/PDF%20Merger) |Merges multiple PDF files into a single PDF, with options for output location and custom order.|
| PDF to Audio | [PDF to Audio](https://github.com/DhanushNehru/Python-Scripts/tree/master/PDF%20to%20Audio) | Converts PDF to audio. |
| PDF to Text | [PDF to text](https://github.com/DhanushNehru/Python-Scripts/tree/master/PDF%20to%20text) | Converts PDF to text. |