Windows PE Malware Detection Using Ensemble Learning

Introduction

Malware detection is the process of determining whether malware is present on a system, or whether a given program is malicious or harmless, so that the system can be protected or recovered from the effects of the malicious code.
As the number of legitimate Internet users grows, so do the opportunities for cybercriminals to profit from producing malware.
This motivated the authors of the article we investigated to develop a model that predicts whether a PE file is malicious or benign using deep learning and ensemble learning.
We implemented the article's models and tried to improve slightly on its results, which we eventually managed to do.
We used the research dataset, which is available on Kaggle.
The data contains 19611 rows and 79 columns.

Dimensionality Reduction

Following the research work, we used PCA to reduce the number of columns to 55, as determined by the researchers.
Before that, we ran our own analysis and found that four columns can be set aside in advance: 'Name', 'Machine', and 'TimeDateStamp', which carry general or non-significant information and so have no real impact on the data, and the target label 'Malware', which must not be reduced.
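The preprocessing described above can be sketched as follows. This is a minimal illustration, not the notebook's actual code: the synthetic DataFrame, the `feature_*` column names, the `reduce_features` helper, and the standardization step before PCA are all assumptions made for the example. Only the four excluded column names and the target of 55 components come from the text.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler


def reduce_features(df, n_components=55):
    """Drop the non-feature columns, scale the rest, and project with PCA."""
    y = df["Malware"].to_numpy()
    X = df.drop(columns=["Name", "Machine", "TimeDateStamp", "Malware"])
    # Assumption: features are standardized first, since PCA is scale-sensitive.
    X_scaled = StandardScaler().fit_transform(X)
    pca = PCA(n_components=n_components, random_state=0)
    return pca.fit_transform(X_scaled), y, pca


# Synthetic stand-in with the same layout as the dataset: 79 columns in total,
# 75 of which are numeric features (the feature_* names are hypothetical).
rng = np.random.default_rng(0)
n_rows = 200
demo = pd.DataFrame(
    rng.normal(size=(n_rows, 75)),
    columns=[f"feature_{i}" for i in range(75)],
)
demo.insert(0, "Name", [f"file_{i}.exe" for i in range(n_rows)])
demo["Machine"] = 332
demo["TimeDateStamp"] = rng.integers(0, 2**31, size=n_rows)
demo["Malware"] = rng.integers(0, 2, size=n_rows)

X_reduced, y, pca = reduce_features(demo)
print(X_reduced.shape)  # (200, 55)
```

The label is separated before the projection so that PCA only ever sees the 75 feature columns.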

Malware Detection Using Machine Learning

We used five ML models to detect malicious PE files:
- Gaussian Naïve Bayes,
- Decision Tree,
- Random Forest,
- AdaBoost,
- Gradient Boosting
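As a rough sketch of how these five classifiers might be trained and compared with scikit-learn, the example below fits each one on synthetic 55-feature data. The data, the train/test split, and all hyperparameters (library defaults here) are assumptions for illustration; the notebook's actual settings may differ.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    AdaBoostClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
)
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the 55 PCA components and the binary 'Malware' label.
X, y = make_classification(
    n_samples=600, n_features=55, n_informative=20, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

models = {
    "Gaussian Naive Bayes": GaussianNB(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(random_state=0),
    "AdaBoost": AdaBoostClassifier(random_state=0),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
}

# Fit each model and record its accuracy on the held-out split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)
    print(f"{name:<22} accuracy = {scores[name]:.4f}")
```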

Deep Learning Models

The next stage was implementing 3 DL models:
1. MLP with 1 hidden layer
2. MLP with 2 hidden layers
3. 1D CNN
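To make the difference between the two MLP variants concrete, here is a minimal sketch using scikit-learn's `MLPClassifier`. The layer widths (64 and 32), the scaling step, and the synthetic data are assumptions; the text does not state the architectures' sizes or the framework used. The 1D CNN is left out because scikit-learn has no convolutional model.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the 55 PCA components and the binary label.
X, y = make_classification(
    n_samples=600, n_features=55, n_informative=20, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# MLP with one hidden layer (width 64 is an assumed value).
mlp_one_hidden = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0),
)

# MLP with two hidden layers (widths 64 and 32 are assumed values).
mlp_two_hidden = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500, random_state=0),
)

for name, model in [("1 hidden layer", mlp_one_hidden),
                    ("2 hidden layers", mlp_two_hidden)]:
    model.fit(X_train, y_train)
    print(f"MLP, {name}: accuracy = {model.score(X_test, y_test):.4f}")
```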

Malware Detection Using Deep Learning Models and Ensemble Learning

As in the article, the last stage is an ensemble learning model: the three DL models above serve as the first stage, and machine learning models (the meta-learners) are trained on top of their outputs as the final stage.
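The two-stage idea can be sketched with scikit-learn's `StackingClassifier`, which feeds cross-validated predictions of the first-stage models to a final estimator. This is an assumption about how such an ensemble could be wired, not the article's or the notebook's exact procedure: the base models here are two `MLPClassifier` stand-ins (the 1D CNN is omitted), Extra Trees is picked as one example meta-learner, and the data is synthetic.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, StackingClassifier
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for the 55 PCA components and the binary label.
X, y = make_classification(
    n_samples=600, n_features=55, n_informative=20, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# First stage: neural models (layer widths are assumed values).
base_models = [
    ("mlp_1_hidden",
     MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0)),
    ("mlp_2_hidden",
     MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500, random_state=0)),
]

# Final stage: a classical ML meta-learner trained on the first-stage
# predicted probabilities, which are produced out-of-fold via 5-fold CV.
stack = StackingClassifier(
    estimators=base_models,
    final_estimator=ExtraTreesClassifier(n_estimators=100, random_state=0),
    stack_method="predict_proba",
    cv=5,
)
stack.fit(X_train, y_train)

accuracy = stack.score(X_test, y_test)
meta_features = stack.transform(X_test)  # what the meta-learner receives
print(f"stacked accuracy = {accuracy:.4f}")
print(meta_features.shape)
```

Using out-of-fold predictions for the meta-learner's training set avoids rewarding base models for memorizing their own training data. Swapping `final_estimator` is enough to try another meta-learner.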

The results we reached

Meta-learner	Ours	Theirs
Decision Tree	0.99981	0.989
Random Forest	0.99981	0.9924
Extra Trees	0.99981	1
KNN	0.98266	0.979
LDA	0.97705	0.98
AdaBoost	0.97673	0.982
SVM	0.97654	0.982
Logistic	0.97642	0.981
SGD	0.97508	0.979
Passive	0.97444	0.978
Gaussian	0.97291	0.972
QDA	0.96577	0.973
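For a quick reading of the table, the short script below computes the per-meta-learner difference between the two columns and lists the rows where our score is higher. The numbers are copied from the table; the variable names are just for this example.

```python
# (ours, theirs) for each meta-learner, copied from the results table.
results = {
    "Decision Tree": (0.99981, 0.989),
    "Random Forest": (0.99981, 0.9924),
    "Extra Trees": (0.99981, 1.0),
    "KNN": (0.98266, 0.979),
    "LDA": (0.97705, 0.98),
    "AdaBoost": (0.97673, 0.982),
    "SVM": (0.97654, 0.982),
    "Logistic": (0.97642, 0.981),
    "SGD": (0.97508, 0.979),
    "Passive": (0.97444, 0.978),
    "Gaussian": (0.97291, 0.972),
    "QDA": (0.96577, 0.973),
}

# Signed difference per row: positive means our score is higher.
for name, (ours, theirs) in results.items():
    print(f"{name:<14} {ours - theirs:+.5f}")

improved = [name for name, (ours, theirs) in results.items() if ours > theirs]
print(improved)  # ['Decision Tree', 'Random Forest', 'KNN', 'Gaussian']
```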

ohadwolfman/Microsoft-PE-malware-detection

Folders and files

Latest commit

History

Repository files navigation

windows Pe malware detection using Ensamble learning

Introduction

Dimensionality Reduction

Malware Detection Using Machine Learning

Deep learning models

Malware Detection Using Deep Learning Models and Ensemble Learning

The results we reached

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages