The main goal of this project is to construct a workflow for the prediction of gene expression levels in Streptococcus thermophilus from promoter sequences.

camilababo/Application-of-ML-approaches-towards-the-prediction-of-gene-expression-levels-in-S.thermophilus

Application of machine learning approaches towards the prediction of gene expression levels in Streptococcus thermophilus

This project was developed to practice implementing computational tools and to consolidate knowledge from the curricular units of the Master's Degree in Bioinformatics at the University of Minho. It was supervised by Martin Rau (Discovery, Chr. Hansen A/S, Hørsholm, Denmark).

The main goal of this project is to construct a workflow that predicts gene expression levels in Streptococcus thermophilus from promoter sequences and to compare the prediction accuracy of different machine learning approaches. The algorithms chosen for this project were Extreme Gradient Boosting (XGBoost), Random Forest Regressor, Support Vector Regression and Linear Regression.
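Regression models such as those listed above need numeric input, so promoter sequences have to be turned into feature vectors first. The following is a minimal sketch of two common encodings (one-hot and k-mer counts) using only the Python standard library. The function names `one_hot_encode` and `kmer_counts`, the choice of `k=2`, and the toy sequences are assumptions for illustration and are not taken from this repository.

```python
from itertools import product

BASES = "ACGT"


def one_hot_encode(seq):
    """Flatten a DNA sequence into 4 binary values per position (A, C, G, T order).

    Any other symbol (for example N) is encoded as four zeros.
    """
    vec = []
    for base in seq.upper():
        vec.extend(1 if base == b else 0 for b in BASES)
    return vec


def kmer_counts(seq, k=2):
    """Count overlapping k-mers and return the counts in lexicographic k-mer order."""
    kmers = ["".join(p) for p in product(BASES, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:  # skip k-mers containing ambiguous bases
            counts[kmer] += 1
    return [counts[km] for km in kmers]


# Toy sequences of equal length (hypothetical, not from the dataset).
promoters = ["TTGACA", "TATAAT"]

# One feature row per promoter: positional one-hot values followed by 2-mer counts.
X = [one_hot_encode(p) + kmer_counts(p, k=2) for p in promoters]
```

A matrix like `X`, paired with measured expression levels, could then be passed to any of the regressors named above. One-hot encoding assumes all promoters have the same length, while k-mer counts do not.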

The dataset was extracted from a published article (https://www.frontiersin.org/articles/10.3389/fmicb.2018.00445/full), and the full genome files for the corresponding Streptococcus thermophilus strain (ASCC 1275) were downloaded from the NCBI database (accession number CP006819.1).
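To link expression values to promoter sequences, the region upstream of each gene has to be cut out of the genome sequence. The sketch below shows one way this could be done for genes on either strand. The function name `extract_promoter`, the 1-based inclusive coordinates, the default window of 100 bp, and the treatment of the genome as a linear string are all assumptions for illustration; the repository may use different conventions.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def extract_promoter(genome, start, end, strand, length=100):
    """Return up to `length` bases upstream of a gene.

    `start` and `end` are 1-based inclusive coordinates on the forward strand
    (assumed convention). The genome is treated as linear, so the window is
    truncated at the sequence ends rather than wrapped around.
    """
    if strand == "+":
        # Upstream of a forward-strand gene lies before its start coordinate.
        lo = max(0, start - 1 - length)
        return genome[lo:start - 1]
    if strand == "-":
        # Upstream of a reverse-strand gene lies after its end coordinate
        # and must be read on the opposite strand.
        return reverse_complement(genome[end:end + length])
    raise ValueError("strand must be '+' or '-'")


# Toy genome (hypothetical), not the CP006819.1 sequence.
toy_genome = "ATGCCGTTAGCA"
forward_promoter = extract_promoter(toy_genome, start=7, end=9, strand="+", length=3)
reverse_promoter = extract_promoter(toy_genome, start=4, end=6, strand="-", length=3)
```

Because bacterial chromosomes are circular, a gene close to the sequence origin would get a shortened window under this linear assumption; wrapping around the origin would be a natural extension.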
