Prediction of Interference with Specific Assay Technologies (PISA-T) is a project for developing and evaluating machine learning models that predict assay interference, using labels derived statistically from ultra-large bioactivity data matrices. This repository contains scripts for data preprocessing, model training, validation, and testing.
- Clone the repository:
    git clone https://github.com/Bayer-Group/PISA-T.git
- Install dependencies using Conda:
    conda env create -f environment.yml
- Place your raw data files in the data/raw/ directory. Note: the files already in that directory are randomly generated and serve only as an example for running the pipeline.
- Run the data preprocessing scripts in the preprocessing/ directory to clean and preprocess the data.
- Use the scripts in the dense_network/ and random_forest/ directories to train different models.
- Modify hyperparameters and configurations as needed.
- Validate the models using validation scripts provided in the respective directories.
- Tune hyperparameters for optimal performance.
- Test the trained models on test data using testing scripts.
- Evaluate model performance and generate results.
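The evaluation step can be illustrated with a minimal, dependency-free sketch. It assumes binary labels (1 = interfering, 0 = non-interfering) and uses balanced accuracy and the Matthews correlation coefficient, two metrics commonly chosen for imbalanced classification. Both the label encoding and the choice of metrics are assumptions made for illustration; the testing scripts in this repository may report different metrics. The function names and the example predictions below are hypothetical.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels, assuming 1 = interfering."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == 1 and truth == 1:
            tp += 1
        elif pred == 1 and truth == 0:
            fp += 1
        elif pred == 0 and truth == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def balanced_accuracy(tp, fp, tn, fn):
    """Mean of sensitivity and specificity; an empty class contributes 0.0."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return (sensitivity + specificity) / 2


def matthews_corrcoef(tp, fp, tn, fn):
    """Matthews correlation coefficient; 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical test-set labels and model predictions, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1, 0]

counts = confusion_counts(y_true, y_pred)
print("tp, fp, tn, fn:", counts)
print("balanced accuracy:", round(balanced_accuracy(*counts), 3))
print("MCC:", round(matthews_corrcoef(*counts), 3))
```

Interference labels are typically imbalanced, which is why this sketch avoids plain accuracy: a model that predicts "non-interfering" for every compound would score well on accuracy but get a balanced accuracy of 0.5 and an MCC of 0.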
- Vincenzo Palmacci (@vincenzo-palmacci)
This project is licensed under the BSD 3-Clause License - see the LICENSE file for details.