This repository hosts machine learning code and discussion (see Issues) for Project Cognoma.
The production notebook that is served to website users can be found in the ml-workers repository. This repository will be used for continued data exploration and new modeling approaches.
The following notebooks implement the primary machine learning workflow for Cognoma:
1.download.ipynb: downloads the cancer datasets.
2.mutation-classifier.ipynb: builds a classifier for mutation in a given gene.
3.pathway-classifier.ipynb: builds a classifier for mutation in any gene for a given pathway.
If you've modified a notebook and are submitting a pull request, then export the notebooks to scripts:
jupyter nbconvert --to=script --FilesWriter.build_directory=scripts *.ipynb
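Before opening a pull request, it can help to check that every notebook has an up-to-date exported script. The following is a minimal sketch, not part of this repository: the helper name `stale_notebooks` is hypothetical, and it assumes the exported scripts land in `scripts/` with a `.py` extension (as nbconvert does for Python notebooks). It compares file modification times, which are not meaningful on a fresh clone, so treat the result as a hint only.

```python
from pathlib import Path


def stale_notebooks(repo_dir=".", scripts_dir="scripts"):
    """Return names of top-level notebooks whose exported script is missing or older.

    Hypothetical helper: assumes scripts are named <notebook stem>.py
    inside `scripts_dir`, relative to `repo_dir`.
    """
    repo = Path(repo_dir)
    stale = []
    for notebook in sorted(repo.glob("*.ipynb")):
        # "1.download.ipynb" has the stem "1.download", so append ".py" to it.
        script = repo / scripts_dir / (notebook.stem + ".py")
        if not script.exists():
            stale.append(notebook.name)
        elif script.stat().st_mtime < notebook.stat().st_mtime:
            stale.append(notebook.name)
    return stale


if __name__ == "__main__":
    for name in stale_notebooks():
        print(f"Needs re-export: {name}")
```

If the script lists any notebooks, rerunning the nbconvert command above should bring the `scripts` directory back in sync.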
This repository uses conda to manage its environment and install packages.
If you don't have conda installed on your system, download and install it first.
You can install the Python 2 or 3 version of Miniconda (or Anaconda), which determines the Python version of your root environment.
Since we create a dedicated environment for this project, named cognoma-machine-learning, whose explicit dependencies are specified in environment.yml, the Python version of your root environment will not be relevant.
With conda, you can create the cognoma-machine-learning environment by running the following from the root directory of this repository:
# Create the cognoma-machine-learning conda environment
conda env create --file environment.yml
If environment.yml has changed since you created the environment, run the following update command:
conda env update --file environment.yml
Activate the environment by running source activate cognoma-machine-learning on Linux or OS X and activate cognoma-machine-learning on Windows. With conda 4.4 or later, conda activate cognoma-machine-learning works on all platforms.
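To confirm that activation worked, you can inspect the `CONDA_DEFAULT_ENV` environment variable, which conda sets to the name of the active environment. The snippet below is a small sketch under that assumption; the function names are hypothetical and not part of this repository.

```python
import os

# Environment name used throughout this README.
EXPECTED_ENV = "cognoma-machine-learning"


def active_conda_env(environ=None):
    """Return the active conda environment name, or None if none is set."""
    environ = os.environ if environ is None else environ
    return environ.get("CONDA_DEFAULT_ENV")


def is_project_env_active(environ=None):
    """Return True when the active environment matches EXPECTED_ENV."""
    return active_conda_env(environ) == EXPECTED_ENV


if __name__ == "__main__":
    name = active_conda_env()
    if is_project_env_active():
        print(f"OK: {name!r} is active.")
    else:
        print(f"Active environment is {name!r}; expected {EXPECTED_ENV!r}.")
```

Running this inside a notebook cell is also a quick way to check that the notebook server was started from the intended environment.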
Once this environment is active in a terminal, run jupyter notebook to start a notebook server.