Image adapted from https://doi.org/10.1038/nature01511
This module outlines the essential steps in analyzing proteomics data and recommends commonly used tools and techniques. It assumes a simple experimental design for differential abundance with two experimental conditions, such as cancer vs. normal. The provided training data were acquired using a TMT10plex multiplexed design with MS3 data acquisition. This notebook explains the mass spectrometry and statistical terminology used in data preprocessing, normalization, and differential abundance analysis. Note: This notebook uses simple base R plots, which you can modify as an exercise in building publication-quality plots in R.
These notebooks are available for both AWS and Google Cloud. Follow the links to each subdirectory for cloud platform-specific information and Jupyter notebooks.
This tutorial was designed to be used on cloud computing platforms, with the aim of requiring nothing but the files within this GitHub repository. The Jupyter Notebook file can run on Google Cloud Platform, Amazon Web Services, and Microsoft Azure, provided the R packages are installed. The notebook can be launched using the NIH STRIDES training module, so the only requirement should be access to NIH STRIDES resources.
Text and materials are licensed under a Creative Commons CC-BY-NC-SA license. The license allows you to copy, remix, and redistribute any of our publicly available materials, provided that you attribute the work (details in the license) and do not profit from it. More information is available here.
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License
This tutorial will cost you just under $1.00, assuming you stop your computing resources when you finish the work.
Funded by the National Resource for Quantitative Proteomics, NIH/NIGMS R24GM137786.