Although microarrays have been superseded by high-throughput sequencing technologies for gene expression profiling, years of experience gained from analysing microarray data have led to a variety of analysis techniques and datasets that can be exploited in other contexts. In this course, we will focus on retrieving and exploring microarray data from public repositories such as the Gene Expression Omnibus (GEO).

Course content:

- Exploratory data analysis techniques for high-throughput data
- Workflows for the analysis of Illumina and Affymetrix gene expression data
- Normalisation of gene expression data
- Differential expression (DE) analysis using linear-modelling techniques
- Importing data from GEO into R
- Principal Components Analysis and hierarchical clustering of gene expression data

After this course you should be able to:

- Import gene expression datasets from GEO into R
- Assess the quality of a dataset in a repository
- Identify, and correct for, batch effects
- Perform a standard DE analysis to get a ranked list of genes
- Use unsupervised methods to explore a dataset
- Interrogate particular genes of interest

Prerequisites:

- A very basic knowledge of UNIX would be an advantage, but none is assumed and very little is required
- Attendees should be comfortable with using the R statistical language to read and manipulate data, and produce simple graphs
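
To give a flavour of the exploratory and differential expression topics listed above, here is a minimal sketch in base R. It runs on simulated data only and does not import anything from GEO. The matrix `expr`, the factor `groups`, and the helper `fit_gene` are hypothetical names chosen for illustration, and a plain per-gene `lm` fit stands in for the dedicated linear-modelling packages that a real analysis would probably use.

```r
# Simulated data (an assumption for illustration): 200 genes x 6 samples
# on a log2-like scale, with three control and three treated samples.
set.seed(1)
n_genes <- 200
groups <- factor(rep(c("control", "treated"), each = 3))
expr <- matrix(
  rnorm(n_genes * 6, mean = 8, sd = 0.3),
  nrow = n_genes,
  dimnames = list(paste0("gene", seq_len(n_genes)), paste0("sample", 1:6))
)

# Shift the first 10 genes upwards in the treated samples so that
# there is a known signal to recover.
expr[1:10, groups == "treated"] <- expr[1:10, groups == "treated"] + 3

# Unsupervised exploration: PCA and hierarchical clustering of the samples.
# Samples must be in rows for prcomp() and dist(), hence the transpose.
pca <- prcomp(t(expr))
sample_tree <- hclust(dist(t(expr)))
clusters <- cutree(sample_tree, k = 2)

# Differential expression: fit expression ~ group for one gene and
# return the estimated treated-vs-control difference and its p-value.
fit_gene <- function(y) {
  coefs <- summary(lm(y ~ groups))$coefficients
  c(logFC = coefs["groupstreated", "Estimate"],
    p = coefs["groupstreated", "Pr(>|t|)"])
}

# Apply the model to every gene, adjust for multiple testing
# (Benjamini-Hochberg), and rank genes by raw p-value.
results <- as.data.frame(t(apply(expr, 1, fit_gene)))
results$adj_p <- p.adjust(results$p, method = "BH")
ranked <- results[order(results$p), ]

head(ranked)
```

Plotting `pca$x[, 1:2]` coloured by `groups`, or calling `plot(sample_tree)`, gives a quick visual check of whether samples group by condition. With only three samples per group, ordinary per-gene tests have little power, which is one reason moderated approaches are commonly preferred in practice.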