This repository contains code to reproduce the findings of "TYK2 mediates neuroinflammation in dementia with Alzheimer’s disease". It uses an updated version of the Drug Repurposing In Alzheimer's Disease (DRIAD) prediction pipeline, designed to identify candidates for drug repurposing in Alzheimer's disease.
- Primary DRIAD repository: https://github.com/labsyspharm/DRIAD
- DRIAD as a webapp: https://labsyspharm.shinyapps.io/DRIAD/
Fully reproducing the analysis requires access to the AMP-AD consortium dataset. Because of data sharing restrictions, we do not make this dataset available in this repository directly. However, we include a small synthetic example dataset in the data directory that can be used to test the pipeline.
Downloading the necessary datasets directly requires the non-CRAN synapser R package and a free Synapse user account. Installing all dependencies should take around 10 minutes.
install.packages("synapser", repos=c("http://ran.synapse.org", "https://cloud.r-project.org"))
Additionally, the following R packages are required:
install.packages(
c("tidyverse", "data.table", "powerjoin", "here", "qs", "remotes", "BiocManager")
)
remotes::install_github("ArtemSokolov/synExtra")
remotes::install_github("labsyspharm/DRIAD")
remotes::install_github("labsyspharm/ordinalRidge")
BiocManager::install("tximport")
The pipeline requires two datasets:
- The AMP-AD consortium RNA-seq dataset containing RNA-seq data from Alzheimer's disease patients at different stages.
- Drug perturbation RNA-seq data for the compounds that are to be investigated for repurposing.
We use a subset of the ROSMAP dataset. It can be downloaded using the following steps:
# Compile a list of Synapse IDs. Creates 'selected_rosmap_samples.csv'
Rscript download/01_select_rosmap_samples.R
# Downloads the data
bash job_scripts/download_synapse.sh selected_rosmap_samples.csv 2
The drug perturbation gene signatures are located on Synapse at syn20820056 and syn20820060.
They are downloaded automatically by the prepare_rosmap_tasks.R script.
The GSE126542 dataset was used to validate the quantification of TDP-43-loss-induced cryptic exons (Liu EY et al. Loss of Nuclear TDP-43 Is Associated with Decondensation of LINE Retrotransposons. Cell Rep 2019). It is available from GEO under accession GSE126542.
bash job_scripts/GSE126542_sra_download.sh
RNA-seq samples from the ROSMAP dataset are quantified using Salmon. Some of the samples are only available as BAM files, which must first be converted to FASTQ using job_scripts/bam_to_fastq.sh.
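As a minimal sketch of how the BAM-only samples might be identified before conversion, the helper below prints every BAM file that has no matching FASTQ file. The directory layout, the `_R1.fastq.gz` naming convention, and the function name `list_bam_only` are assumptions made for illustration and are not taken from this repository; the arguments expected by job_scripts/bam_to_fastq.sh are deliberately not shown.

```shell
# Hypothetical helper: print BAM files in a directory that lack a matching FASTQ.
# Assumes FASTQ files are named <sample>_R1.fastq.gz (illustrative convention only).
# The function body runs in a subshell so its variables do not leak to the caller.
list_bam_only() (
  bam_dir="$1"
  fastq_dir="$2"
  for bam in "$bam_dir"/*.bam; do
    # If no BAM files exist the glob stays literal, so skip non-existent paths.
    [ -e "$bam" ] || continue
    sample="$(basename "$bam" .bam)"
    if [ ! -e "$fastq_dir/${sample}_R1.fastq.gz" ]; then
      printf '%s\n' "$bam"
    fi
  done
)
```

The resulting list could then be passed, one file at a time, to the conversion script, using whatever arguments that script actually expects.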
Salmon quantification is performed using job_scripts/salmon.sh with a custom index that includes additional transcripts containing TDP-43 pathology-induced cryptic exons. The index is created using the job_scripts/make_salmon_index.sh script.
Once Salmon quantification is complete, the samples are combined into a single count matrix using driad/prep_rosmap_counts.R.
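Before combining the samples, it can help to confirm that every Salmon run finished. The sketch below lists sample directories that are missing a non-empty `quant.sf`, the quantification file Salmon writes to each output directory. It assumes one output directory per sample under a common root; that layout and the function name `missing_quant` are hypothetical and may not match what job_scripts/salmon.sh produces.

```shell
# Hypothetical pre-flight check: list sample directories lacking Salmon's quant.sf.
# Assumes one Salmon output directory per sample under a common root (illustrative).
# The function body runs in a subshell so its variables do not leak to the caller.
missing_quant() (
  root="$1"
  for dir in "$root"/*/; do
    # Skip the literal pattern when the root contains no subdirectories.
    [ -d "$dir" ] || continue
    # -s is true only for files that exist and are non-empty.
    if [ ! -s "${dir}quant.sf" ]; then
      basename "$dir"
    fi
  done
)
```

An empty result suggests all samples are ready to be merged; any names printed point to runs that may need to be repeated.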
DRIAD background gene set and drug gene set tasks are created using driad/prepare_rosmap_tasks.R. Note that this step requires an HPC cluster with a job scheduler.
The results of the DRIAD pipeline can be analyzed using the driad/driad_plots.R script. This script generates plots showing the performance of the drug repurposing candidates in TDP-43+ and TDP-43- patient subgroups.
A small synthetic example dataset is available at syn63663018. It was generated using download/synthetic_data.Rmd and can be used in lieu of the AMP-AD patient sample RNA-seq data to test the pipeline.
We gratefully acknowledge support from NIA grants R01 AG058063 and RF1 AG078297, and NCI grant U54 CA225088.