From 966da08f2bd8f759fe8304f3b53e900b65afd780 Mon Sep 17 00:00:00 2001 From: mumichae <51025211+mumichae@users.noreply.github.com> Date: Mon, 3 Aug 2020 05:46:04 +1000 Subject: [PATCH] Documentation update (#102) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * README update to include Baylor counts * updated drop installation command * update install command docs to include conda-forge plus better descriptions Co-authored-by: Vicente Co-authored-by: Michaela Müller Co-authored-by: Christian Mertes --- README.md | 31 ++++------ docs/source/index.rst | 23 +++++--- docs/source/installation.rst | 109 +++++++++++++++++++++-------------- docs/source/pipeline.rst | 7 ++- docs/source/prepare.rst | 6 +- 5 files changed, 99 insertions(+), 77 deletions(-) diff --git a/README.md b/README.md index 7650a6fe..aa1268ef 100644 --- a/README.md +++ b/README.md @@ -8,40 +8,29 @@ The manuscript main file, supplementary figures and table can be found in the ma drop logo -## Installation -DROP is available on [bioconda](https://anaconda.org/bioconda/drop) for python 3.6 and above. -We recommend using a dedicated conda environment. - +## Quickstart +DROP is available on [bioconda](https://anaconda.org/bioconda/drop). +We recommend using a dedicated conda environment. (installation time: ~ 10min) ``` -# create environment -conda create -n drop_env python=3.6 -conda activate drop_env - -# install drop -conda install -c bioconda drop +conda install -c conda-forge -c bioconda drop ``` -Installation time: ~ 10min - -Test whether the pipeline runs through by setting up the demo dataset in an empty directory (e.g. ``~/drop_demo``). 
+Test installation with demo project ``` mkdir ~/drop_demo cd ~/drop_demo - -# demo will download the necessary data and pipeline files drop demo ``` -The pipeline can be run using `snakemake` commands - +The pipeline can be run using [snakemake](https://snakemake.readthedocs.io/) commands ``` snakemake -n # dryrun -snakemake +snakemake --cores 1 ``` Expected runtime: 25 min -For more information on different installation options, check out the +For more information on different installation options, refer to the [documentation](https://gagneurlab-drop.readthedocs.io/en/latest/installation.html) ## Set up a custom project @@ -66,4 +55,8 @@ The following publicly-available datasets of gene counts can be used as controls * 119 non-strand specific fibroblasts: [![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.3887451.svg)](https://doi.org/10.5281/zenodo.3887451) +* 139 strand specific fibroblasts: [![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.3963474.svg)](https://doi.org/10.5281/zenodo.3963474) + +* 125 strand specific blood: [![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.3963470.svg)](https://doi.org/10.5281/zenodo.3963470) + If you want to contribute with your own count matrices, please contact us: yepez at in.tum.de diff --git a/docs/source/index.rst b/docs/source/index.rst index 76b4c246..1eca1288 100644 --- a/docs/source/index.rst +++ b/docs/source/index.rst @@ -3,8 +3,8 @@ DROP - Detection of RNA Outliers Pipeline DROP is intended to help researchers use RNA-Seq data in order to detect genes with aberrant expression, aberrant splicing and mono-allelic expression. It consists of three independent modules for each of those strategies. -After installing DROP, the user needs to fill in the config file and sample annotation table (:ref:`prepare`). -Then, DROP can be executed in multiple ways (:ref:`pipeline`). +After installing DROP, the user needs to fill in the config file and sample annotation table (:doc:`prepare`). 
+Then, DROP can be executed in multiple ways (:doc:`pipeline`). .. toctree:: :maxdepth: 2 @@ -19,23 +19,28 @@ Then, DROP can be executed in multiple ways (:ref:`pipeline`). Quickstart ----------- -DROP is available on `bioconda `_ for python 3.6 and above. -We recommend using a dedicated conda environment. +DROP is available on `bioconda `_. +We recommend using a dedicated conda environment. (installation time: ~ 10min) .. code-block:: bash - conda install -c bioconda drop + conda install -c conda-forge -c bioconda drop -Initialize project +Test installation with demo project .. code-block:: bash - cd + mkdir ~/drop_demo + cd ~/drop_demo drop demo -Call the pipeline +The pipeline can be run using `snakemake `_ commands .. code-block:: bash - snakemake + snakemake -n # dryrun + snakemake --cores 1 +Expected runtime: 25 min + +For more information on different installation options, refer to :doc:`installation`. diff --git a/docs/source/installation.rst b/docs/source/installation.rst index e7b8c653..80742309 100644 --- a/docs/source/installation.rst +++ b/docs/source/installation.rst @@ -1,17 +1,18 @@ Installation ============ -DROP is available on `bioconda `_ for python 3.6 and above. -We recommend using a dedicated conda environment. +DROP is available on `bioconda `_ . +In case the conda channel priority is set to ``strict``, it should be reset to ``flexible``: -.. code-block:: bash +.. code-block:: + + conda config --set channel_priority true - # create environment - conda create -n drop_env python=3.6 - conda activate drop_env +We recommend using a dedicated conda environment (here: ``drop_env``) for installing drop. + +.. 
code-block:: bash - # install drop - conda install -c bioconda drop + conda create -n drop_env -c conda-forge -c bioconda drop Installation time: ~ 10min @@ -25,12 +26,12 @@ Test whether the pipeline runs through by setting up the demo dataset in an empt # demo will download the necessary data and pipeline files drop demo -The pipeline can be run using ``snakemake`` commands +The pipeline can be run using `snakemake `_ commands .. code-block:: bash snakemake -n # dryrun - snakemake + snakemake --cores 1 Initialize a project -------------------- @@ -39,12 +40,13 @@ Alternatively, a new DROP project can be set up using ``drop init``. .. code-block:: bash - cd + cd drop init This will create an empty ``config.yaml`` file that needs to be filled according to the project data. You also need to prepare a sample annotation file. -Go to :ref:`prepare` for more details. +Go to :doc:`prepare` for more details. + .. _otherversions: @@ -53,62 +55,83 @@ Other DROP versions The developer version of DROP can be found in the `repository `_ under the branch ``dev``. -Make sure that the :any:`dependencies` are installed. +Make sure that the :any:`prerequisites` are installed, preferably in a conda environment. +Then install DROP from github using ``pip``. .. code-block:: bash - # activate your python environment if you are using one, e.g. drop_env - conda activate drop_env + pip install git+https://github.com/gagneurlab/drop.git@dev + -Then install DROP from github using ``pip``. -For this recursively clone the repository with all its submodules and then install from directory. +Alternatively, you can clone the desired branch of the repository and install from directory. .. 
code-block:: bash - git clone -b dev https://github.com/gagneurlab/drop.git --recurse-submodules + git clone -b dev https://github.com/gagneurlab/drop.git pip install ./drop -Alternatively, you can also install it directly without cloning +If the package needs to be updated frequently, it is more useful to use the ``-e`` option of ``pip``. +Any new update pulled from the repository will be available without reinstalling. +Note that this requires an explicit call to update any existing project (:any:`dropUpdate`). -.. code-block:: bash +.. code-block:: - pip install git+https://github.com/gagneurlab/drop.git@dev + pip install -e ./drop -.. _dependencies: + # update project directory + cd + drop update -Dependencies ------------- -The easiest way to ensure that all dependencies are installed is to install the -`bioconda package `_ into a conda environment. -.. code-block:: bash +.. _prerequisites: - conda install -c bioconda drop +Prerequisites +------------- -Other versions of drop can be installed after the bioconda package has been installed. +The easiest way to ensure that all dependencies are installed is to install the bioconda package, as described above. +Once the environment is set up and installation was successful, other versions of drop can be installed with ``pip``, +overwriting the conda version of ``DROP`` (see :any:`otherversions`). Installation without conda ++++++++++++++++++++++++++ Alternatively, DROP can be installed without ``conda``. In this case the following dependencies must be met: -* python >= 3.6 - * pip >= 19.1 -* `samtools `_ >= 1.7 -* `bcftools `_ >= 1.7 -* `tabix `_ -* `GATK `_ -* `graphviz `_ -* `pandoc `_ -* `R `_ >= 3.5 and corresponding `bioconductor `_ version - -If you are using an already existing R installation, make sure that the R and ``bioconductor`` versions match. -Otherwise, use the newest versions of R and bioconductor. -The necessary R packages will be installed with the first pipeline call. 
+* Programming languages: + + * `python `_ >= 3.6 and `pip `_ >= 19.1 + + * `R `_ >= 3.6 and corresponding `bioconductor `_ version + +* Command-line tools: + + * `GNU bc `_ + + * `GNU wget `_ + + * `tabix `_ + + * `samtools `_ >= 1.7 + + * `bcftools `_ >= 1.7 + + * `GATK `_ >= 4.0.4 + + * `graphviz `_ + + * `pandoc `_ + + +.. note:: + + If you are using an already existing R installation, make sure that the R and bioconductor versions match. + Otherwise, use the newest versions of R and bioconductor. + +All necessary R packages will be installed with the first pipeline call. As this is a lengthy process, it might be desirable to install them in advance, if a local copy of the repository exists. .. code-block:: bash # optional - Rscript /drop/installRPackages.R drop/requirementsR.txt + Rscript /drop/installRPackages.R drop/requirementsR.txt diff --git a/docs/source/pipeline.rst b/docs/source/pipeline.rst index 4145d901..9a0473bc 100644 --- a/docs/source/pipeline.rst +++ b/docs/source/pipeline.rst @@ -1,5 +1,3 @@ -.. _pipeline: - Pipeline Commands ================= @@ -81,10 +79,13 @@ While running, Snakemake *locks* the directory. If, for a whatever reason, the p to unlock it. This will call snakemake's ``unlock`` command for every module +.. _dropUpdate: Updating DROP +++++++++++++ -Everytime a project is initialized, a temporary folder ``.drop`` will be created in the project folder. If a new version of drop is installed, the ``.drop`` folder has to be updated for each project that has been initialized using an older version. +Every time a project is initialized, a temporary folder ``.drop`` will be created in the project folder. +If a new version of drop is installed, the ``.drop`` folder has to be updated for each project that has been +initialized using an older version. To do this run: .. 
code-block:: bash diff --git a/docs/source/prepare.rst b/docs/source/prepare.rst index c15ed6a6..95713613 100644 --- a/docs/source/prepare.rst +++ b/docs/source/prepare.rst @@ -1,5 +1,3 @@ -.. _prepare: - Preparing the Input Data ======================== @@ -95,6 +93,7 @@ groups list Same as in aberrant expression. minIds numeric Same as in aberrant expression. ``1`` recount boolean If true, it forces samples to be recounted. ``false`` longRead boolean Set to true only if counting Nanopore or PacBio long reads. ``false`` +keepNonStandardChrs boolean Set to true if non standard chromosomes are to be kept for further analysis. ``true`` filter boolean If false, no filter is applied. We recommend filtering. ``true`` minExpressionInOneSample numeric The minimal read count in at least one sample required for an intron to pass the filter. ``20`` minDeltaPsi numeric The minimal variation (in delta psi) required for an intron to pass the filter. ``0.05`` @@ -118,6 +117,7 @@ padjCutoff numeric Same as in aberrant expression. allelicRatioCutoff numeric A number between [0.5, 1) indicating the maximum allelic ratio allele1/(allele1+allele2) for the test to be significant. ``0.8`` addAF boolean Whether or not to add the allele frequencies from gnomAD ``true`` maxAF numeric Maximum allele frequency (of the minor allele) cut-off. Variants with AF equal or below this number are considered rare. ``0.001`` +maxVarFreqCohort numeric Maximum variant frequency among the cohort. ``0.05`` qcVcf character Full path to the vcf file used for VCF-BAM matching ``/path/to/qc_vcf.vcf.gz`` qcGroups list Same as “groups”, but for the VCF-BAM matching ``# see aberrant expression example`` ===================== ========= ======================================================================================================================== ====== @@ -172,7 +172,7 @@ Specifically, the number of threads allowed for a computational step can be modi .. 
note:: - DROP needs to be installed from a local directory :ref:`otherversions` using ``pip install -e `` + DROP needs to be installed from a local directory :ref:`otherversions` using ``pip install -e `` so that any changes in the code will be available in the next pipeline run. Any changes made to the R code need to be updated with ``drop update`` in the project directory.