iprada/circrna_snakemake

{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Authors: Leire Iparragirre, Inigo Prada and David Otaegi"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Project directory structure"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### src\n",
    "\n",
    "The folder src (source code) contains 4 files:\n",
    "  - circrna_sbatch.sh: Runs the pipeline on a high-performance computing (HPC) cluster. It was written for the GenomeDK cluster and will need modifications to run on a different one.\n",
    "  - circ_rna_snakemake.snk: The most important file. It orchestrates all the steps required to go from the input to the output.\n",
    "\n",
    "It also contains the source files of two notebooks:\n",
    "\n",
    "  - notebook.ipynb: Explains all the steps of the pipeline\n",
    "  - readme.ipynb: The file you are reading right now\n",
    "\n",
    "**What should I do if I mess up any of these files?**\n",
    "\n",
    "Send the whole **src** folder to Inigo. It contains a git repository that tracks every change made to these files, so Inigo can try to revert them. If you know how to use git, you are more than welcome to add commits.\n",
    "\n",
    "\n",
    "#### working_directory\n",
    "\n",
    "This is the directory where the pipeline runs.\n",
    "\n",
    "This directory **MUST** be organized in the following way:\n",
    "\n",
    "It must contain **nothing but** one folder per sample, e.g. ms_1.\n",
    "\n",
    "Inside each sample folder, put the compressed sequencing files, named according to the folder name, for example:\n",
    "\n",
    "     -ms1_1.tar.gz\n",
    "     -ms1_2.tar.gz\n",
    "\n",
    "If you don't follow this structure, the pipeline won't work.\n",
    "\n",
    "There is also a hidden folder called .reference, which contains the reference genome and the genome annotation file. You can see it by running *ls -a* on the command line."
   ]
  },
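  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The layout rules above can be checked before launching the pipeline. The sketch below is not part of the pipeline. The function name check_layout is hypothetical, and it assumes that every sample folder holds exactly two archives, one ending in _1.tar.gz and one ending in _2.tar.gz, as in the example above. Hidden entries such as .reference are ignored.\n",
    "\n",
    "```python\n",
    "from pathlib import Path\n",
    "\n",
    "SUFFIXES = ['_1.tar.gz', '_2.tar.gz']  # assumed naming of the two archives\n",
    "\n",
    "def check_layout(working_dir):\n",
    "    '''Return a list of problems found in the working directory layout.'''\n",
    "    problems = []\n",
    "    for entry in sorted(Path(working_dir).iterdir()):\n",
    "        if entry.name.startswith('.'):\n",
    "            continue  # hidden entries such as .reference are allowed\n",
    "        if not entry.is_dir():\n",
    "            problems.append(f'{entry.name}: not a sample folder')\n",
    "            continue\n",
    "        # Keep only the last 9 characters of each archive name, e.g. '_1.tar.gz'\n",
    "        found = sorted(p.name[-9:] for p in entry.glob('*.tar.gz'))\n",
    "        if found != SUFFIXES:\n",
    "            problems.append(f'{entry.name}: expected one *_1.tar.gz and one *_2.tar.gz')\n",
    "    return problems\n",
    "\n",
    "# Example call (the path is a placeholder):\n",
    "# print(check_layout('working_directory'))\n",
    "```\n",
    "\n",
    "An empty list means the layout matches these assumptions."
   ]
  },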
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## How to run it"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Go to the directory *src*, and there, run the following command:\n",
    "\n",
    "*snakemake --snakefile circ_rna_snakemake.snk --latency-wait 30 --dir /home/iprada/faststorage/projects/circrna_snakemake/working_directory/ --cores 6*\n",
    "\n",
    "That's all :)\n",
    "\n",
    "## Output organization\n",
    "\n",
    "The pipeline generates all its outputs inside each sample directory. In the names below, {sample} stands for the name of each sample. The following directories are created:\n",
    "\n",
    "- circexplorer_out:\n",
    "    - {sample}_known_circrna.bed: The important file! It contains the circRNAs\n",
    "    - {sample}_back_spliced_junction.bed: Required for generating the file above\n",
    "- fastqc: Contains the HTML and zip reports of the sequencing quality control\n",
    "- star_out: All the STAR outputs, required for generating the circexplorer_out files\n",
    "\n",
    "The pipeline will also generate the following file:\n",
    "\n",
    "- {sample}_align_stats.txt: Contains the output of samtools flagstat\n",
    "\n",
    "**What should I do if something does not work?**\n",
    "\n",
    "First, look at the snakemake output (what running *snakemake --snakefile ...* prints to the screen) to find out which sample has failed. Inside that sample's directory there is a hidden directory called .logs. Send it to Inigo and he will try to figure out what he has programmed wrong.\n"
   ]
  },
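  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To find out which samples are incomplete after a run, the expected outputs listed above can be checked one by one. This is only a sketch: the function name missing_outputs is hypothetical, and it assumes that {sample} equals the sample folder name and that {sample}_align_stats.txt is written directly inside the sample folder.\n",
    "\n",
    "```python\n",
    "from pathlib import Path\n",
    "\n",
    "def missing_outputs(working_dir):\n",
    "    '''Map each sample name to the expected outputs that are absent.'''\n",
    "    report = {}\n",
    "    for sample_dir in sorted(Path(working_dir).iterdir()):\n",
    "        # Skip hidden entries (e.g. .reference) and anything that is not a folder\n",
    "        if sample_dir.name.startswith('.') or not sample_dir.is_dir():\n",
    "            continue\n",
    "        sample = sample_dir.name\n",
    "        expected = [\n",
    "            sample_dir / 'circexplorer_out' / f'{sample}_known_circrna.bed',\n",
    "            sample_dir / 'circexplorer_out' / f'{sample}_back_spliced_junction.bed',\n",
    "            sample_dir / 'fastqc',\n",
    "            sample_dir / 'star_out',\n",
    "            sample_dir / f'{sample}_align_stats.txt',  # assumed location\n",
    "        ]\n",
    "        missing = [str(p.relative_to(sample_dir)) for p in expected if not p.exists()]\n",
    "        if missing:\n",
    "            report[sample] = missing\n",
    "    return report\n",
    "```\n",
    "\n",
    "Samples that do not appear in the returned dictionary have all the outputs listed above. For the ones that do appear, the .logs directory of that sample is the place to look."
   ]
  },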
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}