Elements-of-Style-Reproducible-Workflow-Creation-Maintenance-Tutorial





Agenda for the day:

Time (UTC) Programme
11.00 - 11.10 Welcome Address and Presentation of Tutorial Agenda
11.10 - 11.20 1. A few simple rules for easier workflow maintenance and reuse
11.20 - 11.30 2. A quick run of scRNA-seq workflow
11.30 - 11.40 3. Begin with environment setup - Conda
11.40 - 12.00 4. Dockerfile for each process - Docker
12.00 - 12.20 5. GitHub Actions to build, test and deposit container images
12.20 - 12.30 6. Zenodo for DOIs and Genomic Summary and other Data
12.30 - 12.45 Short break - Stretch your legs! (15 minutes)
12.45 - 13.00 7. Stitching processes with a standard workflow - Nextflow
13.00 - 13.20 8. Stitching processes with other workflow languages, such as Common Workflow Language
13.50 - 14.00 Closing remarks and future directions


Background Information and other Topics of Interest

Command Line Skills Reading and Plotting Data in R Why Git and GitHub Forking in GitHub
Git Add Git Commit Nextflow Nextflow patterns Finding Data
Conda for managing dependencies Docker Build Test Share Reuse Getting End-to-End Example Data GitHub Hello World Fun
SRA and RNAseq Variant Calling Sarek End-to-End Example Dry Bench Skills Recap
Using GitHub Actions Using Zenodo Long Read Proteogenomics The Impact of Sex on Alternative Splicing (rMATS, Papermill, JupyterLab notebooks)

Tutorials Given:

2021 November 26 - ISCB Academy Sponsored - 3 hour Tutorial Agenda

About

This short 3-hour course teaches elements of style for constructing and containerizing small, single-function processes that facilitate reproducible workflow creation and execution. The hands-on tutorial was given as a webinar with the ISCB Academy. This repository was used in the course and contains self-learning material to support the work. It also shows how these processes can be kept up to date, and how their creator can be alerted to their functional state (working or failing), using GitHub Actions, a feature of GitHub.

The course uses a small example to provide the structure, philosophy and approach for achieving this outcome. It seeks to demystify and make accessible powerful methods for achieving platform independence and platform interoperability. Using a simple, pre-baked single-cell RNA-seq analysis to demonstrate these techniques, we break down each construction step and walk the learner through it. Learners are introduced to Conda, Docker, GitHub and the standard workflow language Nextflow. If time permits, we also show how these containerized processes can be represented in a second standard workflow language (e.g. Common Workflow Language or WDL).

By the end of the course, the learner will understand these Elements of Style and will know how Conda, Docker, GitHub, Zenodo and Nextflow enable reproducible research. The steps remain on GitHub for the learner to return to and reproduce after the course. The course also shows the power of JupyterLab notebooks for literate programming. Through their participation, learners will come to understand FAIR (findability, accessibility, interoperability and reusability) best practices.

We ask all participants to create GitHub, Zenodo and ORCID accounts prior to the course. We ask only for minimal background knowledge of the command line and simple shell commands; the repository provides self-learning material to help acquire this knowledge.
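
To make the idea of a small, single-function process that can report its own functional state more concrete, here is a minimal Python sketch. It is not taken from the course material: the function names (`count_fastq_reads`, `smoke_test`, `main`) and the tiny FASTQ input are hypothetical, chosen only for illustration. The process does one thing (counts reads), and its self-check returns an exit status that an automated build, such as a GitHub Actions job, could use to flag the process as working or failing.

```python
"""Hypothetical single-function process: count the reads in a FASTQ file."""
import tempfile
from pathlib import Path


def count_fastq_reads(path):
    """Return the number of records in an uncompressed, 4-lines-per-record FASTQ file."""
    with open(path) as handle:
        n_lines = sum(1 for line in handle if line.strip())
    if n_lines % 4 != 0:
        raise ValueError(f"{path}: {n_lines} lines is not a multiple of 4")
    return n_lines // 4


def smoke_test():
    """Run the process on a tiny built-in input; return True if the result is correct."""
    fastq = "@read1\nACGT\n+\nIIII\n@read2\nTTGA\n+\nIIII\n"  # illustrative data only
    with tempfile.TemporaryDirectory() as tmp:
        sample = Path(tmp) / "sample.fastq"
        sample.write_text(fastq)
        return count_fastq_reads(sample) == 2


def main(argv):
    """Return an exit status: with no arguments run the self-check, else count reads."""
    if not argv:
        # 0 means "working", 1 means "failing"
        return 0 if smoke_test() else 1
    print(count_fastq_reads(argv[0]))
    return 0
```

Saved as a script, the file could end with `raise SystemExit(main(sys.argv[1:]))` (after `import sys`) so that a container build or scheduled job sees a non-zero exit code whenever the self-check fails.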