diff --git a/_posts/2022-12-05-a-collection-of-r-resources/img/git-basics.png b/_posts/2023-12-14-class-wrap-up-data-analysis-tips-and-resources/img/git-basics.png similarity index 100% rename from _posts/2022-12-05-a-collection-of-r-resources/img/git-basics.png rename to _posts/2023-12-14-class-wrap-up-data-analysis-tips-and-resources/img/git-basics.png diff --git a/_posts/2022-12-05-a-collection-of-r-resources/img/github-flow.png b/_posts/2023-12-14-class-wrap-up-data-analysis-tips-and-resources/img/github-flow.png similarity index 100% rename from _posts/2022-12-05-a-collection-of-r-resources/img/github-flow.png rename to _posts/2023-12-14-class-wrap-up-data-analysis-tips-and-resources/img/github-flow.png diff --git a/_posts/2022-12-05-a-collection-of-r-resources/img/traceback.png b/_posts/2023-12-14-class-wrap-up-data-analysis-tips-and-resources/img/traceback.png similarity index 100% rename from _posts/2022-12-05-a-collection-of-r-resources/img/traceback.png rename to _posts/2023-12-14-class-wrap-up-data-analysis-tips-and-resources/img/traceback.png diff --git a/_posts/2022-12-05-a-collection-of-r-resources/resources.Rmd b/_posts/2023-12-14-class-wrap-up-data-analysis-tips-and-resources/resources.Rmd similarity index 93% rename from _posts/2022-12-05-a-collection-of-r-resources/resources.Rmd rename to _posts/2023-12-14-class-wrap-up-data-analysis-tips-and-resources/resources.Rmd index 40756cb..9e826e7 100644 --- a/_posts/2022-12-05-a-collection-of-r-resources/resources.Rmd +++ b/_posts/2023-12-14-class-wrap-up-data-analysis-tips-and-resources/resources.Rmd @@ -2,7 +2,7 @@ title: "Class wrap up: Data analysis, tips and resources" author: - name: "Rui Fu, Kent Riemondy" -date: 2022-12-05 +date: 2023-12-14 output: distill::distill_article: self_contained: false @@ -24,11 +24,13 @@ library(here) -## Rmarkdown +## Rmarkdown and Quarto Read the [Guide to RMarkdown](https://bookdown.org/yihui/rmarkdown/) for an exhaustive description of the various formats and 
options for using RMarkdown documents. Note that the HTML pages for this class were all made from Rmd, using the [distill blog format](https://rstudio.github.io/distill/) -*The Rmarkdown for this class is [on github]( https://github.com/rnabioco/bmsc-7810-pbda/blob/main/_posts/2022-12-05-a-collection-of-r-resources/resources.Rmd)* +There is also a newer format called [Quarto](https://quarto.org/), also built by Rstudio (now named Posit). Quarto documents are very similar to RMarkdown, have broader support for additional programming languages, and will likely eventually replace the Rmarkdown format. + +*The Rmarkdown for this post is [on github](https://github.com/rnabioco/bmsc-7810-pbda/blob/main/_posts/2023-12-14-class-wrap-up-data-analysis-tips-and-resources/resources.Rmd)* ### Caching @@ -127,7 +129,9 @@ path_to_file <- here("data/class3/dmel_peptides_lifecycle.csv.gz") res <- microbenchmark::microbenchmark( base = read.csv(path_to_file), readr = readr::read_csv(path_to_file), - times = 5 + data.table = data.table::fread(path_to_file), + times = 5, + unit = "ms" ) print(res, signif = 2) ``` @@ -284,7 +288,7 @@ path/to/cool_function.R argument1 argument2 ... Git is a command line tool for version control, which allows us to: -1. rolling back code to a previous state if needed +1. roll back code to a previous state if needed 2. branched development, tackling individual issues/tasks @@ -313,14 +317,14 @@ This can be handled by Rstudio as well (new tab next to `Connections` and `Build ### Put your code on GitHub -As you write more code, especially as functions and script pipelines, hosting and documenting them on GitHub is great way to make them portable and searchable. Even the free tier of GitHub accounts now has private repositories (repo). +As you write more code, especially as functions and script pipelines, hosting and documenting them on GitHub is a great way to make them portable and searchable. 
Even the free tier of GitHub accounts now has private repositories (repos). If you have any interest in a career in data science/informatics, GitHub is also a common showcase of what (and how well/often) you can code. After some accumulation of code, definitely put your GitHub link on your CV/resume. Check out the quickstart from github: https://docs.github.com/en/get-started/quickstart/hello-world -### Example repos (RBI) +### Example repos - [this class](https://github.com/rnabioco/bmsc-7810-pbda) - [valr](https://github.com/rnabioco/valr) @@ -359,7 +363,7 @@ emo::ji("smile") ![](https://bioconductor.org/images/logo_bioconductor.gif) -2,000+ R packages dedicated to bioinformatics. Included a coherent framework of data structures (e.g. SummarizedExperiment) built by dedicated Core members. Also includs many annotation and experimental datasets built into R packages and objects (See AnnotationHub and ExperimentHub) +2,000+ R packages dedicated to bioinformatics. Includes a coherent framework of data structures (e.g. SummarizedExperiment) built by dedicated Core members. Also includes many annotation and experimental datasets built into R packages and objects (See AnnotationHub and ExperimentHub) - https://bioconductor.org/ - Use `BiocManager::install()` to install these packages @@ -390,8 +394,6 @@ emo::ji("smile") Rstudio links to common ones here: `Help` -> `Cheatsheets`. More are hosted online, such as for [regular expressions](https://rstudio.com/wp-content/uploads/2016/09/RegExCheatsheet.pdf). -Useful to keep your own stash too. 
- ## Offline help diff --git a/_posts/2022-12-05-a-collection-of-r-resources/resources.html b/_posts/2023-12-14-class-wrap-up-data-analysis-tips-and-resources/resources.html similarity index 79% rename from _posts/2022-12-05-a-collection-of-r-resources/resources.html rename to _posts/2023-12-14-class-wrap-up-data-analysis-tips-and-resources/resources.html index 3e40263..cf573f2 100644 --- a/_posts/2022-12-05-a-collection-of-r-resources/resources.html +++ b/_posts/2023-12-14-class-wrap-up-data-analysis-tips-and-resources/resources.html @@ -93,8 +93,8 @@ - - + + @@ -110,12 +110,12 @@ @@ -227,9 +227,21 @@ display: none !important; } + hr.section-separator { + border: none; + border-top: 1px solid rgba(0, 0, 0, 0.1); + margin: 0px; + } + + + d-byline { + border-top: none; + } + d-article { padding-top: 2.5rem; padding-bottom: 30px; + border-top: none; } d-appendix { @@ -326,6 +338,11 @@ font-size: 14px; } + /* tweak for Pandoc numbered line within distill */ + d-article pre.numberSource code > span { + left: -2em; + } + d-article pre { font-size: 14px; } @@ -1085,6 +1102,12 @@ // create d-title $('.d-title').changeElementType('d-title'); + // separator + var separator = '
'; + // prepend separator above appendix + $('.d-byline').before(separator); + $('.d-article').before(separator); + // create d-byline var byline = $(''); $('.d-byline').replaceWith(byline); @@ -1162,8 +1185,9 @@ $('.distill-force-highlighting-css').parent().remove(); // remove empty line numbers inserted by pandoc when using a - // custom syntax highlighting theme - $('code.sourceCode a:empty').remove(); + // custom syntax highlighting theme, except when numbering line + // in code chunk + $('pre:not(.numberLines) code.sourceCode a:empty').remove(); // load distill framework load_distill_framework(); @@ -1189,12 +1213,13 @@ // add orcid ids $('.authors-affiliations').find('.author').each(function(i, el) { var orcid_id = front_matter.authors[i].orcidID; + var author_name = front_matter.authors[i].author if (orcid_id) { var a = $(''); a.attr('href', 'https://orcid.org/' + orcid_id); var img = $(''); img.addClass('orcid-id'); - img.attr('alt', 'ORCID ID'); + img.attr('alt', author_name ? 'ORCID ID for ' + author_name : 'ORCID ID'); 
img.attr('src','data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABAAAAAQCAYAAAAf8/9hAAAAGXRFWHRTb2Z0d2FyZQBBZG9iZSBJbWFnZVJlYWR5ccllPAAAA2ZpVFh0WE1MOmNvbS5hZG9iZS54bXAAAAAAADw/eHBhY2tldCBiZWdpbj0i77u/IiBpZD0iVzVNME1wQ2VoaUh6cmVTek5UY3prYzlkIj8+IDx4OnhtcG1ldGEgeG1sbnM6eD0iYWRvYmU6bnM6bWV0YS8iIHg6eG1wdGs9IkFkb2JlIFhNUCBDb3JlIDUuMC1jMDYwIDYxLjEzNDc3NywgMjAxMC8wMi8xMi0xNzozMjowMCAgICAgICAgIj4gPHJkZjpSREYgeG1sbnM6cmRmPSJodHRwOi8vd3d3LnczLm9yZy8xOTk5LzAyLzIyLXJkZi1zeW50YXgtbnMjIj4gPHJkZjpEZXNjcmlwdGlvbiByZGY6YWJvdXQ9IiIgeG1sbnM6eG1wTU09Imh0dHA6Ly9ucy5hZG9iZS5jb20veGFwLzEuMC9tbS8iIHhtbG5zOnN0UmVmPSJodHRwOi8vbnMuYWRvYmUuY29tL3hhcC8xLjAvc1R5cGUvUmVzb3VyY2VSZWYjIiB4bWxuczp4bXA9Imh0dHA6Ly9ucy5hZG9iZS5jb20veGFwLzEuMC8iIHhtcE1NOk9yaWdpbmFsRG9jdW1lbnRJRD0ieG1wLmRpZDo1N0NEMjA4MDI1MjA2ODExOTk0QzkzNTEzRjZEQTg1NyIgeG1wTU06RG9jdW1lbnRJRD0ieG1wLmRpZDozM0NDOEJGNEZGNTcxMUUxODdBOEVCODg2RjdCQ0QwOSIgeG1wTU06SW5zdGFuY2VJRD0ieG1wLmlpZDozM0NDOEJGM0ZGNTcxMUUxODdBOEVCODg2RjdCQ0QwOSIgeG1wOkNyZWF0b3JUb29sPSJBZG9iZSBQaG90b3Nob3AgQ1M1IE1hY2ludG9zaCI+IDx4bXBNTTpEZXJpdmVkRnJvbSBzdFJlZjppbnN0YW5jZUlEPSJ4bXAuaWlkOkZDN0YxMTc0MDcyMDY4MTE5NUZFRDc5MUM2MUUwNEREIiBzdFJlZjpkb2N1bWVudElEPSJ4bXAuZGlkOjU3Q0QyMDgwMjUyMDY4MTE5OTRDOTM1MTNGNkRBODU3Ii8+IDwvcmRmOkRlc2NyaXB0aW9uPiA8L3JkZjpSREY+IDwveDp4bXBtZXRhPiA8P3hwYWNrZXQgZW5kPSJyIj8+84NovQAAAR1JREFUeNpiZEADy85ZJgCpeCB2QJM6AMQLo4yOL0AWZETSqACk1gOxAQN+cAGIA4EGPQBxmJA0nwdpjjQ8xqArmczw5tMHXAaALDgP1QMxAGqzAAPxQACqh4ER6uf5MBlkm0X4EGayMfMw/Pr7Bd2gRBZogMFBrv01hisv5jLsv9nLAPIOMnjy8RDDyYctyAbFM2EJbRQw+aAWw/LzVgx7b+cwCHKqMhjJFCBLOzAR6+lXX84xnHjYyqAo5IUizkRCwIENQQckGSDGY4TVgAPEaraQr2a4/24bSuoExcJCfAEJihXkWDj3ZAKy9EJGaEo8T0QSxkjSwORsCAuDQCD+QILmD1A9kECEZgxDaEZhICIzGcIyEyOl2RkgwAAhkmC+eAm0TAAAAABJRU5ErkJggg=='); a.append(img); $(this).append(a); @@ -1472,8 +1497,8 @@ - - + + @@ -1481,7 +1506,7 @@ - + @@ -1501,7 +1526,7 @@ @@ -1512,6 +1537,7 @@

Class wrap up: Data analysis, tips and resources

+ @@ -1520,7 +1546,7 @@

Class wrap up: Data analysis, tips and resources

Rui Fu, Kent Riemondy
-2022-12-05
+2023-12-14
@@ -1528,7 +1554,7 @@

Class wrap up: Data analysis, tips and resources

-Rmarkdown
+Rmarkdown and Quarto

Read the Guide to RMarkdown for an exhaustive description of the various formats and options for using RMarkdown documents. Note that the HTML pages for this class were all made from Rmd, using the distill blog format.

-The Rmarkdown for this class is on github
+There is also a newer format called Quarto, also built by Rstudio (now named Posit). Quarto documents are very similar to RMarkdown, have broader support for additional programming languages, and will likely eventually replace the Rmarkdown format.
+The Rmarkdown for this post is on github

Caching

You can speed up knitting of your Rmds by using caching to store the results from each chunk, instead of rerunning them each time. Note that if you modify the code chunk, previous caching is ignored.

For each chunk, set {r, cache = TRUE}
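
As a rough sketch, caching can also be switched on for every chunk from a setup chunk instead of chunk by chunk. The snippet assumes the knitr package is installed; `cache = TRUE` is the same chunk option shown above, applied as a document-wide default.

```r
# Assumption: this runs in a setup chunk near the top of the Rmd.
# The guard keeps the snippet harmless where knitr is not installed.
if (requireNamespace("knitr", quietly = TRUE)) {
  # Turn caching on for all later chunks
  knitr::opts_chunk$set(cache = TRUE)

  # Check the default that later chunks will inherit
  cache_default <- knitr::opts_chunk$get("cache")
  print(cache_default)
}
```

An individual chunk can still opt out with `{r, cache = FALSE}`.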

@@ -1618,10 +1645,10 @@

styler, clean up code rea } ")

-#> 
-#> my_fun <- function(x,
-#>                    y,
-#>                    z) {
+#> my_fun <- function(
+#>     x,
+#>     y,
+#>     z) {
 #>   x + z
 #> }
@@ -1641,49 +1668,46 @@

See also the sessioninfo package, which provides more details:

@@ -1692,94 +1716,79 @@
#> ─ Session info ─────────────────────────────────────────────────────
 #>  setting  value
-#>  version  R version 4.2.0 (2022-04-22)
-#>  os       macOS Big Sur/Monterey 10.16
-#>  system   x86_64, darwin17.0
+#>  version  R version 4.3.1 (2023-06-16)
+#>  os       macOS Monterey 12.2.1
+#>  system   aarch64, darwin20
 #>  ui       X11
 #>  language (EN)
 #>  collate  en_US.UTF-8
 #>  ctype    en_US.UTF-8
 #>  tz       America/Denver
-#>  date     2022-12-16
-#>  pandoc   2.19.2 @ /Applications/RStudio.app/Contents/MacOS/quarto/bin/tools/ (via rmarkdown)
+#>  date     2023-12-14
+#>  pandoc   3.1.1 @ /Applications/RStudio.app/Contents/Resources/app/quarto/bin/tools/ (via rmarkdown)
 #> 
 #> ─ Packages ─────────────────────────────────────────────────────────
 #>  package     * version date (UTC) lib source
-#>  assertthat    0.2.1   2019-03-21 [1] CRAN (R 4.2.0)
-#>  backports     1.4.1   2021-12-13 [1] CRAN (R 4.2.0)
-#>  broom         0.8.0   2022-04-13 [1] CRAN (R 4.2.0)
-#>  bslib         0.3.1   2021-10-06 [1] CRAN (R 4.2.0)
-#>  cachem        1.0.6   2021-08-19 [1] CRAN (R 4.2.0)
-#>  cellranger    1.1.0   2016-07-27 [1] CRAN (R 4.2.0)
-#>  cli           3.4.1   2022-09-23 [1] CRAN (R 4.2.0)
-#>  colorspace    2.0-3   2022-02-21 [1] CRAN (R 4.2.0)
-#>  crayon        1.5.2   2022-09-29 [1] CRAN (R 4.2.0)
-#>  DBI           1.1.3   2022-06-18 [1] CRAN (R 4.2.0)
-#>  dbplyr        2.2.1   2022-06-27 [1] CRAN (R 4.2.0)
-#>  digest        0.6.30  2022-10-18 [1] CRAN (R 4.2.0)
-#>  distill       1.5     2022-09-07 [1] CRAN (R 4.2.0)
-#>  downlit       0.4.2   2022-07-05 [1] CRAN (R 4.2.0)
-#>  dplyr       * 1.0.10  2022-09-01 [1] CRAN (R 4.2.0)
-#>  ellipsis      0.3.2   2021-04-29 [1] CRAN (R 4.2.0)
-#>  evaluate      0.16    2022-08-09 [1] CRAN (R 4.2.0)
-#>  fansi         1.0.3   2022-03-24 [1] CRAN (R 4.2.0)
-#>  fastmap       1.1.0   2021-01-25 [1] CRAN (R 4.2.0)
-#>  forcats     * 0.5.1   2021-01-27 [1] CRAN (R 4.2.0)
-#>  fs            1.5.2   2021-12-08 [1] CRAN (R 4.2.0)
-#>  generics      0.1.3   2022-07-05 [1] CRAN (R 4.2.0)
-#>  ggplot2     * 3.3.6   2022-05-03 [1] CRAN (R 4.2.0)
-#>  glue          1.6.2   2022-02-24 [1] CRAN (R 4.2.0)
-#>  gtable        0.3.0   2019-03-25 [1] CRAN (R 4.2.0)
-#>  haven         2.5.0   2022-04-15 [1] CRAN (R 4.2.0)
-#>  here        * 1.0.1   2020-12-13 [1] CRAN (R 4.2.0)
-#>  hms           1.1.2   2022-08-19 [1] CRAN (R 4.2.0)
-#>  htmltools     0.5.2   2021-08-25 [1] CRAN (R 4.2.0)
-#>  httr          1.4.4   2022-08-17 [1] CRAN (R 4.2.0)
-#>  jquerylib     0.1.4   2021-04-26 [1] CRAN (R 4.2.0)
-#>  jsonlite      1.8.3   2022-10-21 [1] CRAN (R 4.2.0)
-#>  knitr         1.39    2022-04-26 [1] CRAN (R 4.2.0)
-#>  lifecycle     1.0.3   2022-10-07 [1] CRAN (R 4.2.0)
-#>  lubridate     1.8.0   2021-10-07 [1] CRAN (R 4.2.0)
-#>  magrittr      2.0.3   2022-03-30 [1] CRAN (R 4.2.0)
-#>  memoise       2.0.1   2021-11-26 [1] CRAN (R 4.2.0)
-#>  modelr        0.1.8   2020-05-19 [1] CRAN (R 4.2.0)
-#>  munsell       0.5.0   2018-06-12 [1] CRAN (R 4.2.0)
-#>  pillar        1.8.1   2022-08-19 [1] CRAN (R 4.2.0)
-#>  pkgconfig     2.0.3   2019-09-22 [1] CRAN (R 4.2.0)
-#>  prettycode    1.1.0   2019-12-16 [1] CRAN (R 4.2.0)
-#>  purrr       * 0.3.5   2022-10-06 [1] CRAN (R 4.2.0)
-#>  R.cache       0.15.0  2021-04-30 [1] CRAN (R 4.2.0)
-#>  R.methodsS3   1.8.2   2022-06-13 [1] CRAN (R 4.2.0)
-#>  R.oo          1.25.0  2022-06-12 [1] CRAN (R 4.2.0)
-#>  R.utils       2.12.0  2022-06-28 [1] CRAN (R 4.2.0)
-#>  R6            2.5.1   2021-08-19 [1] CRAN (R 4.2.0)
-#>  readr       * 2.1.2   2022-01-30 [1] CRAN (R 4.2.0)
-#>  readxl        1.4.0   2022-03-28 [1] CRAN (R 4.2.0)
-#>  reprex        2.0.1   2021-08-05 [1] CRAN (R 4.2.0)
-#>  rlang         1.0.6   2022-09-24 [1] CRAN (R 4.2.0)
-#>  rmarkdown     2.14    2022-04-25 [1] CRAN (R 4.2.0)
-#>  rprojroot     2.0.3   2022-04-02 [1] CRAN (R 4.2.0)
-#>  rstudioapi    0.13    2020-11-12 [1] CRAN (R 4.2.0)
-#>  rvest         1.0.2   2021-10-16 [1] CRAN (R 4.2.0)
-#>  sass          0.4.1   2022-03-23 [1] CRAN (R 4.2.0)
-#>  scales        1.2.0   2022-04-13 [1] CRAN (R 4.2.0)
-#>  sessioninfo   1.2.2   2021-12-06 [1] CRAN (R 4.2.0)
-#>  stringi       1.7.8   2022-07-11 [1] CRAN (R 4.2.0)
-#>  stringr     * 1.4.1   2022-08-20 [1] CRAN (R 4.2.0)
-#>  styler        1.7.0   2022-03-13 [1] CRAN (R 4.2.0)
-#>  tibble      * 3.1.8   2022-07-22 [1] CRAN (R 4.2.0)
-#>  tidyr       * 1.2.0   2022-02-01 [1] CRAN (R 4.2.0)
-#>  tidyselect    1.2.0   2022-10-10 [1] CRAN (R 4.2.0)
-#>  tidyverse   * 1.3.1   2021-04-15 [1] CRAN (R 4.2.0)
-#>  tzdb          0.3.0   2022-03-28 [1] CRAN (R 4.2.0)
-#>  utf8          1.2.2   2021-07-24 [1] CRAN (R 4.2.0)
-#>  vctrs         0.4.1   2022-04-13 [1] CRAN (R 4.2.0)
-#>  withr         2.5.0   2022-03-03 [1] CRAN (R 4.2.0)
-#>  xfun          0.32    2022-08-10 [1] CRAN (R 4.2.0)
-#>  xml2          1.3.3   2021-11-30 [1] CRAN (R 4.2.0)
-#>  yaml          2.3.6   2022-10-18 [1] CRAN (R 4.2.0)
+#>  bslib         0.6.1   2023-11-28 [1] CRAN (R 4.3.1)
+#>  cachem        1.0.8   2023-05-01 [1] CRAN (R 4.3.0)
+#>  cli           3.6.1   2023-03-23 [1] CRAN (R 4.3.0)
+#>  colorspace    2.1-0   2023-01-23 [1] CRAN (R 4.3.0)
+#>  digest        0.6.33  2023-07-07 [1] CRAN (R 4.3.0)
+#>  distill       1.6     2023-10-06 [1] CRAN (R 4.3.1)
+#>  downlit       0.4.3   2023-06-29 [1] CRAN (R 4.3.0)
+#>  dplyr       * 1.1.4   2023-11-17 [1] CRAN (R 4.3.1)
+#>  evaluate      0.23    2023-11-01 [1] CRAN (R 4.3.1)
+#>  fansi         1.0.5   2023-10-08 [1] CRAN (R 4.3.1)
+#>  fastmap       1.1.1   2023-02-24 [1] CRAN (R 4.3.0)
+#>  forcats     * 1.0.0   2023-01-29 [1] CRAN (R 4.3.0)
+#>  generics      0.1.3   2022-07-05 [1] CRAN (R 4.3.0)
+#>  ggplot2     * 3.4.4   2023-10-12 [1] CRAN (R 4.3.1)
+#>  glue          1.6.2   2022-02-24 [1] CRAN (R 4.3.0)
+#>  gtable        0.3.4   2023-08-21 [1] CRAN (R 4.3.0)
+#>  here        * 1.0.1   2020-12-13 [1] CRAN (R 4.3.0)
+#>  hms           1.1.3   2023-03-21 [1] CRAN (R 4.3.0)
+#>  htmltools     0.5.7   2023-11-03 [1] CRAN (R 4.3.1)
+#>  jquerylib     0.1.4   2021-04-26 [1] CRAN (R 4.3.0)
+#>  jsonlite      1.8.8   2023-12-04 [1] CRAN (R 4.3.1)
+#>  knitr         1.45    2023-10-30 [1] CRAN (R 4.3.1)
+#>  lifecycle     1.0.4   2023-11-07 [1] CRAN (R 4.3.1)
+#>  lubridate   * 1.9.3   2023-09-27 [1] CRAN (R 4.3.1)
+#>  magrittr      2.0.3   2022-03-30 [1] CRAN (R 4.3.0)
+#>  memoise       2.0.1   2021-11-26 [1] CRAN (R 4.3.0)
+#>  munsell       0.5.0   2018-06-12 [1] CRAN (R 4.3.0)
+#>  pillar        1.9.0   2023-03-22 [1] CRAN (R 4.3.0)
+#>  pkgconfig     2.0.3   2019-09-22 [1] CRAN (R 4.3.0)
+#>  purrr       * 1.0.2   2023-08-10 [1] CRAN (R 4.3.0)
+#>  R.cache       0.16.0  2022-07-21 [1] CRAN (R 4.3.0)
+#>  R.methodsS3   1.8.2   2022-06-13 [1] CRAN (R 4.3.0)
+#>  R.oo          1.25.0  2022-06-12 [1] CRAN (R 4.3.0)
+#>  R.utils       2.12.3  2023-11-18 [1] CRAN (R 4.3.1)
+#>  R6            2.5.1   2021-08-19 [1] CRAN (R 4.3.0)
+#>  readr       * 2.1.4   2023-02-10 [1] CRAN (R 4.3.0)
+#>  rlang         1.1.2   2023-11-04 [1] CRAN (R 4.3.1)
+#>  rmarkdown     2.25    2023-09-18 [1] CRAN (R 4.3.1)
+#>  rprojroot     2.0.4   2023-11-05 [1] CRAN (R 4.3.1)
+#>  rstudioapi    0.15.0  2023-07-07 [1] CRAN (R 4.3.0)
+#>  sass          0.4.7   2023-07-15 [1] CRAN (R 4.3.0)
+#>  scales        1.3.0   2023-11-28 [1] CRAN (R 4.3.1)
+#>  sessioninfo   1.2.2   2021-12-06 [1] CRAN (R 4.3.0)
+#>  stringi       1.8.2   2023-11-23 [1] CRAN (R 4.3.1)
+#>  stringr     * 1.5.1   2023-11-14 [1] CRAN (R 4.3.1)
+#>  styler        1.10.2  2023-08-29 [1] CRAN (R 4.3.0)
+#>  tibble      * 3.2.1   2023-03-20 [1] CRAN (R 4.3.0)
+#>  tidyr       * 1.3.0   2023-01-24 [1] CRAN (R 4.3.0)
+#>  tidyselect    1.2.0   2022-10-10 [1] CRAN (R 4.3.0)
+#>  tidyverse   * 2.0.0   2023-02-22 [1] CRAN (R 4.3.0)
+#>  timechange    0.2.0   2023-01-11 [1] CRAN (R 4.3.0)
+#>  tzdb          0.4.0   2023-05-12 [1] CRAN (R 4.3.0)
+#>  utf8          1.2.4   2023-10-22 [1] CRAN (R 4.3.1)
+#>  vctrs         0.6.5   2023-12-01 [1] CRAN (R 4.3.1)
+#>  withr         2.5.2   2023-10-30 [1] CRAN (R 4.3.1)
+#>  xfun          0.41    2023-11-01 [1] CRAN (R 4.3.1)
+#>  yaml          2.3.7   2023-01-23 [1] CRAN (R 4.3.0)
 #> 
-#>  [1] /Library/Frameworks/R.framework/Versions/4.2/Resources/library
+#>  [1] /Users/kriemo/Library/R/arm64/4.3/library
+#>  [2] /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library
 #> 
 #> ────────────────────────────────────────────────────────────────────
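
One possible way to keep a record like this next to your results is to write it to a text file. This sketch uses base R's `sessionInfo()` rather than the sessioninfo package, and the temporary file path is only a placeholder for a location in your own project.

```r
# Placeholder path; in practice this might live alongside your analysis output
session_file <- tempfile(fileext = ".txt")

# capture.output() turns the printed session details into lines of text
session_lines <- capture.output(sessionInfo())
writeLines(session_lines, session_file)

# Peek at the first few lines that were saved
head(readLines(session_file), 3)
```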
@@ -1800,14 +1809,17 @@

Benchmarking, with m
 res <- microbenchmark::microbenchmark(
   base = read.csv(path_to_file),
   readr = readr::read_csv(path_to_file),
-  times = 5
+  data.table = data.table::fread(path_to_file),
+  times = 5,
+  unit = "ms"
 )
 print(res, signif = 2)

#> Unit: milliseconds
-#>   expr  min   lq mean median   uq  max neval
-#>   base 3300 3500 3500   3600 3600 3700     5
-#>  readr  280  290  370    310  410  560     5
+#>        expr min  lq mean median  uq max neval
+#>        base 750 760  800    800 810 880     5
+#>       readr 180 190  230    230 260 300     5
+#>  data.table 140 190  190    200 210 210     5
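
If microbenchmark is not installed, a cruder comparison can be sketched with base R's `system.time()`. The toy CSV, the helper `time_runs()`, and the two readers compared here are assumptions for illustration, not the class data or the benchmark shown above.

```r
# Write a small throwaway CSV so the example is self-contained
csv_path <- tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:5000, value = rnorm(5000)), csv_path, row.names = FALSE)

# Hypothetical helper: call `fun` several times, return elapsed seconds per run
time_runs <- function(fun, times = 5) {
  vapply(seq_len(times), function(i) system.time(fun())[["elapsed"]], numeric(1))
}

timings <- list(
  read_csv_base = time_runs(function() read.csv(csv_path)),
  read_table    = time_runs(function() read.table(csv_path, header = TRUE, sep = ","))
)

# Median elapsed time in milliseconds for each reader
median_ms <- sapply(timings, function(x) median(x) * 1000)
print(median_ms)
```

`system.time()` has coarser resolution than microbenchmark, so very fast expressions may report 0; for those, the dedicated package is the better tool.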
@@ -1822,8 +1834,8 @@

Benchmarking, with m }) p

-
- +
+

Debugging R code

R has a debugger built in. You can debug a function:
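
As a minimal sketch of how the debug flag behaves (the function `scale_values()` is hypothetical and not from the class material), a function can be flagged, checked, and unflagged like this:

```r
# Hypothetical function used only for illustration
scale_values <- function(x) {
  centered <- x - mean(x)
  centered / sd(x)
}

debug(scale_values)                  # flag it: the next call would open the browser
flagged <- isdebugged(scale_values)  # TRUE while the flag is set
undebug(scale_values)                # remove the flag so calls run normally again

# debugonce(scale_values) would set the flag for a single call only
result <- scale_values(c(1, 2, 3))
print(result)
```

The function is only called after `undebug()`, so the snippet also runs in a non-interactive session; in an interactive session, call the function while it is flagged to step through it line by line.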

@@ -1872,7 +1884,7 @@

JSON

Check out jsonlite

-library(jsonlite)
+library(jsonlite)
 json_file <- "http://api.worldbank.org/country?per_page=10&region=OED&lendingtype=LNX&format=json"
 worldbank_data <- fromJSON(json_file, flatten=TRUE)
 worldbank_data
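
To try `fromJSON()` without a network connection, the same function also accepts a JSON string. The records below are made-up values for illustration only, and the sketch assumes jsonlite is installed.

```r
library(jsonlite)

# Inline JSON standing in for an API response (made-up values)
json_text <- '[{"gene": "geneA", "count": 10}, {"gene": "geneB", "count": 3}]'

# An array of records is simplified into a data frame, one row per record
genes <- fromJSON(json_text)
print(genes)

# toJSON() goes the other way, from a data frame back to JSON text
json_again <- toJSON(genes)
```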
@@ -1977,9 +1989,9 @@

Using R on the command-line

R -e "print('hello')"
 #> 
-#> R version 4.2.0 (2022-04-22) -- "Vigorous Calisthenics"
-#> Copyright (C) 2022 The R Foundation for Statistical Computing
-#> Platform: x86_64-apple-darwin17.0 (64-bit)
+#> R version 4.3.1 (2023-06-16) -- "Beagle Scouts"
+#> Copyright (C) 2023 The R Foundation for Statistical Computing
+#> Platform: aarch64-apple-darwin20 (64-bit)
 #> 
 #> R is free software and comes with ABSOLUTELY NO WARRANTY.
 #> You are welcome to redistribute it under certain conditions.
@@ -2020,7 +2032,7 @@ 

Git and Github

Git is a command line tool for version control, which allows us to:

-  1. rolling back code to a previous state if needed
+  1. roll back code to a previous state if needed
   2. branched development, tackling individual issues/tasks
   3. collaboration

@@ -2042,11 +2054,11 @@

Git and Github

This can be handled by Rstudio as well (new tab next to Connections and Build)

Put your code on GitHub

-As you write more code, especially as functions and script pipelines, hosting and documenting them on GitHub is great way to make them portable and searchable. Even the free tier of GitHub accounts now has private repositories (repo).
+As you write more code, especially as functions and script pipelines, hosting and documenting them on GitHub is a great way to make them portable and searchable. Even the free tier of GitHub accounts now has private repositories (repos).

If you have any interest in a career in data science/informatics, GitHub is also a common showcase of what (and how well/often) you can code. After some accumulation of code, definitely put your GitHub link on your CV/resume.

Check out the quickstart from github: https://docs.github.com/en/get-started/quickstart/hello-world

-Example repos (RBI)
+Example repos

  • this class
  • valr
  • @@ -2068,7 +2080,7 @@

    Finding useful packages

 vignette("Gviz")
 # install.packages("eulerr") # from CRAN
-plot(eulerr::euler(list(set1 = c("geneA", "geneB", "geneC"),
+plot(eulerr::euler(list(set1 = c("geneA", "geneB", "geneC"),
                         set2 = c("geneC", "geneD"))))
@@ -2081,7 +2093,7 @@

Finding useful packages

Bioconductor

-2,000+ R packages dedicated to bioinformatics. Included a coherent framework of data structures (e.g. SummarizedExperiment) built by dedicated Core members. Also includs many annotation and experimental datasets built into R packages and objects (See AnnotationHub and ExperimentHub)
+2,000+ R packages dedicated to bioinformatics. Includes a coherent framework of data structures (e.g. SummarizedExperiment) built by dedicated Core members. Also includes many annotation and experimental datasets built into R packages and objects (See AnnotationHub and ExperimentHub)
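
Installing from Bioconductor goes through the BiocManager package rather than `install.packages()`. The helper below, `install_bioc_if_missing()`, is a hypothetical convenience wrapper and only a sketch: it installs the requested packages that are not already available and does nothing otherwise.

```r
# Hypothetical helper: install only the packages that are not yet available
install_bioc_if_missing <- function(pkgs) {
  available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  missing_pkgs <- pkgs[!available]

  if (length(missing_pkgs) > 0) {
    # BiocManager itself comes from CRAN
    if (!requireNamespace("BiocManager", quietly = TRUE)) {
      install.packages("BiocManager")
    }
    BiocManager::install(missing_pkgs)
  }

  # Return the names that needed installing (empty if none)
  invisible(missing_pkgs)
}

# "stats" ships with R, so this call installs nothing
needed <- install_bioc_if_missing("stats")

# Example for a real Bioconductor package (not run here):
# install_bioc_if_missing("SummarizedExperiment")
```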