diff --git a/_posts/2022-12-05-a-collection-of-r-resources/img/git-basics.png b/_posts/2023-12-14-class-wrap-up-data-analysis-tips-and-resources/img/git-basics.png similarity index 100% rename from _posts/2022-12-05-a-collection-of-r-resources/img/git-basics.png rename to _posts/2023-12-14-class-wrap-up-data-analysis-tips-and-resources/img/git-basics.png diff --git a/_posts/2022-12-05-a-collection-of-r-resources/img/github-flow.png b/_posts/2023-12-14-class-wrap-up-data-analysis-tips-and-resources/img/github-flow.png similarity index 100% rename from _posts/2022-12-05-a-collection-of-r-resources/img/github-flow.png rename to _posts/2023-12-14-class-wrap-up-data-analysis-tips-and-resources/img/github-flow.png diff --git a/_posts/2022-12-05-a-collection-of-r-resources/img/traceback.png b/_posts/2023-12-14-class-wrap-up-data-analysis-tips-and-resources/img/traceback.png similarity index 100% rename from _posts/2022-12-05-a-collection-of-r-resources/img/traceback.png rename to _posts/2023-12-14-class-wrap-up-data-analysis-tips-and-resources/img/traceback.png diff --git a/_posts/2022-12-05-a-collection-of-r-resources/resources.Rmd b/_posts/2023-12-14-class-wrap-up-data-analysis-tips-and-resources/resources.Rmd similarity index 93% rename from _posts/2022-12-05-a-collection-of-r-resources/resources.Rmd rename to _posts/2023-12-14-class-wrap-up-data-analysis-tips-and-resources/resources.Rmd index 40756cb..9e826e7 100644 --- a/_posts/2022-12-05-a-collection-of-r-resources/resources.Rmd +++ b/_posts/2023-12-14-class-wrap-up-data-analysis-tips-and-resources/resources.Rmd @@ -2,7 +2,7 @@ title: "Class wrap up: Data analysis, tips and resources" author: - name: "Rui Fu, Kent Riemondy" -date: 2022-12-05 +date: 2023-12-14 output: distill::distill_article: self_contained: false @@ -24,11 +24,13 @@ library(here) -## Rmarkdown +## Rmarkdown and Quarto Read the [Guide to RMarkdown](https://bookdown.org/yihui/rmarkdown/) for an exhaustive description of the various formats and 
options for using RMarkdown documents. Note that HTML for this class were all made from Rmd, using the [distill blog format](https://rstudio.github.io/distill/) -*The Rmarkdown for this class is [on github]( https://github.com/rnabioco/bmsc-7810-pbda/blob/main/_posts/2022-12-05-a-collection-of-r-resources/resources.Rmd)* +There is also a newer format, also built by Rstudio (now named Posit) called [Quarto](https://quarto.org/). Quarto documents are very similar to RMarkdown, have broader support for additional programming languages, and will likely eventually replace the Rmarkdown format. + +*The Rmarkdown for this post is [on github]( https://github.com/rnabioco/bmsc-7810-pbda/blob/main/_posts/2023-12-14-class-wrap-up-data-analysis-tips-and-resources/resources.Rmd)* ### Caching @@ -127,7 +129,9 @@ path_to_file <- here("data/class3/dmel_peptides_lifecycle.csv.gz") res <- microbenchmark::microbenchmark( base = read.csv(path_to_file), readr = readr::read_csv(path_to_file), - times = 5 + data.table = data.table::fread(path_to_file), + times = 5, + unit = "ms" ) print(res, signif = 2) ``` @@ -284,7 +288,7 @@ path/to/cool_function.R argument1 argument2 ... Git is a command line tool for version control, which allows us to: -1. rolling back code to a previous state if needed +1. roll back code to a previous state if needed 2. branched development, tackling individual issues/tasks @@ -313,14 +317,14 @@ This can be handled by Rstudio as well (new tab next to `Connections` and `Build ### Put your code on GitHub -As you write more code, especially as functions and script pipelines, hosting and documenting them on GitHub is great way to make them portable and searchable. Even the free tier of GitHub accounts now has private repositories (repo). +As you write more code, especially as functions and script pipelines, hosting and documenting them on GitHub is great way to make them portable and searchable. 
Even the free tier of GitHub accounts now has private repositories (repos). If you have any interest in a career in data science/informatics, GitHub is also a common showcase of what (and how well/often) you can code. After some accumulation of code, definitely put your GitHub link on your CV/resume. Check out the quickstart from github: https://docs.github.com/en/get-started/quickstart/hello-world -### Example repos (RBI) +### Example repos - [this class](https://github.com/rnabioco/bmsc-7810-pbda) - [valr](https://github.com/rnabioco/valr) @@ -359,7 +363,7 @@ emo::ji("smile") ![](https://bioconductor.org/images/logo_bioconductor.gif) -2,000+ R packages dedicated to bioinformatics. Included a coherent framework of data structures (e.g. SummarizedExperiment) built by dedicated Core members. Also includs many annotation and experimental datasets built into R packages and objects (See AnnotationHub and ExperimentHub) +2,000+ R packages dedicated to bioinformatics. Includes a coherent framework of data structures (e.g. SummarizedExperiment) built by dedicated Core members. Also includes many annotation and experimental datasets built into R packages and objects (See AnnotationHub and ExperimentHub) - https://bioconductor.org/ - Use `BiocManager::install()` to install these packages @@ -390,8 +394,6 @@ emo::ji("smile") Rstudio links to common ones here: `Help` -> `Cheatsheets`. More are hosted online, such as for [regular expressions](https://rstudio.com/wp-content/uploads/2016/09/RegExCheatsheet.pdf). -Useful to keep your own stash too. 
- ## Offline help diff --git a/_posts/2022-12-05-a-collection-of-r-resources/resources.html b/_posts/2023-12-14-class-wrap-up-data-analysis-tips-and-resources/resources.html similarity index 79% rename from _posts/2022-12-05-a-collection-of-r-resources/resources.html rename to _posts/2023-12-14-class-wrap-up-data-analysis-tips-and-resources/resources.html index 3e40263..cf573f2 100644 --- a/_posts/2022-12-05-a-collection-of-r-resources/resources.html +++ b/_posts/2023-12-14-class-wrap-up-data-analysis-tips-and-resources/resources.html @@ -93,8 +93,8 @@ - - + + @@ -110,12 +110,12 @@ @@ -227,9 +227,21 @@ display: none !important; } + hr.section-separator { + border: none; + border-top: 1px solid rgba(0, 0, 0, 0.1); + margin: 0px; + } + + + d-byline { + border-top: none; + } + d-article { padding-top: 2.5rem; padding-bottom: 30px; + border-top: none; } d-appendix { @@ -326,6 +338,11 @@ font-size: 14px; } + /* tweak for Pandoc numbered line within distill */ + d-article pre.numberSource code > span { + left: -2em; + } + d-article pre { font-size: 14px; } @@ -1085,6 +1102,12 @@ // create d-title $('.d-title').changeElementType('d-title'); + // separator + var separator = '
Read the Guide to RMarkdown for an exhaustive description of the various formats and options for using RMarkdown documents. Note that the HTML pages for this class were all made from Rmd, using the distill blog format.
-The Rmarkdown for this class is on github
+There is also a newer format called Quarto, also built by Rstudio (now named Posit). Quarto documents are very similar to RMarkdown, support more programming languages, and will likely eventually replace the Rmarkdown format.
+The Rmarkdown for this post is on github
You can speed up knitting of your Rmds by using caching to store the results from each chunk, instead of rerunning them each time. Note that if you modify the code chunk, previous caching is ignored.
For each chunk, set {r, cache = TRUE}
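As an alternative to setting the option chunk by chunk, caching can also be switched on for a whole document. A minimal sketch, assuming the knitr package is installed and that this code is placed in a setup chunk near the top of the Rmd:

```r
# Make cache = TRUE the default for all later chunks in this document.
knitr::opts_chunk$set(cache = TRUE)

# Check the current default value of the option.
knitr::opts_chunk$get("cache")
```

Individual chunks can still opt out with `{r, cache = FALSE}`.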
styler
, clean up code rea
}
")
#>
-#> my_fun <- function(x,
-#> y,
-#> z) {
+#> my_fun <- function(
+#> x,
+#> y,
+#> z) {
#> x + z
#> }
@@ -1641,49 +1668,46 @@ Print environment inform
-#> R version 4.2.0 (2022-04-22)
-#> Platform: x86_64-apple-darwin17.0 (64-bit)
-#> Running under: macOS Big Sur/Monterey 10.16
+#> R version 4.3.1 (2023-06-16)
+#> Platform: aarch64-apple-darwin20 (64-bit)
+#> Running under: macOS Monterey 12.2.1
#>
#> Matrix products: default
-#> BLAS: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRblas.0.dylib
-#> LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib
+#> BLAS: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRblas.0.dylib
+#> LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0
#>
#> locale:
#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#>
+#> time zone: America/Denver
+#> tzcode source: internal
+#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods
#> [7] base
#>
#> other attached packages:
-#> [1] here_1.0.1 forcats_0.5.1 stringr_1.4.1 dplyr_1.0.10
-#> [5] purrr_0.3.5 readr_2.1.2 tidyr_1.2.0 tibble_3.1.8
-#> [9] ggplot2_3.3.6 tidyverse_1.3.1
+#> [1] here_1.0.1 lubridate_1.9.3 forcats_1.0.0 stringr_1.5.1
+#> [5] dplyr_1.1.4 purrr_1.0.2 readr_2.1.4 tidyr_1.3.0
+#> [9] tibble_3.2.1 ggplot2_3.4.4 tidyverse_2.0.0
#>
#> loaded via a namespace (and not attached):
-#> [1] lubridate_1.8.0 assertthat_0.2.1 rprojroot_2.0.3
-#> [4] digest_0.6.30 utf8_1.2.2 prettycode_1.1.0
-#> [7] R6_2.5.1 cellranger_1.1.0 backports_1.4.1
-#> [10] reprex_2.0.1 evaluate_0.16 httr_1.4.4
-#> [13] pillar_1.8.1 rlang_1.0.6 readxl_1.4.0
-#> [16] rstudioapi_0.13 jquerylib_0.1.4 R.utils_2.12.0
-#> [19] R.oo_1.25.0 rmarkdown_2.14 styler_1.7.0
-#> [22] munsell_0.5.0 broom_0.8.0 compiler_4.2.0
-#> [25] modelr_0.1.8 xfun_0.32 pkgconfig_2.0.3
-#> [28] htmltools_0.5.2 downlit_0.4.2 tidyselect_1.2.0
-#> [31] fansi_1.0.3 crayon_1.5.2 tzdb_0.3.0
-#> [34] dbplyr_2.2.1 withr_2.5.0 R.methodsS3_1.8.2
-#> [37] grid_4.2.0 jsonlite_1.8.3 gtable_0.3.0
-#> [40] lifecycle_1.0.3 DBI_1.1.3 magrittr_2.0.3
-#> [43] scales_1.2.0 cli_3.4.1 stringi_1.7.8
-#> [46] cachem_1.0.6 fs_1.5.2 xml2_1.3.3
-#> [49] bslib_0.3.1 ellipsis_0.3.2 generics_0.1.3
-#> [52] vctrs_0.4.1 distill_1.5 tools_4.2.0
-#> [55] R.cache_0.15.0 glue_1.6.2 hms_1.1.2
-#> [58] fastmap_1.1.0 yaml_2.3.6 colorspace_2.0-3
-#> [61] rvest_1.0.2 memoise_2.0.1 knitr_1.39
-#> [64] haven_2.5.0 sass_0.4.1
+#> [1] styler_1.10.2 sass_0.4.7 utf8_1.2.4
+#> [4] generics_0.1.3 stringi_1.8.2 distill_1.6
+#> [7] hms_1.1.3 digest_0.6.33 magrittr_2.0.3
+#> [10] evaluate_0.23 grid_4.3.1 timechange_0.2.0
+#> [13] fastmap_1.1.1 R.oo_1.25.0 R.cache_0.16.0
+#> [16] rprojroot_2.0.4 jsonlite_1.8.8 R.utils_2.12.3
+#> [19] fansi_1.0.5 scales_1.3.0 jquerylib_0.1.4
+#> [22] cli_3.6.1 rlang_1.1.2 R.methodsS3_1.8.2
+#> [25] munsell_0.5.0 withr_2.5.2 cachem_1.0.8
+#> [28] yaml_2.3.7 tools_4.3.1 tzdb_0.4.0
+#> [31] memoise_2.0.1 colorspace_2.1-0 vctrs_0.6.5
+#> [34] R6_2.5.1 lifecycle_1.0.4 pkgconfig_2.0.3
+#> [37] pillar_1.9.0 bslib_0.6.1 gtable_0.3.4
+#> [40] glue_1.6.2 xfun_0.41 tidyselect_1.2.0
+#> [43] rstudioapi_0.15.0 knitr_1.45 htmltools_0.5.7
+#> [46] rmarkdown_2.25 compiler_4.3.1 downlit_0.4.3
See also the sessioninfo
package, which provides more details:
@@ -1692,94 +1716,79 @@ Print environment inform
#> ─ Session info ─────────────────────────────────────────────────────
#> setting value
-#> version R version 4.2.0 (2022-04-22)
-#> os macOS Big Sur/Monterey 10.16
-#> system x86_64, darwin17.0
+#> version R version 4.3.1 (2023-06-16)
+#> os macOS Monterey 12.2.1
+#> system aarch64, darwin20
#> ui X11
#> language (EN)
#> collate en_US.UTF-8
#> ctype en_US.UTF-8
#> tz America/Denver
-#> date 2022-12-16
-#> pandoc 2.19.2 @ /Applications/RStudio.app/Contents/MacOS/quarto/bin/tools/ (via rmarkdown)
+#> date 2023-12-14
+#> pandoc 3.1.1 @ /Applications/RStudio.app/Contents/Resources/app/quarto/bin/tools/ (via rmarkdown)
#>
#> ─ Packages ─────────────────────────────────────────────────────────
#> package * version date (UTC) lib source
-#> assertthat 0.2.1 2019-03-21 [1] CRAN (R 4.2.0)
-#> backports 1.4.1 2021-12-13 [1] CRAN (R 4.2.0)
-#> broom 0.8.0 2022-04-13 [1] CRAN (R 4.2.0)
-#> bslib 0.3.1 2021-10-06 [1] CRAN (R 4.2.0)
-#> cachem 1.0.6 2021-08-19 [1] CRAN (R 4.2.0)
-#> cellranger 1.1.0 2016-07-27 [1] CRAN (R 4.2.0)
-#> cli 3.4.1 2022-09-23 [1] CRAN (R 4.2.0)
-#> colorspace 2.0-3 2022-02-21 [1] CRAN (R 4.2.0)
-#> crayon 1.5.2 2022-09-29 [1] CRAN (R 4.2.0)
-#> DBI 1.1.3 2022-06-18 [1] CRAN (R 4.2.0)
-#> dbplyr 2.2.1 2022-06-27 [1] CRAN (R 4.2.0)
-#> digest 0.6.30 2022-10-18 [1] CRAN (R 4.2.0)
-#> distill 1.5 2022-09-07 [1] CRAN (R 4.2.0)
-#> downlit 0.4.2 2022-07-05 [1] CRAN (R 4.2.0)
-#> dplyr * 1.0.10 2022-09-01 [1] CRAN (R 4.2.0)
-#> ellipsis 0.3.2 2021-04-29 [1] CRAN (R 4.2.0)
-#> evaluate 0.16 2022-08-09 [1] CRAN (R 4.2.0)
-#> fansi 1.0.3 2022-03-24 [1] CRAN (R 4.2.0)
-#> fastmap 1.1.0 2021-01-25 [1] CRAN (R 4.2.0)
-#> forcats * 0.5.1 2021-01-27 [1] CRAN (R 4.2.0)
-#> fs 1.5.2 2021-12-08 [1] CRAN (R 4.2.0)
-#> generics 0.1.3 2022-07-05 [1] CRAN (R 4.2.0)
-#> ggplot2 * 3.3.6 2022-05-03 [1] CRAN (R 4.2.0)
-#> glue 1.6.2 2022-02-24 [1] CRAN (R 4.2.0)
-#> gtable 0.3.0 2019-03-25 [1] CRAN (R 4.2.0)
-#> haven 2.5.0 2022-04-15 [1] CRAN (R 4.2.0)
-#> here * 1.0.1 2020-12-13 [1] CRAN (R 4.2.0)
-#> hms 1.1.2 2022-08-19 [1] CRAN (R 4.2.0)
-#> htmltools 0.5.2 2021-08-25 [1] CRAN (R 4.2.0)
-#> httr 1.4.4 2022-08-17 [1] CRAN (R 4.2.0)
-#> jquerylib 0.1.4 2021-04-26 [1] CRAN (R 4.2.0)
-#> jsonlite 1.8.3 2022-10-21 [1] CRAN (R 4.2.0)
-#> knitr 1.39 2022-04-26 [1] CRAN (R 4.2.0)
-#> lifecycle 1.0.3 2022-10-07 [1] CRAN (R 4.2.0)
-#> lubridate 1.8.0 2021-10-07 [1] CRAN (R 4.2.0)
-#> magrittr 2.0.3 2022-03-30 [1] CRAN (R 4.2.0)
-#> memoise 2.0.1 2021-11-26 [1] CRAN (R 4.2.0)
-#> modelr 0.1.8 2020-05-19 [1] CRAN (R 4.2.0)
-#> munsell 0.5.0 2018-06-12 [1] CRAN (R 4.2.0)
-#> pillar 1.8.1 2022-08-19 [1] CRAN (R 4.2.0)
-#> pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.2.0)
-#> prettycode 1.1.0 2019-12-16 [1] CRAN (R 4.2.0)
-#> purrr * 0.3.5 2022-10-06 [1] CRAN (R 4.2.0)
-#> R.cache 0.15.0 2021-04-30 [1] CRAN (R 4.2.0)
-#> R.methodsS3 1.8.2 2022-06-13 [1] CRAN (R 4.2.0)
-#> R.oo 1.25.0 2022-06-12 [1] CRAN (R 4.2.0)
-#> R.utils 2.12.0 2022-06-28 [1] CRAN (R 4.2.0)
-#> R6 2.5.1 2021-08-19 [1] CRAN (R 4.2.0)
-#> readr * 2.1.2 2022-01-30 [1] CRAN (R 4.2.0)
-#> readxl 1.4.0 2022-03-28 [1] CRAN (R 4.2.0)
-#> reprex 2.0.1 2021-08-05 [1] CRAN (R 4.2.0)
-#> rlang 1.0.6 2022-09-24 [1] CRAN (R 4.2.0)
-#> rmarkdown 2.14 2022-04-25 [1] CRAN (R 4.2.0)
-#> rprojroot 2.0.3 2022-04-02 [1] CRAN (R 4.2.0)
-#> rstudioapi 0.13 2020-11-12 [1] CRAN (R 4.2.0)
-#> rvest 1.0.2 2021-10-16 [1] CRAN (R 4.2.0)
-#> sass 0.4.1 2022-03-23 [1] CRAN (R 4.2.0)
-#> scales 1.2.0 2022-04-13 [1] CRAN (R 4.2.0)
-#> sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.2.0)
-#> stringi 1.7.8 2022-07-11 [1] CRAN (R 4.2.0)
-#> stringr * 1.4.1 2022-08-20 [1] CRAN (R 4.2.0)
-#> styler 1.7.0 2022-03-13 [1] CRAN (R 4.2.0)
-#> tibble * 3.1.8 2022-07-22 [1] CRAN (R 4.2.0)
-#> tidyr * 1.2.0 2022-02-01 [1] CRAN (R 4.2.0)
-#> tidyselect 1.2.0 2022-10-10 [1] CRAN (R 4.2.0)
-#> tidyverse * 1.3.1 2021-04-15 [1] CRAN (R 4.2.0)
-#> tzdb 0.3.0 2022-03-28 [1] CRAN (R 4.2.0)
-#> utf8 1.2.2 2021-07-24 [1] CRAN (R 4.2.0)
-#> vctrs 0.4.1 2022-04-13 [1] CRAN (R 4.2.0)
-#> withr 2.5.0 2022-03-03 [1] CRAN (R 4.2.0)
-#> xfun 0.32 2022-08-10 [1] CRAN (R 4.2.0)
-#> xml2 1.3.3 2021-11-30 [1] CRAN (R 4.2.0)
-#> yaml 2.3.6 2022-10-18 [1] CRAN (R 4.2.0)
+#> bslib 0.6.1 2023-11-28 [1] CRAN (R 4.3.1)
+#> cachem 1.0.8 2023-05-01 [1] CRAN (R 4.3.0)
+#> cli 3.6.1 2023-03-23 [1] CRAN (R 4.3.0)
+#> colorspace 2.1-0 2023-01-23 [1] CRAN (R 4.3.0)
+#> digest 0.6.33 2023-07-07 [1] CRAN (R 4.3.0)
+#> distill 1.6 2023-10-06 [1] CRAN (R 4.3.1)
+#> downlit 0.4.3 2023-06-29 [1] CRAN (R 4.3.0)
+#> dplyr * 1.1.4 2023-11-17 [1] CRAN (R 4.3.1)
+#> evaluate 0.23 2023-11-01 [1] CRAN (R 4.3.1)
+#> fansi 1.0.5 2023-10-08 [1] CRAN (R 4.3.1)
+#> fastmap 1.1.1 2023-02-24 [1] CRAN (R 4.3.0)
+#> forcats * 1.0.0 2023-01-29 [1] CRAN (R 4.3.0)
+#> generics 0.1.3 2022-07-05 [1] CRAN (R 4.3.0)
+#> ggplot2 * 3.4.4 2023-10-12 [1] CRAN (R 4.3.1)
+#> glue 1.6.2 2022-02-24 [1] CRAN (R 4.3.0)
+#> gtable 0.3.4 2023-08-21 [1] CRAN (R 4.3.0)
+#> here * 1.0.1 2020-12-13 [1] CRAN (R 4.3.0)
+#> hms 1.1.3 2023-03-21 [1] CRAN (R 4.3.0)
+#> htmltools 0.5.7 2023-11-03 [1] CRAN (R 4.3.1)
+#> jquerylib 0.1.4 2021-04-26 [1] CRAN (R 4.3.0)
+#> jsonlite 1.8.8 2023-12-04 [1] CRAN (R 4.3.1)
+#> knitr 1.45 2023-10-30 [1] CRAN (R 4.3.1)
+#> lifecycle 1.0.4 2023-11-07 [1] CRAN (R 4.3.1)
+#> lubridate * 1.9.3 2023-09-27 [1] CRAN (R 4.3.1)
+#> magrittr 2.0.3 2022-03-30 [1] CRAN (R 4.3.0)
+#> memoise 2.0.1 2021-11-26 [1] CRAN (R 4.3.0)
+#> munsell 0.5.0 2018-06-12 [1] CRAN (R 4.3.0)
+#> pillar 1.9.0 2023-03-22 [1] CRAN (R 4.3.0)
+#> pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.3.0)
+#> purrr * 1.0.2 2023-08-10 [1] CRAN (R 4.3.0)
+#> R.cache 0.16.0 2022-07-21 [1] CRAN (R 4.3.0)
+#> R.methodsS3 1.8.2 2022-06-13 [1] CRAN (R 4.3.0)
+#> R.oo 1.25.0 2022-06-12 [1] CRAN (R 4.3.0)
+#> R.utils 2.12.3 2023-11-18 [1] CRAN (R 4.3.1)
+#> R6 2.5.1 2021-08-19 [1] CRAN (R 4.3.0)
+#> readr * 2.1.4 2023-02-10 [1] CRAN (R 4.3.0)
+#> rlang 1.1.2 2023-11-04 [1] CRAN (R 4.3.1)
+#> rmarkdown 2.25 2023-09-18 [1] CRAN (R 4.3.1)
+#> rprojroot 2.0.4 2023-11-05 [1] CRAN (R 4.3.1)
+#> rstudioapi 0.15.0 2023-07-07 [1] CRAN (R 4.3.0)
+#> sass 0.4.7 2023-07-15 [1] CRAN (R 4.3.0)
+#> scales 1.3.0 2023-11-28 [1] CRAN (R 4.3.1)
+#> sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.3.0)
+#> stringi 1.8.2 2023-11-23 [1] CRAN (R 4.3.1)
+#> stringr * 1.5.1 2023-11-14 [1] CRAN (R 4.3.1)
+#> styler 1.10.2 2023-08-29 [1] CRAN (R 4.3.0)
+#> tibble * 3.2.1 2023-03-20 [1] CRAN (R 4.3.0)
+#> tidyr * 1.3.0 2023-01-24 [1] CRAN (R 4.3.0)
+#> tidyselect 1.2.0 2022-10-10 [1] CRAN (R 4.3.0)
+#> tidyverse * 2.0.0 2023-02-22 [1] CRAN (R 4.3.0)
+#> timechange 0.2.0 2023-01-11 [1] CRAN (R 4.3.0)
+#> tzdb 0.4.0 2023-05-12 [1] CRAN (R 4.3.0)
+#> utf8 1.2.4 2023-10-22 [1] CRAN (R 4.3.1)
+#> vctrs 0.6.5 2023-12-01 [1] CRAN (R 4.3.1)
+#> withr 2.5.2 2023-10-30 [1] CRAN (R 4.3.1)
+#> xfun 0.41 2023-11-01 [1] CRAN (R 4.3.1)
+#> yaml 2.3.7 2023-01-23 [1] CRAN (R 4.3.0)
#>
-#> [1] /Library/Frameworks/R.framework/Versions/4.2/Resources/library
+#> [1] /Users/kriemo/Library/R/arm64/4.3/library
+#> [2] /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library
#>
#> ────────────────────────────────────────────────────────────────────
@@ -1800,14 +1809,17 @@ Benchmarking, with m
res <- microbenchmark::microbenchmark(
base = read.csv(path_to_file),
readr = readr::read_csv(path_to_file),
- times = 5
+ data.table = data.table::fread(path_to_file),
+ times = 5,
+ unit = "ms"
)
print(res, signif = 2)
#> Unit: milliseconds
-#> expr min lq mean median uq max neval
-#> base 3300 3500 3500 3600 3600 3700 5
-#> readr 280 290 370 310 410 560 5
+#> expr min lq mean median uq max neval
+#> base 750 760 800 800 810 880 5
+#> readr 180 190 230 230 260 300 5
+#> data.table 140 190 190 200 210 210 5
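The printed table can also be reduced to relative speeds. The following is a sketch only, assuming the microbenchmark package is installed; the temporary CSV file and the `base_typed` variant are hypothetical stand-ins, not the class data or the comparison shown above:

```r
# Hypothetical stand-in data: a small CSV written to a temporary file.
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:1000, value = rnorm(1000)), tmp, row.names = FALSE)

res <- microbenchmark::microbenchmark(
  base = read.csv(tmp),
  base_typed = read.csv(tmp, colClasses = c("integer", "numeric")),
  times = 5
)

# summary() returns a data frame; keep the median timing (in ms) per expression.
timings <- summary(res, unit = "ms")[, c("expr", "median")]

# Express each median relative to the fastest expression (fastest = 1).
timings$relative <- timings$median / min(timings$median)
timings
```

Timings vary between runs and machines, so the relative column is more informative than the absolute numbers.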
R has a debugger built in. You can debug a function:
@@ -1872,7 +1884,7 @@Check out jsonlite
library(jsonlite)
+library(jsonlite)
json_file <- "http://api.worldbank.org/country?per_page=10&region=OED&lendingtype=LNX&format=json"
worldbank_data <- fromJSON(json_file, flatten=TRUE)
worldbank_data
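To see what `flatten = TRUE` does without a network request, here is a minimal sketch assuming jsonlite is installed. The inline JSON is hypothetical and is not real World Bank output:

```r
library(jsonlite)

# Hypothetical inline JSON with a nested "region" object per record.
json_text <- '[
  {"name": "Country A", "region": {"id": "R1", "value": "Region one"}},
  {"name": "Country B", "region": {"id": "R2", "value": "Region two"}}
]'

# Without flatten, "region" is kept as a nested data frame column.
nested <- fromJSON(json_text)

# With flatten = TRUE, nested fields become ordinary columns
# named "region.id" and "region.value".
flat <- fromJSON(json_text, flatten = TRUE)

names(nested)
names(flat)
```

Flattened columns are usually easier to pass to dplyr or ggplot2 than nested data frame columns.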
@@ -1977,9 +1989,9 @@ Using R on the command-line
R -e "print('hello')"
#>
-#> R version 4.2.0 (2022-04-22) -- "Vigorous Calisthenics"
-#> Copyright (C) 2022 The R Foundation for Statistical Computing
-#> Platform: x86_64-apple-darwin17.0 (64-bit)
+#> R version 4.3.1 (2023-06-16) -- "Beagle Scouts"
+#> Copyright (C) 2023 The R Foundation for Statistical Computing
+#> Platform: aarch64-apple-darwin20 (64-bit)
#>
#> R is free software and comes with ABSOLUTELY NO WARRANTY.
#> You are welcome to redistribute it under certain conditions.
@@ -2020,7 +2032,7 @@ Git and Github
Git is a command line tool for version control, which allows us to:
-rolling back code to a previous state if needed
+roll back code to a previous state if needed
branched development, tackling individual issues/tasks
collaboration
@@ -2042,11 +2054,11 @@ Git and Github
This can be handled by Rstudio as well (new tab next to Connections
and Build
)
Put your code on GitHub
-As you write more code, especially as functions and script pipelines, hosting and documenting them on GitHub is great way to make them portable and searchable. Even the free tier of GitHub accounts now has private repositories (repo).
+As you write more code, especially as functions and script pipelines, hosting and documenting them on GitHub is a great way to make them portable and searchable. Even the free tier of GitHub accounts now has private repositories (repos).
If you have any interest in a career in data science/informatics, GitHub is also a common showcase of what (and how well/often) you can code. After some accumulation of code, definitely put your GitHub link on your CV/resume.
Check out the quickstart from github:
https://docs.github.com/en/get-started/quickstart/hello-world
-Example repos (RBI)
+Example repos
2,000+ R packages dedicated to bioinformatics. Included a coherent framework of data structures (e.g. SummarizedExperiment) built by dedicated Core members. Also includs many annotation and experimental datasets built into R packages and objects (See AnnotationHub and ExperimentHub)
+2,000+ R packages dedicated to bioinformatics. Includes a coherent framework of data structures (e.g. SummarizedExperiment) built by dedicated Core members. Also includes many annotation and experimental datasets built into R packages and objects (See AnnotationHub and ExperimentHub)
Find out if others are having similar issues by searching the issue on the package GitHub page.
Rstudio links to common ones here: Help
-> Cheatsheets
. More are hosted online, such as for regular expressions.
Useful to keep your own stash too.
The RBI fellows hold standing office hours on Thursdays over zoom. We are happy to help out with coding and RNA/DNA-related informatics questions. Send us an email to schedule a time (rbi.fellows@cuanschutz.edu
).