The package PAGES is the supporting material for a webinar designed for the PAGES Early-Career Network (ECN). The goal is to give some useful pointers for exploring geological data, in particular stratigraphic occurrences, using RStudio and packages from the tidyverse universe.
This class is modelled after Hadley Wickham and Garrett Grolemund's R for Data Science (R4DS). However, I have augmented the examples with cases from geology.
The construction of the R (R Core Team 2021) package PAGES and associated documentation was aided by the packages devtools (Wickham, Hester, and Chang 2021), roxygen2 (Wickham, Danenberg, et al. 2020), knitr (Xie 2021, 2014, 2015), rmarkdown (Allaire et al. 2021; Xie, Allaire, and Grolemund 2018; Xie, Dervieux, and Riederer 2020), and bibtex (Francois 2020), and by the superb guidance in the book R Packages: Organize, Test, Document, and Share Your Code by Wickham (2015). In addition, this package relies on a set of external packages from the tidyverse universe, including dplyr (Wickham et al. 2021), tidyr (Wickham 2021), tibble (Müller and Wickham 2021), readr (Wickham and Hester 2020), and magrittr (Bache and Wickham 2020). Plots are made with ggplot2 (Wickham, Chang, et al. 2020; Wickham 2016), and thematic (Sievert, Schloerke, and Cheng 2021) is used for a consistent design in the presentation.
The package marelac (Soetaert and Petzoldt 2020) is used for chemical data and transformations, and the package datasauRus (Locke and D’Agostino McGowan 2018) is used to illustrate the value of plotting data.
You can install the released version of PAGES from GitHub with:
# Install PAGES from GitHub:
# install.packages("devtools")
devtools::install_github("MartinSchobben/PAGES", build_vignettes = TRUE)
Load PAGES with library.
library(PAGES)
The study of the Triassic–Jurassic (~201 million years before present) boundary sections of Bonenburg (Germany) and Kuhjoch (Austria) by Schobben et al. (2019) is used as the example material for this course. The lazy-loaded datasets are:
- kuhjoch
  - Kuhjoch is a palynological dataset in which the counts have been summed for spores, pollen, aquatic and terrestrial elements.
- bonenburg
  - Bonenburg is a geochemical dataset containing elemental analyser total organic carbon (TOC) and total nitrogen (TN); XRF element data for aluminium (Al), potassium (K) and sodium (Na); and the carbon isotope composition of TOC (del13Ctoc).
Raw datasets (kuhjoch_raw.csv and bonenburg_raw.csv) can be easily accessed with the PAGES_example() function and a call to the readr function read_csv().
readr::read_csv(PAGES_example("kuhjoch_raw.csv"))
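Helpers of this kind usually wrap system.file() to locate files installed with a package. The following is a minimal sketch, assuming PAGES_example() follows the same pattern as readr::readr_example(); this has not been checked against the PAGES source. The helper name example_path() is hypothetical, and the sketch reads the mtcars.csv file shipped with readr rather than the PAGES data, so that it runs without PAGES installed.

```r
library(readr)

# Hypothetical helper (not part of PAGES): resolve the full path of a file
# stored in the "extdata" directory of an installed package.
example_path <- function(file, package = "readr") {
  system.file("extdata", file, package = package, mustWork = TRUE)
}

# Locate an example file bundled with readr and read it as a tibble
path <- example_path("mtcars.csv")
cars <- read_csv(path)
```

With mustWork = TRUE, a misspelled file name produces an error instead of an empty path, which makes typos easier to spot.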
To render the presentation slides:
render_slides()
Details regarding the exercises and live programming during the webinar can be found in the package vignettes (called with vignette()).
- RStudio projects:
vignette("project", package = "PAGES")
- Exploratory data analysis:
vignette("explore", package = "PAGES")
- Patterns and models:
vignette("model", package = "PAGES")
- Load, tidy and transform data:
vignette("wrangle", package = "PAGES")
The lazy-loaded datasets are provided in a tidy format. See the data-raw directory in the GitHub repository for details on the data processing.
head(bonenburg)
#> # A tibble: 6 x 11
#> section strat strat2 sampleid height CaCO3 TN del13Ctoc TOCcfb Na_Al K_Al
#> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 Bonenb… Cont… Pre-e… 0 3.01 13.3 0.06 -27.5 1.16 0.0314 0.216
#> 2 Bonenb… Cont… Pre-e… 100 3.95 3.84 0.07 -27.3 0.96 0.0250 0.211
#> 3 Bonenb… Cont… Pre-e… 150 4.43 5.86 0.07 -27 1.25 0.0197 0.224
#> 4 Bonenb… Cont… Pre-e… 200 4.94 12.8 0.07 -27.8 1.52 0.0231 0.236
#> 5 Bonenb… Cont… Pre-e… 250 5.25 3.34 0.09 -27.6 2.45 0.0330 0.243
#> 6 Bonenb… Cont… Pre-e… 275 5.68 9.91 0.06 -27 1.19 0.0201 0.251
Besides the wide-format data, similarly named datasets with the suffix _long are provided; these are used to generate, for example, multi-proxy stratigraphic plots for initial data exploration.
library(ggplot2)

# One panel per measurement, each plotted against stratigraphic height
ggplot(data = bonenburg_long) +
  geom_point(mapping = aes(x = value, y = height)) +
  facet_grid(cols = vars(measurement), scales = "free_x") +
  theme_classic()
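A long-format table like the one used in the plot above can be derived from a wide table with tidyr::pivot_longer(). The following is a minimal sketch and not necessarily how the package's own data-raw scripts build bonenburg_long. The toy table holds three rows copied from the head(bonenburg) output, the column names measurement and value are taken from the plotting code, and the object names bonenburg_wide and bonenburg_sketch are illustrative.

```r
library(tibble)
library(tidyr)

# Toy wide table: three rows copied from the head(bonenburg) output
bonenburg_wide <- tibble(
  height    = c(3.01, 3.95, 4.43),
  TN        = c(0.06, 0.07, 0.07),
  del13Ctoc = c(-27.5, -27.3, -27),
  TOCcfb    = c(1.16, 0.96, 1.25)
)

# Stack every column except height into name/value pairs,
# giving one row per height and measurement combination
bonenburg_sketch <- pivot_longer(
  bonenburg_wide,
  cols      = -height,
  names_to  = "measurement",
  values_to = "value"
)
```

Keeping height as an identifier column means each measurement can be faceted into its own panel while sharing the same vertical axis.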
Data science with R
Hadley Wickham & Garrett Grolemund 2016 R for Data Science
General statistics with R
Peter Dalgaard 2008 Introduction to statistics with R
Regression with R
John Fox & Sanford Weisberg 2018 An R companion to applied regression
Mixed effect models with R
Alain Zuur et al. 2008 Mixed Effects Models and Extensions in Ecology with R
Allaire, JJ, Yihui Xie, Jonathan McPherson, Javier Luraschi, Kevin Ushey, Aron Atkins, Hadley Wickham, Joe Cheng, Winston Chang, and Richard Iannone. 2021. Rmarkdown: Dynamic Documents for R. https://CRAN.R-project.org/package=rmarkdown.
Bache, Stefan Milton, and Hadley Wickham. 2020. Magrittr: A Forward-Pipe Operator for R. https://CRAN.R-project.org/package=magrittr.
Dalgaard, Peter. 2008. Introduction to statistics with R. Edited by J Chambers, D Hand, and W. Hardle. Springer. https://doi.org/10.1201/9780429341830-12.
Fox, John, and Sanford Weisberg. 2018. An R companion to applied regression. Sage publications.
Francois, Romain. 2020. Bibtex: Bibtex Parser. https://github.com/romainfrancois/bibtex.
Locke, Steph, and Lucy D’Agostino McGowan. 2018. datasauRus: Datasets from the Datasaurus Dozen. https://CRAN.R-project.org/package=datasauRus.
Müller, Kirill, and Hadley Wickham. 2021. Tibble: Simple Data Frames. https://CRAN.R-project.org/package=tibble.
R Core Team. 2021. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.
Schobben, Martin, Julia Gravendyck, Franziska Mangels, Ulrich Struck, Robert Bussert, Wolfram M. Kürschner, Dieter Korn, P. Martin Sander, and Martin Aberhan. 2019. “A comparative study of total organic carbon-δ13C signatures in the Triassic–Jurassic transitional beds of the Central European Basin and western Tethys shelf seas.” Newsletters on Stratigraphy 52 (4): 461–86. https://doi.org/10.1127/nos/2019/0499.
Sievert, Carson, Barret Schloerke, and Joe Cheng. 2021. Thematic: Unified and Automatic Theming of Ggplot2, Lattice, and Base R Graphics. https://CRAN.R-project.org/package=thematic.
Soetaert, Karline, and Thomas Petzoldt. 2020. Marelac: Tools for Aquatic Sciences. https://CRAN.R-project.org/package=marelac.
Wickham, Hadley. 2015. R Packages: Organize, Test, Document, and Share Your Code. O’Reilly Media, Inc. https://r-pkgs.org/.
———. 2016. Ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. https://ggplot2.tidyverse.org.
———. 2021. Tidyr: Tidy Messy Data. https://CRAN.R-project.org/package=tidyr.
Wickham, Hadley, Winston Chang, Lionel Henry, Thomas Lin Pedersen, Kohske Takahashi, Claus Wilke, Kara Woo, Hiroaki Yutani, and Dewey Dunnington. 2020. Ggplot2: Create Elegant Data Visualisations Using the Grammar of Graphics. https://CRAN.R-project.org/package=ggplot2.
Wickham, Hadley, Peter Danenberg, Gábor Csárdi, and Manuel Eugster. 2020. Roxygen2: In-Line Documentation for R. https://CRAN.R-project.org/package=roxygen2.
Wickham, Hadley, Romain François, Lionel Henry, and Kirill Müller. 2021. Dplyr: A Grammar of Data Manipulation. https://CRAN.R-project.org/package=dplyr.
Wickham, Hadley, and Garrett Grolemund. 2016. R for data science: import, tidy, transform, visualize, and model data. O’Reilly Media, Inc. https://r4ds.had.co.nz/index.html.
Wickham, Hadley, and Jim Hester. 2020. Readr: Read Rectangular Text Data. https://CRAN.R-project.org/package=readr.
Wickham, Hadley, Jim Hester, and Winston Chang. 2021. Devtools: Tools to Make Developing R Packages Easier. https://CRAN.R-project.org/package=devtools.
Xie, Yihui. 2014. “Knitr: A Comprehensive Tool for Reproducible Research in R.” In Implementing Reproducible Computational Research, edited by Victoria Stodden, Friedrich Leisch, and Roger D. Peng. Chapman and Hall/CRC. http://www.crcpress.com/product/isbn/9781466561595.
———. 2015. Dynamic Documents with R and Knitr. 2nd ed. Boca Raton, Florida: Chapman and Hall/CRC. https://yihui.org/knitr/.
———. 2021. Knitr: A General-Purpose Package for Dynamic Report Generation in R. https://yihui.org/knitr/.
Xie, Yihui, J. J. Allaire, and Garrett Grolemund. 2018. R Markdown: The Definitive Guide. Boca Raton, Florida: Chapman and Hall/CRC. https://bookdown.org/yihui/rmarkdown.
Xie, Yihui, Christophe Dervieux, and Emily Riederer. 2020. R Markdown Cookbook. Boca Raton, Florida: Chapman and Hall/CRC. https://bookdown.org/yihui/rmarkdown-cookbook.
Zuur, Alain F., Elena N. Ieno, Neil J. Walker, Anatoly A. Saveliev, and Graham M. Smith. 2008. Mixed Effects Models and Extensions in Ecology with R. https://doi.org/10.4324/9780429201271-2.