Yuxiao Luo 11/19/2021
This is a short tutorial to introduce R Markdown for someone who has experience in R and zero-experience in Markdown, and will discuss the basic Markdown syntax and essential features in R Markdown. Fore more details and systematic learning purpose, I recommend you read this book to know everything about R Markdown: R Markdown: The Definite Guide.
Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. It is easy to format headings, bold text, italics, etc. There is a quick reference guide as shown in Figure 1 and links to the RStudio cheatsheets (a Markdown cheatsheet is included) can be found in the Help drop-down menu.
Markdown file (with a suffix .md
) can be created and written in many
text editors, ex., Notedpad++, Visual
Studio Code, Windows
Notepad, and Terminal
Window.
In short, Markdown files can be seamlessly opened and organized in
almost any editors. In this tutorial, we are going to create and edit a
Markdown file in RStudio.
Let’s walk through the basic Markdown syntax together and try to create a course Syllabus in Markdown. Here are some of the hints:
- Emphasis:
*
or**
- Headers:
#
,##
, and###
- Lists:
- Unordered list:
*
,-
,+
, etc.. - Ordered list:
1
,2
, etc..
- Unordered list:
- Manual Line Breaks: end a line with two or more spaces
- Links:
[My personal website](https://yuxiaoluo.github.io)
- Images:
![alt text](figures/img.png)
- Horizontal Rule/Page Break:
******
R Markdown provides an authoring framework for data science and can be used to:
-
save and execute code
-
generate high quality reports that can be shared with an audience
R Markdown documents are fully reproducible and support dozens of static and dynamic output formats. This 1-minute video should help you better understand its use.
R Markdown is free and open source and is integrated in the resourceful
R library, CRAN. Use the command to
install the rmarkdown
package in R.
install.packages("rmarkdown")
library(rmarkdown)
Then, you can navigate to the Menu bar and open a new R markdown file
following File
–> New File
–> R Markdown...
.
When you click the Knit button a document will be generated that
includes both content as well as the output of any embedded R code
chunks within the document. R Markdown generates a new file that
contains selected text, code, and results from the .Rmd
file. The new
file can be a finished web page, PDF, MS Word document, slide show,
notebook, handout, book, dashboard, package vignette or other format.
When you run render
, R Markdown feeds the .Rmd
file to knitr
,
which executes all of the code chunks and creates a new markdown (.md)
document which includes the code and its output. R Markdown can be
rendered to different formats and to more than one type of documents a
the same time by specifying one or more formats in the output argument.
You can either use the RStudio IDE Knit
button on the top bar of the
.Rmd
script window to render your .Rmd
file to other format, or
write therender
function in the RStudio console to do the same thing.
Set the output_format
argument of render
to render your .Rmd
file
into any of R Markdown’s supported formats. For example, the code below
renders 1-example.Rmd to a Microsoft Word document.
library(rmarkdown)
render("1-example.Rmd", output_format = "word_document")
If you do not select a format, R Markdown renders the file to its
default format, which you can set in the output field of a .Rmd
file’s
header. The header of a raw R Markdown script shows that it renders to
an HTML file by default. The list below shows the variable formats to
output with R Markdown.
-
html_notebook - Interactive R Notebooks
-
html_document - HTML document w/ Bootstrap CSS
-
pdf_document - PDF document (via LaTeX template)
-
word_document - Microsoft Word document (docx)
-
odt_document - OpenDocument Text document
-
rtf_document - Rich Text Format document
-
md_document - Markdown document (various flavor
Please also be aware of that this short list doesn’t include all the possible formats and you have other options, ex., github_document, which s GitHub Flavored Markdown document. You can also build books, websites, and interactive documents with R Markdown. For more details, you can check out the this lesson.
In R Markdown, you edit the R code in a code chunk. Ctrl+Alt+i
is the
shortcut for creating R code chunk. You can also create code chunks
executing other programming languages like Python, Bash, SQL, etc.
You can embed an R code chunk like this:
summary(cars)
## speed dist
## Min. : 4.0 Min. : 2.00
## 1st Qu.:12.0 1st Qu.: 26.00
## Median :15.0 Median : 36.00
## Mean :15.4 Mean : 42.98
## 3rd Qu.:19.0 3rd Qu.: 56.00
## Max. :25.0 Max. :120.00
It’s important to remember when you are creating an RMarkdown file that if you want to run code that refers to an object, for example:
print(dataframe)
you should include instructions showing what is the dataframe
, just
like a normal step in R script. For example:
A <- c("a", "a", "b", "b")
B <- c(5, 9, 85, 23)
dataframe <- data.frame(A, B)
print(dataframe)
## A B
## 1 a 5
## 2 a 9
## 3 b 85
## 4 b 23
If you are loading a dataframe from a .csv
file, you must include the
code in the .Rmd
dataframe <- read.csv("~/Desktop/Code/dataframe.csv")
You can also embed plots, for example:
Note that the echo = FALSE
parameter was added to the code chunk to
prevent printing of the R code that generated the plot.
Inserting a graph into RMarkdown is easy, the more energy-demanding aspect might be adjusting the formatting.
By default, RMarkdown will place graphs by maximizing their height, while keeping them within the margins of the page and maintaining aspect ratio. If you have a particularly tall figure, this can mean a really huge graph. In the following example we modify the dimensions of the figure we created above. To manually set the figure dimensions, you can insert an instruction into the curly braces:
A <- c("a", "a", "b", "b")
B <- c(5, 10, 15, 20)
dataframe <- data.frame(A, B)
print(dataframe)
## A B
## 1 a 5
## 2 a 10
## 3 b 15
## 4 b 20
boxplot(B~A,data=dataframe)
While R Markdown can print the contents of a dataframe easily by enclosing the name of the data frame in code chunk.
Another pleasing way for simple table formatting is kable()
in the
knitr
package. The first argument tells kable to make a table out of
the object dataframe
and that numbers hould have to significant
figures.
library(knitr)
## Warning: package 'knitr' was built under R version 4.0.3
kable(dataframe, digits = 2)
A | B |
---|---|
a | 5 |
a | 10 |
b | 15 |
b | 20 |
If you want to control the content in your table, you can use pander()
in the package pander
. Remember to load the package in code chunk as
well. The code below I want the 3rd column to appear in italics:
library(pander)
plant <- c("a", "b", "c")
temperature <- c(20, 20, 20)
growth <- c(0.65, 0.95, 0.15)
dataframe <- data.frame(plant, temperature, growth)
emphasize.italics.cols(3) # Make the 3rd column italics
pander(dataframe) # Create the table
plant | temperature | growth |
---|---|---|
a | 20 | 0.65 |
b | 20 | 0.95 |
c | 20 | 0.15 |
Alternatively, you may also create tables using Markdown syntax.
Using tidy()
from the package broom
, we are able to create tables of
our model outputs, and insert these tables into our markdown file. The
example below shows a simple example linear model, where the summary
output table can be saved as a new R object and then added into the
markdown file.
library(broom)
library(pander)
A <- c(20, 15, 10)
B <- c(1, 2, 3)
lm_test <- lm(A ~ B) # Creating linear model
table_obj <- tidy(lm_test) # Using tidy() to create a new R object called table
pander(table_obj, digits = 3) # Using pander() to view the created table, with 3 sig figs
term | estimate | std.error | statistic | p.value |
---|---|---|---|---|
(Intercept) | 25 | 4.07e-15 | 6.14e+15 | 1.04e-16 |
B | -5 | 1.88e-15 | -2.65e+15 | 2.4e-16 |
R Markdown is using knitr
to generate the .md
file. There are more
than 50 chunk options that can be used to fine-tune the behavior of
knitr when processing R chunks. Please refer to the online
documentation here for the full list
of options.
You can apply the chunk options to individual code chunks; You can also
apply any chunk options globally to a whole document, so you don’t have
to repeat the options in every single code chunk. To set chunk options
globally, call knitr::opts_chunk$set()
in a code chunk (usually the
first one in the document), like the first code chunk in this R Markdown
document.
knitr::opts_chunk$set(
comment = "#>", echo = FALSE, fig.width = 6
)
-
eval
: (TRUE
; logical or numeric) Whether to evaluate the code chunk. It can also be a numeric vector to choose which R expression(s) to evaluate, e.g.,eval = c(1, 3, 4)
will evaluate the first, third, and fourth expressions, andeval = -(4:5)
will evaluate all expressions except the fourth and fifth. -
echo
: (TRUE
; logical or numeric) Whether to display the source code in the output document. Besides TRUE/FALSE, which shows/hides the source code, we can also use a numeric vector to choose which R expression(s) to echo in a chunk, e.g., echo = 2:3 means to echo only the 2nd and 3rd expressions, and echo = -4 means to exclude the 4th expression. -
warning
: (TRUE
; logical) Whether to preserve warnings (produced by warning()) in the output. If FALSE, all warnings will be printed in the console instead of the output document. It can also take numeric values as indices to select a subset of warnings to include in the output. Note that these values reference the indices of the warnings themselves (e.g., 3 means “the third warning thrown from this chunk”) and not the indices of which expressions are allowed to emit warnings. -
include
: (TRUE
; logical) Whether to include the chunk output in the output document. If FALSE, nothing will be written into the output document, but the code is still evaluated and plot files are generated if there are any plots in the chunk, so you can manually insert figures later.
-
R for Health Data Science by Ewen Harrison and Riinu Pius
-
R Markdown Lesson from RStudio
-
Getting started with R Markdown by John
-
R Markdown: The Definite Guide by Yihui Xie, J. J. Allaire, and Garrett Grolemund