7 Writing Reports with R Markdown
R Markdown is a tool for creating documents that combine text, R code, and the results of that R code. It simplifies the process of incorporating graphs and other data outputs into a document, removing the need for separate R and word processing operations. It allows for the automation of data retrieval and updating, making it useful for maintaining up-to-date financial reports, among other applications. With R Markdown, you can produce documents in various formats, including HTML, PDF, and Word, directly from your R code. Markdown facilitates the formatting of text in a plain text syntax, while embedded R code chunks ensure the reproducibility of analysis and reports.
7.1 Creating an R Markdown Document
Follow these steps to create a new R Markdown document in RStudio:
- Click on the top-left plus sign, then select R Markdown...
- In the dialog box that appears, select Document and choose PDF, then click OK.

Figure 7.1: New R Markdown
- You should now see a file populated with text and code. Save this file by clicking File -> Save As... and select an appropriate folder.
- To generate a document from your R Markdown file, click Knit (or use the shortcut Ctrl+Shift+K or Cmd+Shift+K).
- Lastly, the Knit drop-down menu lets you export your file in different formats, such as HTML or Word, in addition to PDF.
The R Markdown template includes:
- A YAML header, enclosed by ---, which holds the document’s metadata, such as the title, author, date, and output format.
- Examples of Markdown syntax, demonstrating how to use it.
- Examples of R code chunks, showing how to write and utilize them in your document.
R code chunks are enclosed by ```{r} at the beginning and ``` at the end.
Anything written within these markers is evaluated as R code. On the other hand, anything outside these markers is considered text, formatted using Markdown syntax, and, for mathematical expressions, LaTeX syntax.
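As a minimal sketch of such a chunk (the vector x and its values are illustrative assumptions, not taken from the RStudio template), the following computes a mean when the document is knitted:

```{r}
# Hypothetical data, for illustration only
x <- c(2, 4, 6)

# The printed result appears below the code in the rendered document
mean(x)
```

Everything between the two markers is run as R; the lines starting with # are ordinary R comments.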
7.2 YAML Header
The YAML header at the top of the R Markdown document, enclosed in ---, specifies high-level metadata and options that influence the whole document. A typical header contains title, author, date, and output fields. The title, author, and date fields define the title, author, and date of the document. The output field specifies the output format of the document (which can be html_document, pdf_document, or word_document, among others).
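As a sketch of what such a header can look like, the following uses the four fields described above. The title, author, and date values are placeholders chosen for illustration, not values from the original template:

```yaml
---
title: "Quarterly Report"   # placeholder title
author: "Jane Doe"          # placeholder author
date: "2024-01-31"          # placeholder date
output: pdf_document        # or html_document, word_document
---
```

Changing the output field is equivalent to picking a different format from the Knit drop-down menu.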
7.3 Markdown Syntax
Markdown is a user-friendly markup language that enables the addition of formatting elements to plain text documents. The following are some fundamental syntax elements:
- Headers: # can be used for headers. For instance, # Header 1 is used for a primary header, ## Header 2 for a secondary header, ### Header 3 for a tertiary header, and so forth.
- Bold: To make text bold, enclose it in **text** or __text__.
- Italic: To italicize text, use *text* or _text_.
- Lists: For ordered lists, use 1., and for unordered lists, use - or *.
- Links: Links can be inserted using [Link text](url).
- Images: To add images, use ![Caption](url) for online images or ![Caption](path) for local images, where path is the folder path to an image saved on your computer.
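As a short sketch, the elements above can be combined in the text portion of a document as follows. The headings, wording, and image path are placeholders chosen for illustration:

```markdown
# Quarterly Report

## Summary

Revenue was **higher** than expected, while costs were *roughly* flat.

1. First finding
2. Second finding

- An unordered point
- Another point

[R Markdown website](https://rmarkdown.rstudio.com)

![Company logo](images/logo.png)
```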
7.4 R Chunks
In R Markdown, you can embed chunks of R code. These chunks begin with ```{r} and end with ```. The code contained in these chunks is executed when the document is rendered, and the output (e.g., plots, tables) is inserted into the final document.
Following the r in the chunk declaration, you can include a variety of options in a comma-separated list to control chunk behavior. For instance, ```{r, echo = FALSE} runs the code in the chunk and includes its output in the document, but the code itself is not printed in the rendered document.
Here are some of the most commonly used chunk options:
- echo: If set to FALSE, the code chunk will not be shown in the final output. The default is TRUE.
- eval: If set to FALSE, the code chunk will not be executed. The default is TRUE.
- include: If set to FALSE, neither the code nor its results are included in the final document, although the code is still run. The default is TRUE.
- message: If set to FALSE, suppresses all messages in the output. The default is TRUE.
- warning: If set to FALSE, suppresses all warnings in the output. The default is TRUE.
- fig.cap: Adds a caption to graphical results. For instance, fig.cap="My Plot Caption".
- fig.align: Aligns the plot in the document. For example, fig.align='center' aligns the plot to the center.
- out.width: Controls the width of the plot output. For example, out.width="50%" will make the plot take up 50% of the text width.
- collapse: If TRUE, all the code and results in the chunk are rendered as a single block. If FALSE, each line of code and its results are rendered separately. The default is FALSE.
- results: Controls how the text output of a chunk is displayed in the final document. results='markup', the default, wraps the text output in a verbatim block so that it is displayed as plain text, which is useful when the output includes special characters. results='asis' writes the output into the document unmodified, which is ideal when the R output is itself written in Markdown syntax, such as formatted text or tables. results='hide' conceals the text output, while results='hold' displays all output after the chunk's code instead of after each expression.
- fig.path: Specifies the path prefix, typically a directory, under which the figures produced by the chunk are saved.
- fig.width and fig.height: Specify the width and height of the plot, in inches. For example, fig.width=6, fig.height=4 will make the plot 6x4 inches.
- dpi: Specifies the resolution of the plot in dots per inch. For example, dpi = 300 will generate a high-resolution image.
- error: If TRUE, errors are displayed in the output and do not stop the knitting process. If FALSE, any error that occurs in the chunk stops the knitting process. R Markdown stops on errors by default.
Here’s an example:
```{r, echo=FALSE, fig.cap="Title", out.width = "50%", fig.align='center', dpi = 300}
plot(cars)
```
This chunk will create a plot, add a caption to it, set the width of the plot to 50% of the text width, align the plot to the center of the document, and output the plot with a resolution of 300 DPI. The actual R code will not be displayed in the final document.
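Options that control text output combine in the same way. As a minimal sketch (the wording of the printed label is an illustrative assumption), a chunk with results='asis' can emit Markdown that is then formatted like the surrounding text:

```{r, echo=FALSE, results='asis'}
# cat() writes raw Markdown; results='asis' passes it through unchanged,
# so the label is rendered in bold in the final document
cat("**Rows in cars:**", nrow(cars), "\n")
```

The cars data set ships with R, so the chunk runs without any additional setup.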
Instead of specifying options for each code chunk, you can modify the default settings for all code chunks in your document using the knitr::opts_chunk$set() function. For instance, I often include the following code at the start of an R Markdown document, right after the YAML header:
```{r}
knitr::opts_chunk$set(echo = FALSE, message = FALSE, warning = FALSE,
fig.align = "center", out.width = "60%")
```
This code modifies the default settings for all chunks in the document, as described below:
- echo = FALSE: Each chunk’s code will be omitted from the final document, a sensible practice for official documents, as recipients don’t need to see the code used to create the graphs.
- message = FALSE: All messages generated by code chunks will be muted.
- warning = FALSE: Warnings produced by code chunks will be silenced.
- fig.align = "center": All generated figures will be centrally aligned.
- out.width = "60%": The width of any generated figures will be set to 60% of the text width.
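To check which defaults are in effect, the settings can be read back with knitr::opts_chunk$get(). The following sketch repeats the options from the setup chunk above and then inspects two of them:

```r
library(knitr)

# Set document-wide defaults (same options as in the setup chunk above)
opts_chunk$set(echo = FALSE, message = FALSE, warning = FALSE,
               fig.align = "center", out.width = "60%")

# Inspect the defaults that later chunks will inherit
opts_chunk$get("echo")       # FALSE
opts_chunk$get("out.width")  # "60%"
```

An individual chunk can still override a default by naming the option in its own header, for example ```{r, echo = TRUE}.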
7.5 Embedding R Variables into Text
A key strength of R Markdown is the ability to incorporate R variables directly within the Markdown text. This enables dynamic text whose values update as the variables change. You can accomplish this by using the `r variable` syntax. Furthermore, you can format these numbers for enhanced readability.
To insert the value of an R variable into your text, you encase the variable name in backticks and prepend it with r. Here’s an illustration:
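The chunk that defines the variable is not shown here. As a sketch, assuming a value consistent with the outputs quoted later in this section, it could be defined inside an R chunk as:

```r
# R variable, defined inside an R chunk
# (assumed value, chosen to match the outputs quoted in this section)
my_var <- 123234.54
```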
To refer to this variable in your Markdown text, you can write the following text (outside of an R chunk):
The total amount is `r my_var` USD.
The output will be: “The total amount is \(1.2323454 \times 10^{5}\) USD.”
That’s because when the R Markdown document is knitted, `r my_var` will be replaced by the current value of my_var in your R environment, dynamically embedding the value of my_var into your text.
Additionally, you can format numbers for better readability by avoiding scientific notation, rounding, and adding a comma as a thousands separator. To do this, you can use the formatC() function in R as follows:
# R variable with formatting, defined inside R chunk
my_var_formatted <- formatC(my_var, format = "f", digits = 2, big.mark = ",")
Then, in your text:
The total amount is `r my_var_formatted` USD.
The output will be: “The total amount is 123,234.54 USD.”
In this case, format = "f" ensures fixed decimal notation, digits = 2 makes sure there are always two decimal places, and big.mark = "," adds a comma as the thousands separator.
By properly formatting your numbers in your R Markdown documents, you enhance their clarity and make your work more professional and easier to read.
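When many numbers need the same treatment, the formatC() call can be wrapped in a small helper. The function name fmt_usd below is hypothetical, introduced only for this sketch:

```r
# Hypothetical helper: two decimal places and comma thousands separators
fmt_usd <- function(x) {
  formatC(x, format = "f", digits = 2, big.mark = ",")
}

fmt_usd(123234.54)  # "123,234.54"
fmt_usd(1500)       # "1,500.00"
```

In the text, the helper can then be called inline, for example `r fmt_usd(my_var)`, since any R expression is allowed after the r.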
7.6 LaTeX Syntax for Math
LaTeX is a high-quality typesetting system that is widely used for scientific and academic papers, particularly in mathematics and engineering. LaTeX provides a robust way to typeset mathematical symbols and equations. Thankfully, R Markdown supports LaTeX notation for mathematical formulas, which is rendered in both the HTML and PDF output.
In R Markdown, you can include mathematical notation within the text by wrapping it with dollar signs ($). For example, $a^2 + b^2 = c^2$ will be rendered as \(a^2 + b^2 = c^2\).
Here are some basic LaTeX commands for mathematical symbols:
- Subscripts: To create a subscript, use the underscore (_). For example, $a_i$ is rendered as \(a_i\).
- Superscripts: To create a superscript (useful for exponents), use the caret (^). For example, $e^x$ is rendered as \(e^x\).
- Greek letters: Use a backslash (\) followed by the name of the letter. For example, $\alpha$ is rendered as \(\alpha\), $\beta$ as \(\beta\), and so on.
- Sums and integrals: Use \sum for summation and \int for integration. For example, $\sum_{i=1}^n i^2$ is rendered as \(\sum_{i=1}^n i^2\) and $\int_a^b f(x) dx$ is rendered as \(\int_a^b f(x) dx\).
- Fractions: Use \frac{numerator}{denominator} to create a fraction. For example, $\frac{a}{b}$ is rendered as \(\frac{a}{b}\).
- Square roots: Use \sqrt for square roots. For example, $\sqrt{a}$ is rendered as \(\sqrt{a}\).
If you want to display an equation on its own line, you can use double dollar signs ($$). For example:
$$
\% \Delta Y_t
\equiv 100 \left( \frac{Y_t - Y_{t-1}}{Y_{t-1}}\right) \%
\approx 100 \left( \ln Y_t - \ln Y_{t-1} \right) \%
$$
This will be rendered as: \[ \% \Delta Y_t \equiv 100 \left(\frac{Y_t - Y_{t-1}}{Y_{t-1}}\right) \% \approx 100 \left( \ln Y_t - \ln Y_{t-1} \right) \% \tag{7.1} \]
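The approximation in this equation can also be checked numerically in an R chunk. The following sketch uses made-up values (Y rising from 100 to 102) to compare the exact percentage change with the log-difference approximation:

```r
# Hypothetical values: Y rises from 100 to 102
y_prev <- 100
y_now  <- 102

# Exact percentage change
exact_pct <- 100 * (y_now - y_prev) / y_prev

# Log-difference approximation
approx_pct <- 100 * (log(y_now) - log(y_prev))

exact_pct   # 2
approx_pct  # roughly 1.98
```

For small changes the two numbers are close, and the gap widens as the change grows.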
LaTeX and R Markdown together make it easy to include mathematical notation in your reports. With practice, you can write complex mathematical expressions and equations using LaTeX in your R Markdown documents.
7.7 Printing Tables
The kable function from the knitr package and the kableExtra package are great tools for creating professionally formatted tables in your R Markdown documents. Printing data without any formatting is usually not advisable, as it looks unprofessional and can be hard to read and interpret. By contrast, these tools allow you to control the appearance of your tables, leading to better readability and aesthetics.
You’ll first need to install and load the necessary packages. You can do so by executing install.packages(c("knitr", "kableExtra")) in your console and then loading the two packages at the beginning of your code with library(knitr) and library(kableExtra).
Let’s assume we have a simple data frame df that we want to print:
df <- data.frame(
  Name = c("Alice", "Bob", "Charlie"),
  Age = c(24, 30, 18),
  Gender = c("Female", "Male", "Male")
)
df
##      Name Age Gender
## 1   Alice  24 Female
## 2     Bob  30   Male
## 3 Charlie  18   Male
You can create a basic table using the kable function from the knitr package:
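The call itself is not shown here. A minimal sketch, assuming the data frame df defined earlier, would be:

```r
library(knitr)

# Same sample data frame as defined earlier in this section
df <- data.frame(
  Name = c("Alice", "Bob", "Charlie"),
  Age = c(24, 30, 18),
  Gender = c("Female", "Male", "Male")
)

# Render the data frame as a formatted table
kable(df)
```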
| Name | Age | Gender |
|---|---|---|
| Alice | 24 | Female |
| Bob | 30 | Male |
| Charlie | 18 | Male |
This will generate a simple, well-formatted table. However, you can further customize the table’s appearance using functions from the kableExtra package:
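The styling code is not shown here. One way to obtain a striped table, sketched with kableExtra's kable_styling() function, is the following. Setting format = "html" explicitly is an assumption of this sketch so that it also runs outside a knitted document:

```r
library(knitr)
library(kableExtra)

# Same sample data frame as defined earlier in this section
df <- data.frame(
  Name = c("Alice", "Bob", "Charlie"),
  Age = c(24, 30, 18),
  Gender = c("Female", "Male", "Male")
)

# Build the basic HTML table first
tbl <- kable(df, format = "html")

# Striped rows; full_width = FALSE keeps the table only as wide as needed
styled <- kable_styling(tbl, bootstrap_options = "striped", full_width = FALSE)
styled
```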
| Name | Age | Gender |
|---|---|---|
| Alice | 24 | Female |
| Bob | 30 | Male |
| Charlie | 18 | Male |
This code generates a striped table, which alternates row colors for easier reading. The full_width = FALSE argument ensures the table only takes up as much width as necessary.
Adding a caption to your table is straightforward. Simply provide the caption argument to the kable function:
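As a sketch, the caption can be combined with the striped styling as follows. As before, format = "html" is an assumption made so that the snippet runs on its own:

```r
library(knitr)
library(kableExtra)

# Same sample data frame as defined earlier in this section
df <- data.frame(
  Name = c("Alice", "Bob", "Charlie"),
  Age = c(24, 30, 18),
  Gender = c("Female", "Male", "Male")
)

# The caption argument belongs to kable(), not to kable_styling()
captioned <- kable(df, format = "html", caption = "A table of sample data.")

# Apply the same striped styling as before
styled_captioned <- kable_styling(captioned, bootstrap_options = "striped",
                                  full_width = FALSE)
styled_captioned
```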
| Name | Age | Gender |
|---|---|---|
| Alice | 24 | Female |
| Bob | 30 | Male |
| Charlie | 18 | Male |
This code generates the same striped table, but now with a caption: “A table of sample data.”
These are just the basics. Both knitr's kable function and the kableExtra package provide numerous options for customizing your tables. I encourage you to explore their documentation and experiment with different settings.
7.8 Summary and Resources
R Markdown provides a powerful framework for dynamically generating reports in R. The “dynamic” part of “dynamically generating reports” means that the document is able to update automatically when your data changes. By understanding and effectively using Markdown syntax, R code chunks, chunk options, and YAML headers, you can create sophisticated, reproducible documents with ease, like the document you are currently reading.
For an in-depth understanding of R Markdown, you may want to delve into R Markdown: The Definitive Guide, an extensive resource on the topic. Additionally, DataCamp’s course Reporting with R Markdown provides practical lessons on how to create compelling reports using this tool.