---
title: "Session Nine"
author: "Akos Mate"
subtitle: "Creating reports with RMarkdown"
date: '2018 July'
output:
  html_document:
    toc: true
    toc_depth: 3
    theme: readable
    css: style.css
bibliography: mybib.bib
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE,
                      comment = NA,
                      collapse = TRUE,
                      warning = FALSE)
```
> Main packages used: `knitr`, `stargazer`, `rmarkdown`
> Main functions covered: `stargazer::stargazer()`, `knitr::kable()`, `knitr::opts_chunk$set()`
> Supplementary resources:
> - [R Markdown: The Definitive Guide](https://bookdown.org/yihui/rmarkdown/),
> - [R markdown cheat sheet](https://www.rstudio.com/wp-content/uploads/2015/02/rmarkdown-cheatsheet.pdf),
> - [RStudio - Bibliographies and Citations](https://rmarkdown.rstudio.com/authoring_bibliographies_and_citations.html),
> - [RStudio RMarkdown webinar](https://rmarkdown.rstudio.com/lesson-1.html),
> - [Create Awesome HTML Table with knitr::kable and kableExtra](https://cran.r-project.org/web/packages/kableExtra/vignettes/awesome_table_in_html.html)
In this session we focus on getting the work we have done so far out of R. We will cover how to create html (such as this document), pdf (like a LaTeX document) or Word output from R, and how to export individual results, such as regression tables. Communicating research is a fundamental part of the (academic) research process, so this session looks at the ways we can extract information from R.
# 1. RMarkdown
Markdown is a simple, easy to read and easy to write language that was created initially as a text-to-HTML tool. The Markdown syntax is straightforward and easy to memorize. Let's take a look at the basics.
## 1.1 Basic formatting
Adding headers
```{r markdown_syntax}
# Header 1
## Header 2
### Header 3
#### Header 4
##### Header 5
###### Header 6
```
To add bold, italics or their combination:
```
*italics* or _italics_
**bold** or __bold__
```
*italics* or _italics_
**bold** or __bold__
Add a line break by ending a line with two spaces and then pressing Enter. A single Enter on its own is ignored in the rendered output: the text simply continues on the same line.
Create lists easily:
```
unordered list

- item1
- item2
    + subitem1
    + subitem2

or

* item1
* item2
    + subitem1
    + subitem2

ordered lists

1. first item
2. second item
    2.1 subitem
    2.2 subitem
```
unordered list

- item1
- item2
    + subitem1
    + subitem2

or

* item1
* item2
    + subitem1
    + subitem2

ordered lists

1. first item
2. second item
    + sub item
Add images:
`![](https://upload.wikimedia.org/wikipedia/en/thumb/b/b9/MagrittePipe.jpg/300px-MagrittePipe.jpg)`
![](https://upload.wikimedia.org/wikipedia/en/thumb/b/b9/MagrittePipe.jpg/300px-MagrittePipe.jpg)
## 1.2 setting up R Markdown[^1]
[^1]: Examples for this section are adapted from the [R for Data Science, ch.27](http://r4ds.had.co.nz/r-markdown.html)
You can create a new R Markdown document from the `File > New File > R Markdown...` route. The new document you'll have is essentially a plain text file, with an `.Rmd` extension. R Markdown allows us to interweave text, code and results in one document.
The main elements of our document are:
1. The YAML header at the beginning of the doc, between the `---` lines.
2. Code chunks, marked by ```` ``` ````
3. Text with markdown formatting
When you knit, your `.Rmd` file is sent to the `knitr` package, which executes all your code chunks and writes a new Markdown file containing the code and its output. `pandoc` then renders that file into your desired format.
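The first half of this pipeline can be tried without RStudio. The following is a minimal sketch, assuming only that `knitr` is installed; the tiny report in `rmd_source` is made up for illustration. It stops before the `pandoc` step, so what you get back is plain Markdown rather than html or pdf.

```r
library(knitr)

# Build a three-backtick fence programmatically so this example can itself
# sit inside a fenced code block.
fence <- strrep("`", 3)

# A tiny, made-up R Markdown source: one header and one code chunk.
rmd_source <- c(
  "# A tiny report",
  "",
  paste0(fence, "{r}"),
  "1 + 1",
  fence
)

# knit() runs the chunk and returns plain Markdown with the result woven in.
md_output <- knit(text = rmd_source, quiet = TRUE)
cat(md_output)
```

The Knit button does both steps for you by calling `rmarkdown::render()`, which additionally needs `pandoc`.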
```{r create_rmd, out.width = "400px", echo=FALSE}
knitr::include_graphics("rmd_new.PNG")
```
A short document looks like this. We will go over each element and then use our previous sessions to write a short mock report.
```{r example1, echo=FALSE}
cat(htmltools::includeText("ex1.rmd"))
```
> **Quick exercise:** create a new R Markdown document and see what the output of the above code is. You can run the R Markdown document by "knitting" it with the Knit button.
### 1.2.1 chunk names
Code chunks are the backbone of your document: they contain the R code that you would otherwise write in your script. You can also embed code inline, with ```` ` ` ````, which will look like this: `dim(df)`. Each chunk can have different options, which you specify at the top of the chunk like this: ```` ```{r, options here} ````
You can also name your chunks by adding the name at the top: ```` ```{r chunk_name}````. This is a useful practice: if a chunk contains an error, the error message tells you which chunk to look at. The bottom of the script window also lets you navigate between chunks by using their names.
```{r, out.width = "400px", echo=FALSE}
knitr::include_graphics("chunk_name.PNG")
```
### 1.2.2 chunk options
You can specify options for each chunk (or set up a global default) which control how `knitr` runs the code inside. The most useful options:
- `eval = TRUE/FALSE` When `FALSE`, the code inside the chunk is not evaluated, so only the code is displayed and no output. This is useful if you just want to show your code without the results.
- `echo = TRUE/FALSE` When `TRUE`, it shows both your code and the output below it. When `FALSE`, the code is hidden and only the output is shown.
- `warning` and `error` When `TRUE`, warning and error messages are displayed alongside your results. Setting `warning = FALSE` is useful if you have long warnings and do not want them to clutter the results.
- `message` same as the previous, but for messages (e.g. what you see after loading packages).
You can set global options for your document with the following line in a code chunk: `knitr::opts_chunk$set()`. For example, the defaults that I used for these outputs are the following:
```{r, include=TRUE, eval=FALSE}
knitr::opts_chunk$set(echo = TRUE,
comment = NA,
collapse = TRUE,
warning = FALSE)
```
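If you want to check what the global defaults currently are, `knitr::opts_chunk$get()` reads them back. The following is a small sketch, assuming only that `knitr` is installed; the option values are arbitrary examples rather than recommendations.

```r
library(knitr)

# Remember the current defaults so they can be put back afterwards.
old_defaults <- opts_chunk$get(c("echo", "warning", "fig.width"))

# Change the global defaults for every chunk that follows.
opts_chunk$set(echo = FALSE, warning = FALSE, fig.width = 5)

# Look up single options to confirm the change took effect.
new_echo <- opts_chunk$get("echo")
new_fig_width <- opts_chunk$get("fig.width")
print(new_echo)
print(new_fig_width)

# Restore the earlier defaults (set() also accepts a named list).
opts_chunk$set(old_defaults)
```

Options given in an individual chunk header still override these global defaults for that chunk.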
> Quick exercise: Try out different chunk options with the following code:
```{r, eval=FALSE}
library(ggplot2)
library(dplyr)

# Keep diamonds of at most 1 carat and plot price against carat
diamonds_small <- diamonds %>%
  filter(carat <= 1) %>%
  ggplot(aes(carat, price)) +
  geom_point()

# Print the plot
diamonds_small
```
> You can slice the code into separate chunks if you want. Don't be afraid to experiment, try out some combinations, get a feel for manipulating chunks. Try setting up a global option for figure sizes, using the `fig.width` and `fig.height` options. The default value is `7` for them.
### 1.2.3 YAML header
In the YAML header, you can specify the attributes of your document (similarly to the LaTeX preamble).
The YAML header for this document looks like this:
````
---
title: "Session Nine"
author: "Akos Mate"
subtitle: "Creating reports with RMarkdown"
date: '2018 July'
output:
  html_document:
    toc: true
    toc_depth: 3
bibliography: bibliography.bib
---
````
Most of the fields are self-explanatory (such as title, author, etc.), but there are some options under `output` that are worth exploring. Mind the indentation of the header elements, because it matters: nested options such as `toc` must be indented under the output format they belong to.
The `output:` in this case is an `html_document`, with the table of contents enabled (`toc: true`) and displaying 3 levels (`toc_depth: 3`). The table of contents automatically pulls in your markdown headers (`#`, `##`, etc.). You can switch between outputs in two ways:
* use the `output: pdf_document` in the YAML header
* use the knit drop down menu to choose your output
```{r, out.width = "250px", echo=FALSE}
knitr::include_graphics("knit_pdf.PNG")
```
Possible output options are:
* `pdf_document` creates a pdf doc, using LaTeX. If you always wanted to try LaTeX but found it too complicated, this is an easy way to create professional looking papers, without going into the LaTeX nitty gritties (eventually you'll have to I'm afraid). You need to install LaTeX for this feature.
* `word_document` creates Microsoft Word docs with `.docx` extension
* `odt_document` creates OpenDocument Texts with `.odt` extension
* `rtf_document` creates Rich Text Format with `.rtf` extension
The `bibliography` field is one of the key arguments if you are writing academic papers. For this to work, you'll need a BibTeX file (with a `.bib` extension), which is essentially a plain text file containing your references. If you use a citation manager (you should!), it will have an option to export your citations into a `.bib` file. A BibTeX-formatted citation looks like this:
```
@article{albrecht1999time,
  title={Time varying speed of light as a solution to cosmological puzzles},
  author={Albrecht, Andreas and Magueijo, Joao},
  journal={Physical Review D},
  volume={59},
  number={4},
  pages={043516},
  year={1999},
  publisher={APS}
}
```
You can get this type of citation from Google Scholar as well. After you have prepared your `.bib` file, you just need to specify it in the YAML header like this: `bibliography: mybib.bib`.
* To insert a citation into the paper, use the following syntax: `[@bibkey]`, where the bib key is the identifier in `@article{bibkey, ...}`. In our case, it is "albrecht1999time".
* To cite this seminal contribution to science, we type: `[@albrecht1999time]` which will give us this: [@albrecht1999time].
* For an in-text citation, just use `@albrecht1999time`: @albrecht1999time
* Suppress the author by adding a `-`: `[-@albrecht1999time]`: Albrecht and Magueijo [-@albrecht1999time] demonstrated that, because of physics!
You can add the bibliography at the end of your paper with the `# Bibliography` header. With this we are mostly set to write great papers without exiting from our R workflow.
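A citation manager is not the only source of BibTeX entries. As a hedged sketch, assuming `knitr` is installed, `knitr::write_bib()` can generate entries for the R packages you used. The file name `packages.bib` is a made-up example, and the file is written to a temporary directory here only to keep the sketch self-contained.

```r
library(knitr)

# Hypothetical file name; in a real project you would write next to your .Rmd.
bib_file <- file.path(tempdir(), "packages.bib")

# Write BibTeX entries for base R and knitr into the file.
write_bib(c("base", "knitr"), file = bib_file)

# Inspect the entry headers to see which citation keys were generated.
bib_lines <- readLines(bib_file)
bib_keys <- grep("^@", bib_lines, value = TRUE)
print(bib_keys)
```

The generated package keys are prefixed with `R-` by default, so base R would be cited as `[@R-base]` once the file is listed under `bibliography` in the YAML header.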
## 1.3 Tables and other output
We will see how to get our R output into html, LaTeX, and Word.
### 1.3.1 html
```{r, message=FALSE}
library(dplyr)
library(knitr)
library(kableExtra)
library(survey)
library(broom)
library(stargazer)
```
If you need to create html versions of your research (for your blog, for example), it is best to use the `knitr::kable()` function, which generates nice tables. If you wish, you can add further nice little extras with the `kableExtra` package. Let's add stripes to our table and highlight the row under the mouse pointer with the `kable_styling()` function. Called with no arguments, it applies its default styling to the output of `kable()`.
```{r}
df <- mtcars[1:5, 1:6]
```
```{r}
df %>%
  kable() %>%
  kable_styling()
```
The more fancy version:
```{r}
kable(df) %>%
  kable_styling(bootstrap_options = c("striped", "hover"))
```
You can export your regression tables in a similar fashion after tidying them up with the `broom` package.
```{r}
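data("airquality")

# Fit the model and turn the coefficient table into a tidy data frame
m1 <- tidy(lm(Ozone ~ Temp + Solar.R, data = airquality))

# Rename the columns and round every numeric column to two digits
# (funs() is deprecated in current dplyr, so round() is passed directly)
reg1_table <- m1 %>%
  select(IV = term, Est. = estimate, sd = std.error, `p value` = p.value) %>%
  mutate_if(is.numeric, round, digits = 2)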
```
```{r}
reg1_table %>%
  kable() %>%
  kable_styling()
```
<br>
<br>
Alternatively, use the `stargazer` package, which produces more journal-like output.
```{r}
m2 <- lm(Ozone~Temp+Solar.R, data = airquality)
```
```{r, results = "asis"}
stargazer(m2, title = "Regression result", dep.var.labels = "Ozone levels", type = "html")
```
<br>
<br>
To have the output render, you need to set your chunk options to the following: ```` ```{r, results = "asis"} ````. Otherwise you'll just get the raw html code, which you can paste into any html file to have it rendered:
```{r}
stargazer(m2, title = "Regression result", dep.var.labels = "Ozone levels", type = "html")
```
### 1.3.2 Word
There are a number of ways to export to Word. You can simply use `kable()` and knit to Word, which gives you a table in Word that you can format any way you like. Another option is to export the R table as a csv file and import it into Word: open the csv file in Word, select the imported text, then choose `Insert > Table > Convert Text to Table`, and Word should automatically recognize the structure.
```{r}
#write.table(reg1_table, file = "reg1_table.csv", sep = ",")
```
Otherwise, you can open the .csv in Excel and then copy the Excel table into Word.
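The csv route can be sketched with base R alone. This example is an illustration rather than the exact table from the html section: it rebuilds a coefficient table from `summary()` so that it runs on its own, and the names `coef_table` and `reg_table_example.csv` are made up.

```r
# Fit the same model as in the html section (airquality ships with base R).
fit <- lm(Ozone ~ Temp + Solar.R, data = airquality)

# Turn the coefficient matrix into a plain data frame, rounded to 2 digits.
coef_table <- as.data.frame(round(coef(summary(fit)), 2))
coef_table <- cbind(term = rownames(coef_table), coef_table)
rownames(coef_table) <- NULL

# Write to a temporary file here; a real project would use a path of its own.
csv_path <- file.path(tempdir(), "reg_table_example.csv")
write.csv(coef_table, file = csv_path, row.names = FALSE)

# Read the file back to check what Word or Excel would see.
print(read.csv(csv_path))
```

Setting `row.names = FALSE` keeps R's row numbers out of the file, which avoids an unnamed extra column when the csv is opened elsewhere.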
### 1.3.3 LaTeX
If you write in LaTeX, or knit the R Markdown doc into pdf, you need LaTeX table output. Fortunately, the `stargazer` package is rather flexible in that regard: you only have to specify `type = "latex"` to get the LaTeX output.
```{r}
stargazer(m2, title = "Regression result", dep.var.labels = "Ozone levels", type = "latex")
```
I highly suggest that you give the pdf output a go, or give LaTeX a try, as it produces beautiful, highly customizable and professional output. The output of the above code looks like this:
```{r, out.width = "300px", echo=FALSE}
knitr::include_graphics("regtable_latex.PNG")
```
An alternative package for both the html and LaTeX output is `xtable`, which has similar functionality to `stargazer`.
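The `kable()` function used for the html tables can also emit LaTeX. The following is a minimal sketch, assuming only that `knitr` is installed; the caption text is an arbitrary example.

```r
library(knitr)

# Same small slice of mtcars that the html examples use.
df <- mtcars[1:5, 1:6]

# format = "latex" makes kable() return LaTeX markup instead of html.
latex_table <- kable(df, format = "latex", caption = "First rows of mtcars")
print(latex_table)
```

When knitting to pdf, `kable()` normally picks the LaTeX format on its own, so the explicit `format` argument is mainly useful when you want to paste the markup into a separate `.tex` file.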
# Bibliography