Slide break not recognised when rendered with the Knit button #332

Open
3 tasks done
charliejhadley opened this issue Sep 6, 2021 · 8 comments
Labels
bug remark.js RStudio IDE related directly to RStudio IDE

Comments


charliejhadley commented Sep 6, 2021

This is a weird bug: it occurs only in {xaringan} slides when RStudio's "Knit" button is used. Please follow these steps for a reprex:

  • Create the following .Rmd file and save it into an RStudio project.
---
title: "Works in moonreader, but not knitted"
output:
  xaringan::moon_reader:
    df_print: paged
---

## Slide with paged df

```{r}
maps::world.cities
```

---

## Slide with code

Eg `foobar` 
  • When knitted, the output HTML document has a different number of slides depending on how it was knitted.

| How knitted | What happens |
| --- | --- |
| Using RStudio's "Knit" button (tested with latest version 1.4.1717) | The output document has 2 slides, the slide break is ignored |
| Using `xaringan:::inf_mr()` | The output document has 3 slides (which it should have) |
| Using `rmarkdown::render("the-file.Rmd")` | The output document has 3 slides (which it should have) |

It appears the following circumstances lead to this bug:

  • The YAML option df_print is set to paged
  • A slide contains a code chunk that prints a data.frame
  • Any slide afterwards contains either inline code or a code chunk.

The bug does not occur if the slides after the one with a printed data.frame contain no code, e.g. this .Rmd file is unaffected:

---
title: "Works regardless"
output:
  xaringan::moon_reader:
    df_print: paged
---

## Slide with paged df

```{r}
maps::world.cities
```

---

## Slide without code

Eg the quick brown fox....

While there is already a fix (using rmarkdown::render()), I would argue that most R Markdown users either click the Knit button or use the Cmd+Shift+K keyboard shortcut, which does the same thing.

I also acknowledge that this bug might be better filed against the RStudio IDE instead of {xaringan}; I'm happy to re-report it elsewhere if you would prefer.

> xfun::session_info('xaringan')
R version 4.0.3 (2020-10-10)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS  11.5.2, RStudio 1.4.1717

Locale: en_GB.UTF-8 / en_GB.UTF-8 / en_GB.UTF-8 / C / en_GB.UTF-8 / en_GB.UTF-8

Package version:
  base64enc_0.1.3  digest_0.6.27    evaluate_0.14    fastmap_1.1.0    glue_1.4.2       graphics_4.0.3  
  grDevices_4.0.3  highr_0.9        htmltools_0.5.2  httpuv_1.6.2     jsonlite_1.7.2   knitr_1.33      
  later_1.3.0      magrittr_2.0.1   markdown_1.1     methods_4.0.3    mime_0.11        promises_1.2.0.1
  R6_2.5.1         Rcpp_1.0.7       rlang_0.4.11     rmarkdown_2.10   servr_0.23       stats_4.0.3     
  stringi_1.7.4    stringr_1.4.0    tinytex_0.33     tools_4.0.3      utils_4.0.3      xaringan_0.22   
  xfun_0.25        yaml_2.2.1    

By filing an issue to this repo, I promise that

  • I have fully read the issue guide at https://yihui.org/issue/.
  • I have provided the necessary information about my issue.
    • If I'm asking a question, I have already asked it on Stack Overflow or RStudio Community, waited for at least 24 hours, and included a link to my question there.
    • If I'm filing a bug report, I have included a minimal, self-contained, and reproducible example, and have also included xfun::session_info('xaringan'). I have upgraded all my packages to their latest versions (e.g., R, RStudio, and R packages), and also tried the development version: remotes::install_github('yihui/xaringan').
    • If I have posted the same issue elsewhere, I have also mentioned it in this issue.
  • I have learned the Github Markdown syntax, and formatted my issue correctly.

I understand that my issue may be closed if I don't fulfill my promises.

yihui (Owner) commented Sep 7, 2021

That sounds like a really weird bug. @cderv Could you take a look? Thanks!

cderv (Collaborator) commented Sep 8, 2021

Wow, this is indeed a weird bug!

  • I can reproduce this with the latest daily build, 2021.11.0-daily+26, so this is not great.
  • I confirm that the cause seems to be df_print: paged when a following slide contains a code chunk.

This only happens in the RStudio IDE, so I believe it has something to do with what the IDE is doing with paged tables. I believe some hooks are used inside the IDE to replace R functions directly. This could lead to differences in behavior between knitting in the IDE and knitting at the command line.

Thanks for the great report @charliejhadley !! It helps.

We'll need to investigate more now. And yes, I believe this is an issue with the RStudio IDE, not xaringan.

cderv added bug RStudio IDE related directly to RStudio IDE labels Sep 8, 2021

cderv (Collaborator) commented Sep 8, 2021

To investigate further, I compared the different outputs and found a difference.

How does rendering work with paged tables?

A small reminder of how this works:

  • Using a paged table outputs raw HTML into the Markdown.
  • With xaringan, that raw HTML is not then processed by Pandoc; instead it is included in the Markdown <textarea> of the HTML file and parsed by the remark.js parser to be rendered to HTML.

Differences between IDE rendering and Console rendering

Regarding the differences between the IDE-rendered and console-rendered output:

  • When knitting in the IDE, the HTML table included is longer than the one included when rendering in the console. The former has 10000 rows included, and the latter has 1000 rows included.
  • You can see that at the bottom of the generated paged table:
    • When knitting in the IDE: [screenshot not preserved]
    • When rendering in the console: [screenshot not preserved]
  • It seems this could lead to an issue with the remark.js Markdown parser, which "breaks" when the inserted HTML table is too long.

I can confirm that slicing the data.frame before printing will solve this.

---
title: "Works regardless"
output:
  xaringan::moon_reader:
    df_print: paged
---

## Slide with paged df

```{r}
utils::head(maps::world.cities, 1000)
```

---

## Slide with code

Eg `foobar` 

One thing to note here is that world.cities is a big data.frame:

nrow(maps::world.cities)
#> [1] 43645

Usually, it is not a good idea to print the whole data.frame in an HTML document, as it could cause performance issues. That is why, by default, rmarkdown::paged_table_html() prints at most the number of rows defined by the option max.print (documented in https://bookdown.org/yihui/rmarkdown/html-document.html#paged-printing), which defaults to 1000 if unset. (https://github.com/rstudio/rmarkdown/blob/0af6b3556adf6e393b2da23c66c695724ea7bd2d/R/html_paged.R#L101)
There is also a hard maximum of 10000 rows, to prevent Pandoc from failing I believe: https://github.com/rstudio/rmarkdown/blob/0af6b3556adf6e393b2da23c66c695724ea7bd2d/R/html_paged.R#L108
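
To make the two limits concrete, here is a minimal sketch of the row-limit logic as described above. It is not the rmarkdown implementation: `rows_to_print()` and its argument names are hypothetical, and only the 1000 default and the 10000 hard maximum come from the discussion.

```r
# Hypothetical helper: NOT the rmarkdown code, only a sketch of the limits
# described above (default of 1000 rows, hard maximum of 10000 rows).
rows_to_print <- function(n_rows, max_print = NULL,
                          default_max = 1000L, hard_cap = 10000L) {
  # Fall back to the default when no max.print value is supplied
  limit <- if (is.null(max_print)) default_max else max_print
  # Never exceed the hard cap, and never print more rows than exist
  min(n_rows, limit, hard_cap)
}

rows_to_print(43645)                     # max.print unset
rows_to_print(43645, max_print = 99999)  # R's documented default for max.print
rows_to_print(43645, max_print = 1500)   # an explicit, smaller value
```

Under these assumptions the three calls give 1000, 10000 and 1500 rows, which matches the 1000 versus 10000 difference seen between the two renderings.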

Obviously, setting max.print directly as a chunk option also solves the issue, as this works:

---
title: "Works in moonreader, but not knitted"
output:
  xaringan::moon_reader:
    df_print: paged
---

## Slide with paged df

```{r, max.print = 1500}
maps::world.cities
```

---

## Slide with code

Eg `foobar` 

Regarding max.print value

I believe the IDE is also setting/checking this option:
https://github.com/rstudio/rstudio/blob/26a10a6b1151474b5b447492a5716e0e3ff06db2/src/cpp/r/R/Options.R#L116-L124

The default for max.print is 99999 (see ?options). So, by default, it would trigger the 10000-row hard maximum.
But the behavior also differs with respect to this option...

Test file

---
title: "Works regardless"
output: html_document
---

```{r}
getOption('max.print')
```
  • Rendering using the Knit button in the IDE with html_document, we get a value of 99999.
  • Running the code chunk in the IDE, we get a value of 1000, which suggests that the IDE sets the option.
  • This is confirmed by running getOption("max.print") in the R console in the IDE.
  • Rendering using rmarkdown::render(), we get a value of 1000, I believe because the value of max.print set by the IDE is used.
  • Rendering in a background process (e.g. using callr or xfun::Rscript_call()) gets us the default value of 99999.

I believe this is why we get this different behavior depending on how the document is rendered.
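
For anyone wanting to check this on their own machine, the comparison between the current session and a fresh background process can be sketched with base R only. This is an illustration rather than what the IDE does internally; it starts `Rscript` with `--vanilla` so that no user profile changes the option, and it does not cover the Knit button itself.

```r
# Value seen by the current session (the IDE may have changed it)
current_value <- getOption("max.print")

# Value seen by a fresh background R process started without user profiles
rscript <- file.path(R.home("bin"), "Rscript")
background_value <- system2(
  rscript,
  c("--vanilla", "-e", shQuote("cat(getOption('max.print'))")),
  stdout = TRUE
)

cat("current session:   ", current_value, "\n")
cat("background process:", background_value, "\n")
```

If the two printed values differ, a document rendered in each context can end up with a different number of rows in its paged table.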

Regarding remark.js parsing

In addition to the max.print value, I believe something goes wrong in the remark.js parser when this long table is inserted and a following slide contains a code chunk, as found and detailed in this thread. Maybe a character in some line of the table is causing the issue with the parser?
This would be trickier to debug and is another story.

So what is the issue and what could we do?

I have not yet tried other data to confirm that the long HTML table is the issue. Maybe it is something with the content of this HTML table with a character which is not escaped correctly causing remark.js parser to break.
This could also be the cause.

@yihui what do you think is the path forward?

  • Should we look into the upstream issue with remark.js to understand why it breaks?
  • Should we rethink / complement this max.print behavior, specifically in xaringan? (Tables will probably be small in xaringan.)
  • Does this analysis give you some more ideas?

cderv (Collaborator) commented Sep 8, 2021

> Maybe it is something with the content of this HTML table with a character which is not escaped correctly causing remark.js parser to break. This could also be the cause.

To confirm my hunch, I looked into this hypothesis, and I believe this is what is causing the issue: the world.cities dataset contains a city name with a backtick character (`), which I believe triggers the error.

The city in question is Bani Bu `Ali, row 3238.

If we try to remove it, we have 3 slides as expected

---
title: "Works regardless"
output:
  xaringan::moon_reader:
    df_print: paged
---

## Slide with paged df

```{r}
maps::world.cities[-3238,]
```

---

## Slide with code

Eg `foobar` 

@charliejhadley I believe this is an issue of character escaping, as xaringan uses remark.js which will parse the content of the body as Markdown.

Can you confirm this works for you?
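
As an alternative to dropping the row, a workaround along the same lines is to replace backticks before printing. This is only a sketch: `strip_backticks()` is a hypothetical helper, not part of xaringan or rmarkdown, and it changes the displayed data (a backtick becomes an apostrophe), which may or may not be acceptable for a given slide.

```r
# Hypothetical helper (not part of xaringan or rmarkdown): replace backticks
# in character and factor columns so remark.js never sees them.
strip_backticks <- function(df, replacement = "'") {
  for (col in names(df)) {
    if (is.character(df[[col]])) {
      df[[col]] <- gsub("`", replacement, df[[col]], fixed = TRUE)
    } else if (is.factor(df[[col]])) {
      # For factors, only the level labels need to change
      levels(df[[col]]) <- gsub("`", replacement, levels(df[[col]]), fixed = TRUE)
    }
  }
  df
}

# Toy data reusing the city name from this thread
cities <- data.frame(
  name = c("foo", "Bani Bu `Ali"),
  stringsAsFactors = FALSE
)
strip_backticks(cities)$name
```

In a slide deck the chunk would then print `strip_backticks(maps::world.cities)` instead of the raw data.frame.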

charliejhadley (Author) commented

Ahh, great to see you've discovered what was causing this @cderv! Yup, knocking out row 3238 makes the Knit button behave the same as rmarkdown::render().

You've probably already realised there's a more minimal reprex now, but just in case here you go:

---
title: "Knit button weirdness"
output:
  xaringan::moon_reader:
    df_print: paged
---

## Slide with paged df

```{r}
data.frame(
  names = c("foo", "foo `")
)
```

---

## Slide with code

Eg `foobar` 

cderv (Collaborator) commented Sep 8, 2021

Yes, thanks for sharing this!

I knew I had encountered this before, and I think this issue is in a way a duplicate of #272.

It is the same type of issue with the parser.

cderv added remark.js RStudio IDE related directly to RStudio IDE and removed RStudio IDE related directly to RStudio IDE labels Sep 8, 2021

charliejhadley (Author) commented

Ooops @cderv I made a mistake! That's not actually a reprex for the issue I first reported because that .Rmd behaved identically for the Knit button and rmarkdown::render().

This is a slightly simpler, true reprex of my initial issue, and it demonstrates there is something very specific about the order of the vector maps::world.cities$name that is causing the issue.

---
title: "Knit button misbehaves"
output:
  xaringan::moon_reader:
    df_print: paged
---

## Slide with paged df

```{r}
cities <- maps::world.cities$name
positions_of_backticks <- which(grepl("`", cities))

data.frame(
  names = c(
    cities[1:{positions_of_backticks[1] - 1}],
    "Bani Bu `Ali",
    cities[{positions_of_backticks[1] + 1}:{positions_of_backticks[2] - 1}],
    "Wad-il-Ma`awil",
    cities[{positions_of_backticks[2] + 1}:{positions_of_backticks[3] - 1}],
    "al-`Amarat",
    "al-`Awabi",
    cities[{positions_of_backticks[4] + 1}:{length(cities)}]
  )
)
```

---

## Slide with code

Eg `foobar` 

cderv (Collaborator) commented Sep 9, 2021

I believe the difference comes from the max.print value, which differs between rendering inside RStudio with the Knit button (or in a background session) and rendering in the current session from the Console. So I would ask: is the backtick before row 1000 or after?

There are two issues:

  1. A backtick that is not HTML-escaped in the body of the document makes remark.js fail.
  2. When the backtick is in a table, whether it reaches the output depends on the number of rows included. If it is included, issue 1 occurs. max.print does not have the same value depending on how the render is done.
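
The interaction between the two issues can be sketched with a small check: does any backtick fall within the rows that actually get printed? `backtick_in_printed_rows()` is a hypothetical helper written for this illustration, and the toy data stands in for maps::world.cities.

```r
# Hypothetical helper: report whether any backtick would land in the rows
# that actually get printed, given a row limit.
backtick_in_printed_rows <- function(df, row_limit) {
  printed <- utils::head(df, row_limit)
  has_backtick <- vapply(
    printed,
    function(col) any(grepl("`", as.character(col), fixed = TRUE)),
    logical(1)
  )
  any(has_backtick)
}

# Toy data: a single backtick in row 1001 of 2000
toy <- data.frame(name = rep("foo", 2000), stringsAsFactors = FALSE)
toy$name[1001] <- "Bani Bu `Ali"

backtick_in_printed_rows(toy, row_limit = 1000)   # FALSE: row 1001 is cut off
backtick_in_printed_rows(toy, row_limit = 10000)  # TRUE: row 1001 is printed
```

With a limit of 1000 rows the backtick never reaches remark.js, while with a limit of 10000 rows it does, which is the pattern the examples below test.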

No backticks - Works in both Button and Console

---
title: "Knit button misbehaves"
output:
  xaringan::moon_reader:
    df_print: paged
---

## Slide with paged df

```{r}
cities <- maps::world.cities$name
positions_of_backticks <- which(grepl("`", cities))

maps::world.cities[-positions_of_backticks, ]
```

---

## Slide with code

Eg `foobar` 

Backtick name at the 1001st row - Does not work with Button but works in Console

---
title: "Knit button misbehaves"
output:
  xaringan::moon_reader:
    df_print: paged
---

## Slide with paged df

```{r}
cities <- maps::world.cities$name
positions_of_backticks <- which(grepl("`", cities))

df <- maps::world.cities[-positions_of_backticks, ]

df[1001, ]  <- maps::world.cities[positions_of_backticks[1], ]

df
```

---

## Slide with code

Eg `foobar` 

Backtick name at the 999th row - Does not work in either Button or Console

---
title: "Knit button misbehaves"
output:
  xaringan::moon_reader:
    df_print: paged
---

## Slide with paged df

```{r}
cities <- maps::world.cities$name
positions_of_backticks <- which(grepl("`", cities))

df <- maps::world.cities[-positions_of_backticks, ]

df[999, ]  <- maps::world.cities[positions_of_backticks[1], ]

df
```

---

## Slide with code

Eg `foobar` 

Is there more to it than that on your side?
