Ideas around pandoc #11

maelle · 2018-10-04T09:11:39Z

Ideas from @baptiste on Twitter

"just wondering though – have you considered embedding this step as part of a pandoc filter toolchain, as an alternative to the pandocfilters package? Would allow processing the AST in R with the full power of xml2 etc., but the xml_md return step would not be necessary.

and in fact, i believe that with such a toolchain an alternative to knitr would process chunks once they've been parsed into xml, and update the AST with the results. (i suggested this a while back, but as Yihui said this wasn't possible with pandoc back when knitr started)

... one advantage being a more robust handling of inline code, which is currently extracted by regexes in knitr. Having the full structured AST before running code chunks also allows greater flexibility for pre- and post-processing with custom markup, etc.

this knitr alternative may help with the elusive un-knit function to merge changes done to the output: since chunks and inline code are tagged as such in the input AST, they can be filtered out when diff-ing the output AST and its commented version containing the tracked changes."

maelle · 2018-10-04T09:14:17Z

R package pandocfilters https://cran.r-project.org/web/packages/pandocfilters/index.html 👀

maelle · 2018-10-04T09:20:35Z

@baptiste I'm not sure I understand how one would go from XML to md? Via pandoc?

Are you interested in helping write a minimal working example?

noamross · 2018-10-04T12:06:32Z

Pandoc represents its AST in internal structures, which can be manipulated via Haskell or Lua. It makes the tree available to other programs as JSON, so to do this you'd either want to convert the JSON to an R list (as the pandocfilters R package does), convert it to XML, or work with it via jq or some other JSON processor.

Looks like there's a Haskell example here: https://github.com/cdupont/R-pandoc
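To make the JSON route concrete, here is a minimal sketch of walking the tree and pulling out code nodes. It is written in Python with only the standard library to keep it short; the same traversal should carry over to an R list. The document below is hand-written in the `{"t": ..., "c": ...}` layout that `pandoc -t json` emits, and the assumption that a chunk shows up as a `CodeBlock` with class `r` is mine, not something checked against real Rmd input. `walk` and `code_nodes` are hypothetical helper names.

```python
import json

# Hand-written stand-in for the output of `pandoc -t json` (assumed shape).
DOC = json.loads("""
{
  "pandoc-api-version": [1, 17, 5],
  "meta": {},
  "blocks": [
    {"t": "Para", "c": [
      {"t": "Str", "c": "Mean:"},
      {"t": "Space"},
      {"t": "Code", "c": [["", [], []], "r 1 + 1"]}
    ]},
    {"t": "CodeBlock", "c": [["", ["r"], []], "x <- 2"]}
  ]
}
""")


def walk(node):
    """Yield every element dict (anything with a "t" key), depth first."""
    if isinstance(node, dict):
        if "t" in node:
            yield node
        for value in node.values():
            yield from walk(value)
    elif isinstance(node, list):
        for item in node:
            yield from walk(item)


def code_nodes(doc):
    """Return (node type, source text) for inline code and code blocks."""
    return [
        (node["t"], node["c"][1])  # "c" is [attr, text] for both node types
        for node in walk(doc["blocks"])
        if node["t"] in ("Code", "CodeBlock")
    ]


print(code_nodes(DOC))
```

Because inline code and code blocks are typed nodes here, no regular expression over the raw text is needed to find them.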

maelle · 2018-10-04T12:08:56Z

so you wouldn't convert the (R)md to XML first?

noamross · 2018-10-04T15:04:07Z

It comes down to a couple of things: first, whether you want pandoc extensions in your markdown, and second, whether Rmd markup, which has some stuff that isn't exactly markdown, survives the conversion. It seems that Rmd chunk headers and inline code survive, with the header and initial r just prepended to the code block when using pandoc; not sure about cmark. After that it's a matter of which format is the most amenable to modifying: JSON, an R list, or XML. XML via XPath is really powerful, but you might prefer the others.

baptiste · 2018-10-04T18:25:19Z

@maelle i'm keen, but broke my right arm last weekend so typing is a bit of a struggle

baptiste · 2018-10-04T18:49:43Z

I think a first step would be to make a minimally-interesting dummy Rmd example, and run it through

- knitr
- cmark
- pandoc

to have specific ASTs to inspect in the form of R list, json, xml, to fully compare their features.

The next step would be to mimic the knitting step by isolating from the input AST those code bits that need to be run (lots of details to consider here, but knitr has it well figured out).

Last step is merging the output produced with the AST.
From there I think pandoc is the most natural tool, as it allows many output formats.
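A rough sketch of the isolate-run-merge steps, under heavy assumptions: the AST is pandoc-style JSON, chunks appear as top-level `CodeBlock` nodes with class `r`, and results are merged back as a following `CodeBlock` with a made-up `output` class. It is in Python (standard library only) for brevity. `knit_blocks` and `fake_evaluate` are hypothetical names; `fake_evaluate` is a lookup table standing in for a real R session, so nothing is actually executed.

```python
import copy

# Hand-written pandoc-style AST: one paragraph and one "r" chunk (assumed shape).
DOC = {
    "pandoc-api-version": [1, 17, 5],
    "meta": {},
    "blocks": [
        {"t": "Para", "c": [{"t": "Str", "c": "Result:"}]},
        {"t": "CodeBlock", "c": [["", ["r"], []], "1 + 1"]},
    ],
}


def fake_evaluate(source):
    """Stand-in for running R code; a real tool would call out to R here."""
    return {"1 + 1": "[1] 2"}.get(source, "")


def knit_blocks(doc, evaluate):
    """Return a copy of doc with an output block after each top-level r chunk."""
    out = copy.deepcopy(doc)  # leave the input AST untouched
    merged = []
    for block in out["blocks"]:
        merged.append(block)
        if block["t"] != "CodeBlock":
            continue
        (_ident, classes, _keyvals), source = block["c"]
        if "r" in classes:
            result = evaluate(source)
            # The "output" class is an arbitrary tag chosen for this sketch.
            merged.append({"t": "CodeBlock", "c": [["", ["output"], []], result]})
    out["blocks"] = merged
    return out


knitted = knit_blocks(DOC, fake_evaluate)
print([block["t"] for block in knitted["blocks"]])
```

Keeping the source chunk in the tree and tagging the result separately means the chunk and its output stay distinguishable later, which is what an un-knit step would need. Chunk options, nested blocks, and inline code are all ignored here.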

The idea of merging "track-changes" made to an output manuscript would be a variation on this, where in merging changes to the AST one would also look at a diff of the text nodes.
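As a toy illustration of diffing text nodes only, the sketch below compares the `Str` nodes of two hand-written pandoc-style ASTs and skips anything inside `Code` or `CodeBlock`. It uses Python's standard `difflib`. `text_nodes` and `changed_text` are hypothetical names, and a real merge would also need to map each change back to a position in the source.

```python
import difflib


def text_nodes(node):
    """Yield the text of every Str node, skipping code nodes entirely."""
    if isinstance(node, dict):
        kind = node.get("t")
        if kind == "Str":
            yield node["c"]
        elif kind in ("Code", "CodeBlock"):
            return  # code is tagged as such, so it is left out of the diff
        else:
            for value in node.values():
                yield from text_nodes(value)
    elif isinstance(node, list):
        for item in node:
            yield from text_nodes(item)


def changed_text(original, edited):
    """Return (tag, old words, new words) for each non-equal diff region."""
    a = list(text_nodes(original["blocks"]))
    b = list(text_nodes(edited["blocks"]))
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    return [
        (tag, a[i1:i2], b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def para(*words):
    """Build a paragraph of Str nodes followed by one inline code node."""
    inlines = []
    for word in words:
        inlines += [{"t": "Str", "c": word}, {"t": "Space"}]
    inlines.append({"t": "Code", "c": [["", [], []], "r mean(x)"]})
    return {"blocks": [{"t": "Para", "c": inlines}]}


ORIGINAL = para("The", "mean", "is")
EDITED = para("The", "average", "is")

print(changed_text(ORIGINAL, EDITED))
```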

maelle · 2018-10-10T09:23:46Z

@baptiste I am very sorry that you broke your right arm 😱

I haven't had a chance to look at this yet but hope to do it soon.

maelle mentioned this issue Oct 4, 2018

Notes on unknitting #7

Open

maelle added the enhancement ✨ New feature or request label Oct 4, 2018
