-
Notifications
You must be signed in to change notification settings - Fork 0
/
pandocGmi.html
111 lines (111 loc) · 9.7 KB
/
pandocGmi.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang xml:lang>
<head>
<meta charset="utf-8" />
<meta name="generator" content="pandoc" />
<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" />
<meta name="author" content="Nick James" />
<title>Emitting gemtext with Pandoc</title>
<style type="text/css">
/* code{white-space: pre-wrap;} */
span.smallcaps{font-variant: small-caps;}
span.underline{text-decoration: underline;}
div.column{display: inline-block; vertical-align: top; width: 50%;}
</style>
<link rel="stylesheet" href="data:text/css,body%7Bbackground%2Dcolor%3A%20LightYellow%3Bmax%2Dwidth%3A%2030em%3Bmargin%2Dleft%3A%203em%3Bmargin%2Dright%3A%20auto%3Btext%2Dalign%3A%20justify%3B%7Dh1%2C%20h2%7Bfont%2Dsize%3A%20large%3B%7Dh3%2C%20h4%7Bfont%2Dsize%3A%20medium%3B%7Dpre%2C%20pre%2Eprogramlisting%2C%20pre%2Esynopsis%20%7Bbackground%2Dcolor%3A%20%23FFFFFF%3Bborder%3A%201px%20solid%20%23DDDDDD%3Bcolor%3A%20%23000000%3Bfloat%3A%20left%3Bline%2Dheight%3A%201%3Bmargin%2Dtop%3A%200%2E1em%3Bmin%2Dwidth%3A%2090%25%3Bpadding%3A%201em%201em%201%2E5em%3B%7Dp%2C%20h1%2C%20h2%2C%20h3%20%2C%20h4%2C%20dl%7Bclear%3A%20both%3B%7Ddiv%2EinsertedFile%7Bborder%3A%201px%20dotted%20darkgray%3Bpadding%2Dleft%3A%201em%3Bpadding%2Dright%3A%201em%3B%7Da%3Alink%20%7Bcolor%3A%20%23005000%3B%7Da%3Avisited%20%7Bcolor%3A%20%23A05010%3B%7Da%3Afocus%20%7Bbackground%2Dcolor%3A%20%23EEEE99%3Bcolor%3A%20%23009900%3B%7Da%3Ahover%20%7Bbackground%2Dcolor%3A%20%23EEEE99%3Bborder%3A%201px%20dashed%20%23DDDDDD%3Bcolor%3A%20%23009900%3B%7Da%20%7Bborder%3A%201px%20dotted%20%23DDDDDD%3Btext%2Ddecoration%3A%20none%3B%7Dul%7Blist%2Dstyle%2Dtype%3A%20square%3B%7Dnav%23TOC%20ul%7Blist%2Dstyle%2Dtype%3A%20none%3Bpadding%2Dleft%3A%202em%3B%7Dnav%23TOC%3Eul%7Bpadding%2Dleft%3A%200%3B%7Dh1%2C%20h2%7Bpadding%2Dtop%3A%201em%3B%7Dtable%7Bwidth%3A%2040em%3Btext%2Dalign%3A%20left%3Bvertical%2Dalign%3A%20top%3Bmargin%2Dtop%3A%201em%3Bmargin%2Dbottom%3A%201em%3B%7Dtbody%7Bvertical%2Dalign%3A%20top%3B%7Dth%7Bborder%3A%201px%20solid%20grey%3Bpadding%3A%20%2E3em%3B%7Dtd%7Bborder%3A%201px%20dotted%20grey%3Bpadding%3A%20%2E3em%3B%7Dtr%2Eeven%7Bbackground%2Dcolor%3A%20PaleGreen%3B%7Drow%5Bparity%3D%22even%22%5D%7Bbackground%2Dcolor%3A%20PaleGreen%3B%7Dp%7Bmargin%2Dtop%3A%20%2E25em%3Bmargin%2Dbottom%3A%20%2E25em%3B%7D%2Ewarning%7Bcolor%3A%20red%3Bfont%2Dsize%3A%20large%3B%7Ddiv%2EYGrammar%7Btext%2Dalign%3A%20left%3B%7D">
<!--[if lt IE 9]>
<script src="//cdnjs.cloudflare.com/ajax/libs/html5shiv/3.7.3/html5shiv-printshiv.min.js"></script>
<![endif]-->
</head>
<body>
<header>
<h1 class="title">Emitting gemtext with Pandoc</h1>
<p class="author">Nick James</p>
</header>
<nav id="TOC">
<ul>
<li><a href="#introduction">Introduction</a></li>
<li><a href="#gemtext">Gemtext?</a></li>
<li><a href="#pandoc">Pandoc?</a></li>
<li><a href="#how">How?</a></li>
<li><a href="#omg-the-outputs-horrible">OMG the output’s horrible</a></li>
<li><a href="#so-what-are-you-going-to-do-about-it">So what are you going to do about it?</a></li>
<li><a href="#scratching-the-itch">Scratching the itch</a>
<ul>
<li><a href="#lua">Lua</a></li>
<li><a href="#the-template">The Template</a></li>
</ul></li>
<li><a href="#hacking-lua">Hacking lua</a></li>
<li><a href="#installation">Installation</a></li>
<li><a href="#bugs-c">Bugs &c</a></li>
<li><a href="#other-software">Other Software</a></li>
</ul>
</nav>
<h2 id="introduction">Introduction</h2>
<p>Whilst faffing with <a href="https://gemini.circumlunar.space/docs/specification.gmi">gemtext</a> it ocurred to me that <a href="https://pandoc.org/">pandoc</a> could do the heavy lifting when it came to generating .gmi files.</p>
<p>Still interested? Read on!</p>
<h2 id="gemtext">Gemtext?</h2>
<p>Gemini is a <a href="https://gemini.circumlunar.space/">recent proposal</a> for a lightweight web protocol with a mime type of text/gmi. Gemtext is, by design, childishly simple to write by hand, but it’s another story if you have reams of stuff you want to put in geminispace. Hence the interest in converting to text/gmi.</p>
<h2 id="pandoc">Pandoc?</h2>
<p>Pandoc is a well established document converter, typically used to get from markdown to html. /However/ it takes lots of different <a href="https://pandoc.org/MANUAL.html#general-options">formats</a> as input and output. Its input is converted into an internal representation which is emitted in the users choice of output format. Critically, it also provides facilities for emitting custom formats (see -F PROGRAM, –filter=PROGRAM under <a href="https://pandoc.org/MANUAL.html#reader-options">Reader options</a>). One of these <a href="https://pandoc.org/MANUAL.html#custom-writers">facilities</a> enables you to use <a href="https://www.lua.org/">Lua</a> to convert Pandoc’s internal representation to the desired output.</p>
<h2 id="how">How?</h2>
<p>You need <code>gmi.lua</code>. <code>default.gmi</code> will probably come in handy.</p>
<pre><code>pandoc -t gmi.lua markdownFile</code></pre>
<p>will convert markdown to text/gmi on stdout. If you want the title block etc etc to show up, use</p>
<pre><code>pandoc -t gmi.lua --template default.gmi markdownFile</code></pre>
<p>To convert from another format, do something like</p>
<pre><code>pandoc -f FORMAT -t gmi.lua --template default.gmi file.format</code></pre>
<p>where <code>FORMAT</code> is one of pandoc’s input formats. See <a href="https://pandoc.org/MANUAL.html#options">options</a> for more details.</p>
<h2 id="omg-the-outputs-horrible">OMG the output’s horrible</h2>
<p>Owing to a combination of my laziness and inadequacy, some of the conversions aren’t very nice.</p>
<p>Oh, you wanted citations?</p>
<pre><code>function Cite(s, cs)
return "\nsorry cite not implemented\n"
end</code></pre>
<h2 id="so-what-are-you-going-to-do-about-it">So what are you going to do about it?</h2>
<p>Nothing.</p>
<p>I may faff about with it to make it better suit my purposes over the next few years.</p>
<p>Should you wish to do the same, read on.</p>
<h2 id="scratching-the-itch">Scratching the itch</h2>
<h3 id="lua">Lua</h3>
<p>Using Lua to customize the output involves writing functions that are invoked when particular bits of pandoc’s internal representation come to be emitted. I don’t understand Pandoc, Haskell or Lua and luckily enough, you don’t have to either.</p>
<p>Doing</p>
<pre><code>pandoc --print-default-data-file sample.lua > sample.lua</code></pre>
<p>gives you a <a href="https://pandoc.org/MANUAL.html#custom-writers">lua program</a> mimicking pandoc’s conversion to <code>html</code></p>
<p><code>gmi.lua</code> is just a hack of <code>sample.lua</code></p>
<h3 id="the-template">The Template</h3>
<ul>
<li><code>default.gmi</code> is a <a href="https://pandoc.org/MANUAL.html#templates">template</a> that specifies output when <code>-s</code>/<code>--standalone</code> is used on the pandoc command line.</li>
<li>In the gmi case pandoc will moan if you use standalone mode without doing <code>--template default.gmi</code>, but it will find <code>default.gmi</code> if it’s in <code>$DATADIR/templates</code>.</li>
<li>if <code>default.gmi</code> isn’t in the template directory you need to specify it’s full path on the command line.</li>
<li>If you specify the template, <code>--standalone</code> is redundant.</li>
<li>see <a href="https://pandoc.org/MANUAL.html#templates">Template documentation</a></li>
<li>Existing templates are in <code>$DATADIR/templates</code>. fwiw I butchered default.html</li>
</ul>
<p>do</p>
<pre><code>pandoc -v | grep "User data directory:" | sed "s/User data directory: //"</code></pre>
<p>to find <code>$DATADIR</code></p>
<h2 id="hacking-lua">Hacking lua</h2>
<p>Lua is fairly straightforward at the expression level so with luck, if you don’t like what I’m doing with subscripts in <code>gmi.lua</code>:</p>
<pre><code>function Subscript(s)
return "_" .. s
end</code></pre>
<p>then, knowing that <code>..</code> is the string concatenation operator, it’s pretty easy to mangle the results. Pandoc puts this on a plate in front of you and, short of eating it for you, there isn’t a lot more they could do.</p>
<p>Pandoc’s sample lua has some funky programming to deal with html output and needs a fairly big rewrite for our purposes, but it gives a good insight into the changes that have to be made. For instance, if you want to do citations, do</p>
<pre><code>pandoc --print-default-data-file sample.lua > sample.lua</code></pre>
<p>as mentioned above and pick the bones out of <code>function Cite(s, cs)</code>.</p>
<h2 id="installation">Installation</h2>
<p>There isn’t any really, you just need <code>gmi.lua</code> and <code>default.gmi</code> somewhere you can remember and then reference them on the command line. Putting <code>default.gmi</code> in <code>$DATADIR/templates</code>, as <a href="https://github.com/jgm/pandoc/blob/master/doc/customizing-pandoc.md#templates">recommended</a> works for me on windows and removes the needto explicitly specify a <code>gmi</code> template, however putting <code>gmi.lua</code> in <code>$DATADIR/filters</code> didn’t work (see -L SCRIPT, –lua-filter=SCRIPT under <a href="https://pandoc.org/MANUAL.html#reader-options">Reader options</a>), but that may be a windows thing?</p>
<h2 id="bugs-c">Bugs &c</h2>
<ul>
<li>too many new lines</li>
<li>links need more work</li>
<li>tables are rendered as csv</li>
<li>double quotes in csv fields aren’t escaped</li>
<li>lists will probably be a problem</li>
<li>not sure about the <code><link>[n]</code> rendering of links</li>
</ul>
<h2 id="other-software">Other Software</h2>
<p><code>md2gmn</code> is an effective markdown to gmi converter - <a href="https://pkg.go.dev/github.com/tdemin/gmnhg">download here</a></p>
</body>
</html>