TextIndex: create back-of-book indexes from Markdown and other plain-text formats #11007
mattgemmell
started this conversation in
Show and tell
Replies: 1 comment 9 replies
-
This looks great! I recently had to generate an index for a book in LaTeX/PDF and EPUB versions, and I might have used this if it had been available. (In the end I used inline LaTeX indexing commands plus a Lua filter that converted them to something appropriate for the EPUB.)
-
Hello. I've made something that might be useful for those who want to add indexes to their publications, when using plain-text source formats like Markdown.
My pandoc usage is entirely Markdown to HTML-derived formats: HTML itself, EPUB3, and PDF for print via WeasyPrint. I wanted a way to generate indexes (in the back-of-book sense) purely from Markdown, without going via LaTeX. After a bit of research I decided to create a micro-syntax and a parser script for it, written in Python. It's called TextIndex. It lets you add "index marks" to plain-text documents, which are then compiled into an HTML index and inserted into the document wherever you like. This is conceptually similar to Microsoft Word's approach to indexes.
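To make the "index marks compiled into an HTML index" idea concrete, here is a toy sketch of the general technique. It is not TextIndex's code and does not use TextIndex's syntax: the `[[idx:term]]` mark, the `[[index]]` placeholder, and the `build_index` function are all invented for this illustration. See the TextIndex documentation for the real syntax and features.

```python
import html
import re
from collections import defaultdict

# Hypothetical syntax, invented for this sketch (NOT TextIndex's syntax):
#   [[idx:term]]  marks the preceding text as an index reference
#   [[index]]     marks where the compiled index should be inserted
MARK = re.compile(r"\[\[idx:([^\]]+)\]\]")
PLACEHOLDER = "[[index]]"


def build_index(text):
    """Turn each mark into an HTML anchor, then insert a sorted index."""
    locators = defaultdict(list)  # term -> anchor ids, in document order
    counter = 0

    def replace_mark(match):
        nonlocal counter
        counter += 1
        anchor = f"idx-{counter}"
        locators[match.group(1).strip()].append(anchor)
        return f'<span id="{anchor}"></span>'

    body = MARK.sub(replace_mark, text)

    items = []
    # Case-insensitive alphabetical order, as in a back-of-book index.
    for term in sorted(locators, key=str.casefold):
        links = ", ".join(
            f'<a href="#{anchor}">{n}</a>'
            for n, anchor in enumerate(locators[term], start=1)
        )
        items.append(f"<li>{html.escape(term)}: {links}</li>")

    index_html = '<ul class="index">\n' + "\n".join(items) + "\n</ul>"
    return body.replace(PLACEHOLDER, index_html)


if __name__ == "__main__":
    sample = (
        "Pandoc[[idx:pandoc]] converts Markdown to EPUB[[idx:EPUB]].\n\n"
        "More on pandoc[[idx:pandoc]] filters.\n\n"
        "[[index]]"
    )
    print(build_index(sample))
```

Because the output is plain HTML with anchors, it could be passed on to pandoc like any other Markdown-with-HTML input. The link text here is just an occurrence counter; real locators such as page numbers need more than this sketch does.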
I use it as a pre-processor before pandoc and also as a standalone tool, and I'm finding the results to be more than satisfactory. Genuine page numbers are available for paginated formats (via the CSS Generated Content spec). There's a lot of index-related functionality, including cross-references, custom sorting of entries, hierarchical headings, locator emphasis and suffixing, running-in of deeply nested entries, and a number of conveniences to make indexing easier and quicker. I largely followed the Chicago Manual of Style for the formatting, within the constraints of simplicity.
It's open source, of course (GPL3), and the repository is available on GitHub. You can read the full documentation here, with a sample index (for that page itself, generated with TextIndex) at the end.
A couple of screenshots of sample output are attached. I hope the project will be useful to others too.