[Feature Request] Allow use of a custom PDF generator #10275

Open
merlinschumacher opened this issue Oct 16, 2024 · 2 comments
Labels
pdf Produce PDF as the output format

Comments

@merlinschumacher

I'm not really happy with the way browser engines render PDFs from the Markdown files. Code blocks get split across pages, sometimes lines are split in the middle, and so on. So I've created a custom LaTeX template for pandoc to do these conversions instead of docfx. Nevertheless, docfx is great and provides a nice solution for .NET project documentation.

I'd love to be able to change the default executor for the PDF conversion from Chromium to a custom command that takes specified arguments and builds a PDF that docfx then integrates seamlessly.

Currently, I've implemented a hacky solution where I explicitly link the PDF files in the TOCs as child items and call pandoc before running docfx. It works, but it's clunky.
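For readers who want to picture the workaround, here is a minimal sketch of how such a TOC entry could be generated. The actual TOC files are not shown in this thread, so the layout (an article with one child item pointing at a pre-built PDF) and the helper name `toc_entry` are assumptions for illustration only.

```python
def toc_entry(name: str, article: str, pdf: str) -> str:
    """Return a toc.yml entry whose child item links a pre-built PDF.

    Hypothetical helper: the entry layout is an assumption, not the
    author's actual TOC.
    """
    return "\n".join(
        [
            f"- name: {name}",
            f"  href: {article}",
            "  items:",
            f"  - name: {name} (PDF)",
            f"    href: {pdf}",
        ]
    )


if __name__ == "__main__":
    # Print an example entry for a hypothetical "Guide" article.
    print(toc_entry("Guide", "guide.md", "guide.pdf"))
```

The PDF named in `href` has to exist before docfx runs, which is why pandoc is called first in this workaround.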

@yufeih
Contributor

yufeih commented Oct 17, 2024

@merlinschumacher, I'm curious about which custom PDF generator you prefer. Is it a proprietary tool? Docfx used to use wkhtmltopdf, but that project hasn't been actively maintained. Is there an alternative you're using now?

@merlinschumacher
Author

I use pandoc in combination with a custom LaTeX template. That's essentially all. The LaTeX template is just a slightly modified version of pandoc's default template. And there are even popular templates like Eisvogel that are built for exactly the purpose of converting Markdown to PDF and looking good while at it.

Pandoc can also receive metadata that is used in the resulting files. So I've been able to inject information like the build date of the files into the PDFs, using metadata and corresponding placeholders in the template.
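As a rough illustration of the metadata approach, the sketch below builds pandoc `--metadata=KEY:VALUE` arguments from a dictionary. The key name `builddate` and the helper names are assumptions. A template would then reference the value with a matching placeholder such as `$builddate$`.

```python
from datetime import date


def metadata_args(metadata: dict[str, str]) -> list[str]:
    """Turn a dict into pandoc --metadata=KEY:VALUE arguments.

    Keys are sorted so the resulting command line is deterministic.
    """
    return [f"--metadata={key}:{value}" for key, value in sorted(metadata.items())]


def build_metadata(build_day: date) -> dict[str, str]:
    """Collect the values to inject; 'builddate' is a hypothetical key name."""
    return {"builddate": build_day.isoformat()}


if __name__ == "__main__":
    # Show the arguments that would be appended to a pandoc call today.
    print(metadata_args(build_metadata(date.today())))
```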

Pandoc is available for all major platforms, and so are its most commonly required dependencies. On Windows, it can even be installed via Chocolatey or winget.

There are also pandoc filters for PlantUML and Mermaid, but I haven't gotten around to checking these out yet.

At the moment I use a Python script that calls pandoc and docfx one after another inside a custom-made Docker image. The Docker image is used in a CI/CD pipeline, where I generate the output. My setup relies on Inkscape for pandoc, which pulls in a lot of dependencies, but I believe I can replace it with something smaller like rsvg-convert.
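The script itself is not included in this thread. A minimal sketch of the idea, assuming one PDF per Markdown file, a template passed via pandoc's `--template` option, and a `docfx build` call at the end, might look like this. The file names and function names are placeholders.

```python
import subprocess
from pathlib import Path


def pandoc_command(source: Path, output: Path, template: Path) -> list[str]:
    """Build the pandoc call that renders one Markdown file to PDF."""
    return ["pandoc", str(source), f"--template={template}", "-o", str(output)]


def docfx_command(config: Path) -> list[str]:
    """Build the docfx call that generates the site afterwards."""
    return ["docfx", "build", str(config)]


def run_pipeline(sources, template, config, runner=subprocess.run):
    """Render every PDF first, then run docfx so the TOC links resolve."""
    for source in sources:
        pdf = source.with_suffix(".pdf")
        runner(pandoc_command(source, pdf, template), check=True)
    runner(docfx_command(config), check=True)


if __name__ == "__main__":
    # Placeholder paths; adjust to the real project layout.
    run_pipeline(
        sorted(Path("docs").glob("*.md")),
        Path("custom.latex"),
        Path("docfx.json"),
    )
```

Passing `runner` as a parameter keeps the command construction testable without pandoc or docfx installed.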

For the conversion from HTML to PDF, pandoc can rely on WeasyPrint, which seems to support CSS as well and is said to have better support for print-related CSS rules. But I haven't tested that one.

@yufeih added the pdf (Produce PDF as the output format) label on Oct 17, 2024