[RFC] using Weasyprint to generate pdf #254
Comments
yes, writing This will get us beautiful reports with full page backgrounds and proper page margin boxes, amongst others. What I think will be a little bit of a challenge is to add some compatibility layer so that we can actually replace wkhtmltopdf with it. But I think for starters the flag (or maybe a second selection field |
I see in the documentation that weasyprint is able to produce PDFs with attachments, which is really cool (cf. https://weasyprint.readthedocs.io/en/latest/api.html)! But it doesn't seem to have an option to produce PDF/A (which would make it the perfect tool to generate compliant Factur-X invoices!)... I didn't see anything about it in the documentation. @liZe, can you confirm that? |
The docs say:
IMHO it would then be better to use a full rendering engine, one that matches a good browser. That should lead to the fewest surprises. Odoo v12+ uses chrome/chromium to run JS tests. Maybe we can use it to render PDFs. Of course this is a general idea, and I have not hacked on it. Also I'd appreciate it if you open the proposal for Odoo master, since this is a general problem. It can be backported then. They claim to be open to suggestions: odoo/odoo#21255 (comment) |
check out the PR linked above, this is a working addon which uses chrome for the rendering. The thing is when you read the mailing lists and bug trackers of chrome and firefox, most people don't seem very interested in paged media stuff within browsers. That makes sense to me, as a nice printout isn't exactly a killer feature for a browser. So I personally wouldn't hold my breath until any of the above implements CSS3 page margins and counters for example. Just going ahead and contributing it to one of the projects will be weeks or months of full time work I assume, and that doesn't include getting familiar with the project in the first place. |
Hmm I understand... Yes, that makes sense, thanks! |
FYI we have considered using Weasyprint to render reports in Odoo 12.0. The issue was that, in its current state, Weasyprint does not scale well: rendering very large documents simply blows up memory. That is because all the rendering is done in pure Python. Nice for readability, but not for performance... |
Have you considered a different engine then? |
Hello everyone. I'm the current lead developer of WeasyPrint, and of course I would be really happy to see Odoo use WeasyPrint to render PDF files. As it has been said in previous comments, WeasyPrint is a rendering engine dedicated to print, thus supporting interesting CSS features that are not available in browsers like:
These features are really important if you want to produce high quality documents, and not just "enhanced screenshots" from common browsers. I'm convinced that really supporting paged media can open great opportunities, not only to generate invoices and reports, but also badges, tickets, business cards, posters, etc.
Of course, WeasyPrint comes with its downsides too. The first one is the lack of some CSS features (like
The second one is the speed and memory consumption. As said before, WeasyPrint is slow compared to browsers and eats a lot of memory for very large documents. But contrary to what @rco-odoo said (and what most people believe, no offense 😉), the main reason for that is WeasyPrint's pretty simple (stupid?) algorithms, not Python. Of course, we'll never reach Chrome's or Firefox's performance using Python, but we can get large improvements just by avoiding useless code execution and freeing memory at the right moment. Speed and memory have largely been secondary concerns until now (compared to shiny new CSS features that everybody wants), which means that we can greatly improve that point with some work. And having a small amount of code (about 14k lines of Python) with a lot of tests is good news for that.
Now that 8-year-old WeasyPrint has reached its first stable release, we have started a lot of actions to understand the real needs of our users and spend more time on what really counts for professional use. We're contacting users to get testimonials and understand their problems, we've started a campaign with bounties and paid support, we're creating samples to help people to discover CSS features and see how useful they can be… So, we want to spend more time developing WeasyPrint, and it looks like the perfect moment to improve it for Odoo users' needs.
I don't know how Odoo works (yet!), but allowing multiple PDF renderers and let users choose the one they want seems to be a good idea.
Prince, for example, but it's not open source and quite expensive. Of course, I'd prefer spending money on developing WeasyPrint 😄, but it's great software to be honest.
It is!
Supporting PDF/A seems to be much easier than PDF/X, and of course Factur-X would be really useful. (See Kozea/WeasyPrint#630 and Akretion's factur-x.) |
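The "multiple PDF renderers, let users choose" idea from this comment could look roughly like the sketch below. It is only an illustration: `register_renderer`, `render_pdf` and the fake engines are hypothetical names, not Odoo or WeasyPrint APIs, and a real integration would call the actual engine where the stand-in function is.

```python
from typing import Callable, Dict

# Hypothetical type: an engine is any callable turning an HTML string into PDF bytes.
Renderer = Callable[[str], bytes]

# Hypothetical registry mapping an engine name to its renderer.
_RENDERERS: Dict[str, Renderer] = {}


def register_renderer(name: str, renderer: Renderer) -> None:
    """Make a renderer selectable under the given name."""
    _RENDERERS[name] = renderer


def render_pdf(html: str, engine: str = "wkhtmltopdf") -> bytes:
    """Render with the chosen engine, failing loudly on unknown names."""
    try:
        renderer = _RENDERERS[engine]
    except KeyError:
        raise ValueError(f"Unknown PDF engine: {engine!r}") from None
    return renderer(html)


def _fake_renderer(label: str) -> Renderer:
    """Stand-in for a real engine call (assumption: used for illustration only)."""
    def render(html: str) -> bytes:
        return f"%PDF-{label}:{len(html)}".encode()
    return render


register_renderer("wkhtmltopdf", _fake_renderer("wk"))
register_renderer("weasyprint", _fake_renderer("wp"))
```

With such a registry, the engine could be picked per report (for example from a selection field), so wkhtmltopdf stays the default while other engines are tested side by side.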
Well, I thought this at first, because otherwise we should refactor Odoo into C to get it fast, but that's not gonna be the case (I hope! 😆). I'd love to be able to hack into the PDF renderer using just Python. Currently wkhtmltopdf is like a black box to me (and many Odoo devs I guess). Maybe @rco-odoo can provide some samples of reports that worked badly with Weasyprint? |
I pushed a working proof of concept to #256. But we're quite far away from being able to use this as a drop-in replacement for wkhtmltopdf:
|
That was fast!
Another problem related to this feature is split boxes not taking the whole page (see this test where purple borders should reach the bottom of each page).
It's a working draft, but so are many features implemented in WeasyPrint. And named strings are implemented to solve some related cases. So… Why not! (Why is that related to box-decoration-break?) |
when you have |
For the record: box-decoration-break support has been added (see Kozea/WeasyPrint#771). |
look at #256 (that's also linked a bit up this thread) - contributions are welcome |
@liZe thanks for the heads up, this nearly works! I have problems assigning a margin to the body element when cloning the box decoration (the top margin only exists on the first page there), but padding works. |
You can post your HTML and CSS samples with a short explanation about what you expect, I can take a look. |
@liZe attached a minimal example. The spec about generated content only talks about margin boxes, but I think those would be super useful for position: fixed elements too: <html>
  <head>
    <style type="text/css">
      @page {
        size: A4;
        margin: 0px;
      }
      .current_heading:after {
        content: string(title);
      }
      .current_page:after {
        content: counter(page);
      }
      .page_count:after {
        content: counter(pages);
      }
      .header {
        text-align: center;
        position: fixed;
        top: 0px;
        left: 0px;
        right: 0px;
        height: 1cm;
        padding: 0.5cm;
        background: blue;
      }
      .footer {
        text-align: center;
        position: fixed;
        bottom: 0px;
        left: 0px;
        right: 0px;
        height: 1cm;
        padding: 0.5cm;
        background: green;
      }
      html {
        margin: 0px;
        padding: 0px;
      }
      body {
        margin: 0px;
        padding: 2cm;
        box-decoration-break: clone;
      }
      h1 {
        string-set: title content();
        page-break-before: always;
      }
    </style>
  </head>
  <body>
    <div class="header">
      I'm a header: <span class="current_heading" />
    </div>
    <section>
      <h1>This should show up in the header on the first page</h1>
      hello<span style="page-break-after: always" />world
      <h1>This should show up in the header on the second page</h1>
    </section>
    <div class="footer">
      Page <span class="current_page"/> of <span class="page_count" />
    </div>
  </body>
</html> |
There hasn't been any activity on this issue in the past 6 months, so it has been marked as stale and it will be closed automatically if no further activity occurs in the next 30 days. |
After struggling with wkhtmltopdf to get both good header and footer for an entire day, I gave a spin to the With some code tweaking, I also managed to use Chrome headless as a renderer (I could make a branch for others to test) but the results were entirely broken too (although general layout was OK...). I also tested Prince, and to be fair, although the results were somewhat better, they were also not acceptable. With respect to padding/margin sizes, wkhtmltopdf is clearly the odd one out; however, the default styles have been written for it, so this would be an additional hurdle for a switch (unless someone already has decent tweaks?). Other approaches I tested are not really suitable with respect to style. At this point I think it would make sense to generate headers/footers as separate PDFs, and stitch them together with the main content. If there's some work on another solution, I'd be willing to test/help. |
That’s a proof of concept, not a branch ready to be merged 😄. About performance, a large amount of time is spent downloading countless fonts that are not used or don’t exist anymore. These fonts are in the default stylesheet. They don’t take that much time using Chrome or wkhtmltopdf because they parallelize
For example WeasyPrint can use running elements for headers and footers.
Many features (including page headers and footers) work using CSS with Web2Print solutions like WeasyPrint or Prince, while they were using a dedicated API for wkhtmltopdf, that’s why the rendering is broken depending on the content. I can definitely give hints about that if you want. In my opinion, no solution will automatically give correct rendering without spending some time transforming the existing stylesheets. But I doubt that there’s something currently done by wkhtmltopdf and that can’t be done with other recent Web2Print tools. |
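To illustrate the point about headers and footers being done in CSS rather than through a dedicated API, here is a minimal sketch that assembles an HTML document using a running element for the header and page counters in a margin box for the footer. The function name `build_report`, the class name `header` and the margin sizes are assumptions made for the example; only the CSS constructs (`position: running()`, `content: element()`, `counter(page)`, `counter(pages)`) come from CSS paged media.

```python
# Stylesheet sketch: the header is a running element, the footer is generated
# directly in a page margin box. Sizes and class names are illustrative only.
RUNNING_CSS = """
@page {
  size: A4;
  margin: 2.5cm 1.5cm;
  @top-center { content: element(header); }
  @bottom-center { content: "Page " counter(page) " of " counter(pages); }
}
.header { position: running(header); text-align: center; }
"""


def build_report(header: str, body: str) -> str:
    """Return a complete HTML document whose header repeats on every page.

    The running element is placed before the main content so that it is
    already available when the first page is laid out.
    """
    return (
        "<html><head><style>" + RUNNING_CSS + "</style></head><body>"
        f'<div class="header">{header}</div>'
        f"<section>{body}</section>"
        "</body></html>"
    )
```

The resulting string can then be handed to a paged-media renderer; with WeasyPrint that would typically be `HTML(string=build_report(...)).write_pdf()`, which is not executed here.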
Hello, I think we should also be cautious if Odoo SA is going to increase the usage of JavaScript inside reports (weasyprint will never deal with JavaScript). Here is a commit they did in that direction yesterday for instance: Considering the trend is to unify the PDFs reports and the portal presentation (like sign an order in the portal that should look like the pdf order) and considering Odoo SA is betting a lot on its owl stuff, one cannot guarantee JavaScript usage in reports will not increase. IMHO Weasyprint could be an alternative solution for some specific reports (say like py3o/libreoffice is) but hardly a general purpose replacement (I put more faith into the embedded browsers solutions, be it inside a microservice to avoid cold starts). |
Then it’s probably safer to use a tool based on a browser, even if it’s limited regarding pagination features. |
Well, thank you very much @liZe for the detailed answer. I concur that given the constraints and what Raphaël mentioned, it would make more sense that this could only be one rendering engine and not a full drop-in replacement for pdf generation.
I'm wondering how much time would have to be invested in tweaking it to get to a working solution; and in that case I'd be really afraid to spend days and never manage to get it working, or that it would be really broken the day the header is one line longer or something is two-pages long instead. The PoC module contains some tweaking (with a similar one for header):
However, I didn't catch that it contains the comment Screenshot is left WeasyPrint, middle WeasyPrint with tweaks, right Prince with the tweaks (it has its fair share of issues). The tweaks are directly from https://github.com/legalsylvain/reporting-engine/blob/16.0-ADD-weasyprint-lru-cache-idiot/report_qweb_weasyprint_renderer/static/src/css/report_qweb_weasyprint_renderer_wkhtmltopdf_compat.css If you have a suggestion for a CSS that should work better, I'm all for testing it. |
No problem!
Here are some quick tips:
I don’t remember where the counter comes from, but we managed to remove it during our tests. If you think that it could be useful to spend some time to get a more solid proposal, we can definitely work with @legalsylvain (who knows Odoo much better than me!) just as we already did to get this PoC. There’s no need for you to spend a lot of (often very frustrating 😄) time trying to fix everything in the stylesheet, I’ll probably be more efficient! It could be useful to have a list of documents with various content to test, so that we’re sure that the PoC fits your needs. And, of course, no offense if some points are true blockers (such as JS support), there’s no need to work on a proposal that can’t be merged anyways! |
of course the whole point of this is to not need js-hacks (which is what happens in Odoo with wkhtmltopdf), and just do proper css3 paged media. it's all there (as in specification), we just need to implement it |
If I remember correctly, one of the problems of Weasyprint was the performance. Has that been addressed? |
Not exactly right. Well, weasyprint is slow as soon as the HTML code is ugly and badly designed. For example, when you generate an HTML report in a recent Odoo version, there are 50 calls to external resources (I mean on the internet). Of the 50, 25 return a 404 error... After some patches in the Odoo code to remove those useless queries, weasyprint is more or less as fast as wkhtmltopdf. I did a PoC patching Odoo master with @liZe during a day, 2 months ago. The PoC was "replace wkhtmltopdf by weasyprint" in the core module. After a little work and some patches:
At this step, two ways : The ball is in @bouvyd's court, whom I contacted a few months ago. |
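Besides removing the useless queries from the templates, repeated and failing requests can also be short-circuited on the rendering side. The sketch below wraps any fetcher callable with a cache that remembers failures as well as successes. It assumes a fetcher that takes a URL and returns a reusable dict (for example with the content as bytes), which is similar in shape to what WeasyPrint's `url_fetcher` option accepts; `make_cached_fetcher` is a hypothetical helper, and the exact fetcher contract should be checked against the documentation of the WeasyPrint version in use.

```python
from typing import Callable, Dict, Union

# Assumption: a fetcher takes a URL and returns a dict describing the resource.
Fetcher = Callable[[str], dict]


def make_cached_fetcher(fetch: Fetcher) -> Fetcher:
    """Wrap a fetcher so each URL is requested at most once.

    Failures are cached too, so a font that answers 404 is not requested
    again for every report. Assumes the returned dicts are reusable
    (bytes content rather than a one-shot file object).
    """
    cache: Dict[str, Union[dict, Exception]] = {}

    def cached(url: str) -> dict:
        if url not in cache:
            try:
                cache[url] = fetch(url)
            except Exception as exc:  # remember the failure for this URL
                cache[url] = exc
        result = cache[url]
        if isinstance(result, Exception):
            raise result
        return result

    return cached
```

A long-running worker would also need an eviction or expiry policy, which is left out of this sketch.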
Hello everyone, I'm an intern working for Odoo Buffalo. I made a comment earlier but deleted as I felt I was still a little under-informed on this topic. I've been tasked with replacing wkhtmltopdf with WeasyPrint as an internship project. I don't think the organizers initially realized the scope of the task, but told me I was free to continue working on it if I so desired. I'm still a novice dev (one month experience working at Odoo and still working on my CS degree) and it seems like a pretty daunting task, but from what I've gathered this could be a really valuable improvement to Odoo's software. Suffice to say, I have the opportunity to devote 100% of my time at work to develop this, every day until my internship ends mid-August, and potentially beyond if I end up working for the company after graduation. I just started and right now I'm still working on fully understanding how wkhtmltopdf is integrated into Odoo and studying the proof-of-concepts already developed. Do you folks think it would be a bad idea for an inexperienced dev to take this on, or should I go for it? I'd appreciate any guidance on how I can best spend my time and contribute to this development. I don't want to waste time trying to write code other people have already written. Thanks. |
I've been able to run Sylvain's branch with barely any modification: Basically all the basic integration work is done in it, so if you're able to understand that module code then you're good to go. In order to truly use it as a replacement the daunting task is to adapt the CSS and templates to weasyprint and optimize everything, as noted by Sylvain above. It was also noted earlier that running elements should be moved in the template so that the footer will appear on the first page for instance. |
Hey @cmal-odoo Remember we were all novices once. I think you have a great opportunity to learn and to make an excellent contribution to Odoo. This is not a task which requires you to know everything about Odoo to begin with. Wkhtmltopdf is an obsolete library which requires replacing and, to my understanding, it sounds like Weasyprint (even though it has some limitations of course) is a pretty good candidate. Just go for it 💪 It won't be easy, but the worst thing that can happen is that you are not able to finish all the tweaking required to make all PDF reports work well with the new library. Or that you arrive at the conclusion that there is a better option. Study the proof-of-concepts, lean on community and internal Odoo people to help you, and you'll make it. Courage 😉 |
@pedrobaeza On complex pages performance is ridiculously bad, i.e. compare www.odoo.com on wk vs wp. On more standard docs it is still somewhat slower but perhaps acceptable if no other alternative (There are some benchmarks Odoo did somewhere in issues for both these cases). I think however we do need to do benchmarks with things like 100 page accounting reports for both performance and memory use compared to wk. |
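A benchmark of the kind suggested here could start from a small harness like the one below, which measures wall-clock time and peak Python memory for any render callable. The names `benchmark` and `make_document` are hypothetical, and the generated table is only a stand-in for a real accounting report. Note that `tracemalloc` only sees allocations made by the Python interpreter, so it says something about a pure-Python engine but nothing about an external binary such as wkhtmltopdf, whose memory would have to be measured at the process level.

```python
import time
import tracemalloc
from typing import Callable, Dict


def make_document(rows: int) -> str:
    """Build a synthetic report: one table row per ledger line (illustrative)."""
    lines = "".join(
        f"<tr><td>{i}</td><td>Entry {i}</td><td>{i * 1.5:.2f}</td></tr>"
        for i in range(rows)
    )
    return f"<html><body><table>{lines}</table></body></html>"


def benchmark(render: Callable[[str], bytes], html: str) -> Dict[str, float]:
    """Time one render, then repeat it under tracemalloc to get peak memory.

    Two passes are used because tracing allocations slows execution down
    and would distort the timing.
    """
    started = time.perf_counter()
    pdf = render(html)
    elapsed = time.perf_counter() - started

    tracemalloc.start()
    render(html)
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()

    return {
        "seconds": elapsed,
        "peak_mib": peak / 2**20,
        "pdf_bytes": float(len(pdf)),
    }
```

Running `benchmark` with each engine's render function on the same `make_document(...)` output gives comparable numbers per engine, as long as the external-binary caveat above is kept in mind.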
Hello. While I think it's okay to assume very large ledgers should not be printed as PDF, I think @gdgellatly is right: common 100-page (even 50 if you like) accounting reports should be available, otherwise it will just make the workflow more painful even in simple cases. You'll also find many industries printing many pages of large pickings/manufacturing orders. @cmal-odoo I suggest you double check with the Belgian R&D team if they are aligned with the no-JavaScript report vision and if it's okay considering the future OWL ubiquity or even the need to render some PDFs as HTML pages in the portal (like signing an order in the portal). In particular, I suggest you check with Antony Lesuisse or Gery Debongnie. For a general purpose report engine there are other options like embedded browsers such as https://pptr.dev/ or a Firefox equivalent that can still be made fast if deployed as a micro-service (it avoids cold starts; it's already what some have long done with py3o LibreOffice reports, BTW: https://github.com/OCA/reporting-engine/tree/14.0/report_py3o_fusion_server ). Again, I think Weasyprint could certainly be a very good report engine option, I just question if it can become the default engine. Thank you for addressing this important issue. |
Indeed, weasyprint has bad performance on HTML that is not designed to be printed, and odoo.com is not printable. If you go to odoo.com, it will load 29 resources. The total size is 10Mb (without images). the size of If you open odoo.com with Google Chrome, click print, then export to PDF, you'll see:
TL;DR: as long as you supply poorly generated HTML code that's too big because of useless data, and not designed for printing, wkhtmltopdf won't do the trick. You have to look at the problem the other way round: think HTML code for print. |
Hello @legalsylvain I somewhat agree with that. But like it or not, so far it is Odoo SA that decides the roadmap of Odoo, not us. That's why I asked to check if Odoo R&D is aligned with this. If the Odoo roadmap is "wow effects" all over the place, to have the same PDF as your OWL-based portal page, Weasyprint is not going to cut it. That's why I say no doubt we can use Weasyprint as a nice alternative engine, but as for being the default engine, it will really be up to Odoo SA top R&D decisions. Finally, about sobriety, I don't buy that one, because browser CSS engines will always be orders of magnitude more optimized to render HTML than Python emulated CSS. The amount of engineering put into optimizing these engines with native efficient languages and parallelism is simply out of reach; think of how Google invested all its rivalry with Microsoft into exactly this... |
You can also use LibreOffice as an alternative:
It's slower and the PDF is even uglier! 🚀 😆 If you're rendering a local html file it's faster. I downloaded that website with assets and look at it:
Open Source ERP and CRM Odoo.pdf I don't think it's better than weasyprint, but it's worth noting another working and maintained alternative. Dumb question: why doesn't Odoo fork wkhtmltopdf and keep their fork up to date? After all, the main problem with it is that it has accumulated a big collection of CVEs through the lack of maintenance of its qt-webkit engine. It just needs security maintenance, not big new fancy features. Probable answer: in Odoo there are many devs, but not much Qt, C and WebKit knowledge. If that's the answer, then Weasyprint does seem like a nice option. It is Python, so it's much easier for any Odoo dev to diagnose and fix any bottlenecks, as the toolkit to do that is the same. Both Odoo and Weasyprint could benefit each other by working together IMHO. Big question: are you guys going to make sure the reports look the same? If they don't, then I can see thousands of tickets from customers saying "my invoices look different, what happened?" And that's gonna be a PITA. |
Hello,
This issue is here to talk about the opportunity to integrate Weasyprint into Odoo. Feel free to ask questions.
Reference:
CC: @liZe, alias Guillaume Ayoub, CEO of Kozea, which developed the library.