feat: Initial contribution for weasyprint as a service
Refs: #DEV-11712
Showing 6 changed files with 274 additions and 19 deletions.
@@ -0,0 +1,21 @@
FROM python:3.12.3-slim
LABEL maintainer="Team Polarion (CLEW/WZU/POLARION) <[email protected]>"

RUN apt-get update && \
    apt-get --yes --no-install-recommends install python3-cffi python3-brotli libpango-1.0-0 libpangoft2-1.0-0 fonts-liberation chromium && \
    apt-get clean autoclean && \
    apt-get --yes autoremove && \
    rm -rf /var/lib/apt/lists/*

ENV WORKING_DIR=/opt/weasyprint
ENV CHROME_EXECUTABLE_PATH=/usr/bin/chromium

WORKDIR ${WORKING_DIR}

COPY requirements.txt ${WORKING_DIR}/requirements.txt

RUN pip install --no-cache-dir -r ${WORKING_DIR}/requirements.txt

COPY ./app/*.py ${WORKING_DIR}/app/

ENTRYPOINT [ "python", "app/WeasyprintServiceApplication.py" ]
@@ -1,36 +1,93 @@
# Polarion ALM extension to <...>
# WeasyPrint Service
Service providing REST API to use WeasyPrint functionality

This Polarion extension provides possibility to <...>
## Build
## Build Docker image

This extension can be produced using maven:
```
mvn clean package
```bash
docker build \
--file Dockerfile \
--tag weasyprint-service:61.2.0 .
```

## Installation to Polarion
## Start Docker container

To install the extension to Polarion `ch.sbb.polarion.extension.<extension_name>-<version>.jar`
should be copied to `<polarion_home>/polarion/extensions/ch.sbb.polarion.extension.<extension_name>/eclipse/plugins`
It can be done manually or automated using maven build:
```bash
docker run --detach \
--publish 9080:9080 \
--name weasyprint-service \
weasyprint-service:61.2.0
```
mvn clean install -P polarion2304,install-to-local-polarion

## Stop Docker container

```bash
docker container stop weasyprint-service
```
For automated installation with maven env variable `POLARION_HOME` should be defined and point to folder where Polarion is installed.

Changes only take effect after restart of Polarion.
## Access service
WeasyPrint Service provides the following endpoints:

------------------------------------------------------------------------------------------
#### Getting version info
<details>
<summary>
<code>GET</code> <code>/version</code>
</summary>

##### Responses

> | HTTP code | Content-Type       | Response                                  |
> |-----------|--------------------|-------------------------------------------|
> | `200`     | `application/json` | `{"python":"3.12.3","weasyprint":"61.2"}` |
##### Example cURL

> ```bash
> curl -X GET -H "Content-Type: application/json" http://localhost:9080/version
> ```
</details>
------------------------------------------------------------------------------------------
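
As a rough illustration of consuming the `/version` endpoint from Python, the sketch below decodes the documented JSON body. The helper names `parse_version_info` and `fetch_version_info` are hypothetical and not part of this commit; the service URL is whatever address the container is published on.

```python
import json
from urllib.request import urlopen


def parse_version_info(body):
    """Decode a /version JSON body into a (python, weasyprint) tuple."""
    info = json.loads(body)
    return info["python"], info["weasyprint"]


def fetch_version_info(service_url):
    """GET /version from a running service (needs a reachable container)."""
    with urlopen(service_url.rstrip("/") + "/version") as response:
        return parse_version_info(response.read())
```

With the container from the README running, `fetch_version_info("http://localhost:9080")` would return the two version strings, which can serve as a simple readiness check.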
#### Convert HTML to PDF
<details>
<summary>
<code>POST</code> <code>/convert/html</code>
</summary>
## Polarion configuration
##### Parameters
<...>
> | Parameter name       | Type     | Data type | Description                                                          |
> |----------------------|----------|-----------|----------------------------------------------------------------------|
> | encoding             | optional | string    | Encoding of provided HTML (default: utf-8)                           |
> | media_type           | optional | string    | WeasyPrint media type (default: print)                               |
> | file_name            | optional | string    | Output filename (default: converted-document.pdf)                    |
> | presentational_hints | optional | string    | WeasyPrint option: Follow HTML presentational hints (default: False) |
> | base_url             | optional | string    | Base URL to resolve relative resources (default: None)               |
##### Responses
## Extension Configuration
> | HTTP code | Content-Type      | Response                     |
> |-----------|-------------------|------------------------------|
> | `200`     | `application/pdf` | PDF document (binary data)   |
> | `400`     | `plain/text`      | Error message with exception |
> | `500`     | `plain/text`      | Error message with exception |
<...>
##### Example cURL
> ```bash
> curl -X POST -H "Content-Type: text/html" --data-binary @input_html http://localhost:9080/convert/html --output output.pdf
> ```
## Usage
</details>
<...>
------------------------------------------------------------------------------------------
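
The optional parameters listed for `/convert/html` are read from the query string, while the HTML travels in the request body. A minimal client sketch using only the standard library is shown here; `build_convert_url` and `convert_html` are hypothetical helper names, and the `text/html` Content-Type is an assumption, since the handler in this commit reads the raw body without inspecting that header.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_convert_url(service_url, **params):
    """Return the /convert/html URL with the supplied optional query parameters."""
    # Drop parameters that were not supplied so the service defaults apply
    query = urlencode({k: v for k, v in params.items() if v is not None})
    url = service_url.rstrip("/") + "/convert/html"
    return f"{url}?{query}" if query else url


def convert_html(service_url, html, encoding="utf-8", **params):
    """POST the HTML to a running service and return the PDF bytes."""
    request = Request(
        build_convert_url(service_url, encoding=encoding, **params),
        data=html.encode(encoding),
        headers={"Content-Type": "text/html"},  # assumed value, see note above
        method="POST",
    )
    # urlopen raises urllib.error.HTTPError for the 400/500 responses
    with urlopen(request) as response:
        return response.read()
```

For example, `convert_html("http://localhost:9080", "<h1>Hi</h1>", file_name="hi.pdf")` would return the PDF bytes when the container is running. Values such as `base_url` are percent-encoded by `urlencode`, which matches the `unquote` call on the server side.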
## Changelog
| Version | Changes                                      |
|---------|----------------------------------------------|
| v1.2.0  | Replacing svg images with png using chromium |
| v1.1.0  | Repository refactored + readme updated       |
| v1.0.0  | Initial contribution                         |
@@ -0,0 +1,99 @@
import base64
import logging
import os
import re
import subprocess
import tempfile
from uuid import uuid4

NON_SVG_CONTENT_TYPES = ('image/jpeg', 'image/png', 'image/gif')


# Process img tags, replacing base64 SVG images with PNGs
def process_svg(html):
    # Whitespace after 'base64,' is optional: standard data URIs have none
    pattern = re.compile(r'<img(?P<intermediate>[^>]+?src="data:)(?P<type>[^;>]*?);base64,\s*(?P<base64>[^">]*?)"')
    return re.sub(pattern, replace_img_base64, html)


def replace_img_base64(match):
    entry = match.group(0)
    content_type = match.group('type')
    if content_type in NON_SVG_CONTENT_TYPES:
        return entry  # Skip processing if the content type is explicitly a non-SVG image type
    else:
        # We do not require the 'image/svg+xml' content type because not all systems set it properly
        content_base64 = match.group('base64')
        replaced_content_base64 = replace_svg_with_png(content_base64)
        if replaced_content_base64 == content_base64:
            # For some reason the content wasn't replaced (e.g. it was not an SVG)
            return entry
        else:
            # The replacement is a PNG screenshot, so label it as such
            return f'<img{match.group("intermediate")}image/png;base64, {replaced_content_base64}"'


# Checks that base64 encoded content is an SVG image and replaces it with a PNG screenshot made by Chrome
def replace_svg_with_png(possible_svg_base64_content):
    try:
        svg_content = base64.b64decode(possible_svg_base64_content).decode('utf-8')
    except ValueError:
        # Invalid base64 or non-UTF-8 bytes (binascii.Error and UnicodeDecodeError are ValueErrors): not an SVG
        return possible_svg_base64_content

    # Fast check that this is an SVG
    if '</svg>' not in svg_content:
        return possible_svg_base64_content

    chrome_executable = os.environ.get('CHROME_EXECUTABLE_PATH')
    if not chrome_executable:
        logging.error('CHROME_EXECUTABLE_PATH not set')
        return possible_svg_base64_content

    # Fetch width & height from root svg tag (\s keeps attributes such as stroke-width from matching)
    match = re.search(r'<svg[^>]*?\swidth="(?P<width>[\d.]+)', svg_content)
    if match:
        width = match.group('width')
    else:
        logging.error('Cannot find svg width in ' + svg_content)
        return possible_svg_base64_content

    match = re.search(r'<svg[^>]*?\sheight="(?P<height>[\d.]+)', svg_content)
    if match:
        height = match.group('height')
    else:
        logging.error('Cannot find svg height in ' + svg_content)
        return possible_svg_base64_content

    # Will be used as a name for tmp files
    uuid = str(uuid4())

    temp_folder = tempfile.gettempdir()
    svg_filepath = os.path.join(temp_folder, uuid + '.svg')
    png_filepath = os.path.join(temp_folder, uuid + '.png')

    try:
        # Put svg into tmp file
        with open(svg_filepath, 'w', encoding='utf-8') as f:
            f.write(svg_content)

        # Feed svg file to chrome
        result = subprocess.run([
            chrome_executable,
            '--headless',
            '--no-sandbox',
            '--default-background-color=00000000',
            '--hide-scrollbars',
            f'--screenshot={png_filepath}',
            f'--window-size={width},{height}',
            svg_filepath,
        ])

        # Check the result before reading: on failure the screenshot may not exist
        if result.returncode != 0 or not os.path.exists(png_filepath):
            logging.error('Error converting to png')
            return possible_svg_base64_content

        # Get resulting screenshot content
        with open(png_filepath, 'rb') as img_file:
            img_data = img_file.read()
        return base64.b64encode(img_data).decode('utf-8')
    finally:
        # Remove tmp files, also when the conversion failed
        for path in (svg_filepath, png_filepath):
            if os.path.exists(path):
                os.remove(path)
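
The screenshot size depends on reading `width` and `height` from the root `<svg>` tag. As an alternative sketch (not the approach taken in this commit), the dimensions could be read from the parsed root element with the standard library's `xml.etree.ElementTree`, which only looks at the root's own attributes and tolerates a unit suffix such as `px`. The function name `svg_size` is hypothetical, and percentage values are not treated specially here.

```python
import re
import xml.etree.ElementTree as ET


def svg_size(svg_content):
    """Return (width, height) of the root <svg> element as floats, or None."""
    try:
        root = ET.fromstring(svg_content)
    except ET.ParseError:
        return None
    # The tag is '{namespace}svg' when an xmlns is declared, plain 'svg' otherwise
    if root.tag.rsplit("}", 1)[-1] != "svg":
        return None
    size = []
    for attribute in ("width", "height"):
        # Accept a leading number and ignore whatever follows it (e.g. 'px')
        match = re.match(r"\s*(\d+(?:\.\d+)?)", root.get(attribute, ""))
        if not match:
            return None
        size.append(float(match.group(1)))
    return tuple(size)
```

Returning `None` for anything that is not a sized SVG lets a caller fall back to the original image, in the same spirit as the early returns in `replace_svg_with_png`. For untrusted input, a hardened XML parser would be a safer choice than the plain standard-library one.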
@@ -0,0 +1,58 @@
import logging
import platform
from urllib.parse import unquote

import weasyprint
from flask import Flask, Response, request
from gevent.pywsgi import WSGIServer

import SvgUtils

app = Flask(__name__)


@app.route("/version", methods=["GET"])
def version():
    return {
        "python": platform.python_version(),
        "weasyprint": weasyprint.__version__
    }


@app.route("/convert/html", methods=["POST"])
def convert_html():
    try:
        encoding = request.args.get("encoding", default="utf-8")
        media_type = request.args.get("media_type", default="print")
        file_name = request.args.get("file_name", default="converted-document.pdf")
        # Query parameters arrive as strings; any non-empty string, even "False", is truthy, so parse explicitly
        presentational_hints = request.args.get("presentational_hints", default="false").lower() == "true"

        base_url = request.args.get("base_url", default=None)
        if base_url:
            base_url = unquote(base_url, encoding=encoding)

        html = request.get_data().decode(encoding)
        html = SvgUtils.process_svg(html)
        weasyprint_html = weasyprint.HTML(string=html, base_url=base_url, media_type=media_type, encoding=encoding)
        output_pdf = weasyprint_html.write_pdf(presentational_hints=presentational_hints)

        response = Response(output_pdf, mimetype="application/pdf", status=200)
        response.headers.add("Content-Disposition", "attachment; filename=" + file_name)
        return response

    except AssertionError as e:
        return process_error(e, "Assertion error, check the request body html: " + str(e), 400)
    except (UnicodeDecodeError, LookupError) as e:
        return process_error(e, "Cannot decode request html body: " + str(e), 400)
    except Exception as e:
        return process_error(e, "Unexpected error while converting to PDF: " + str(e), 500)


def process_error(e, err_msg, status):
    logging.exception(e)
    return Response(err_msg, mimetype="plain/text", status=status)


def start_server(port):
    http_server = WSGIServer(("", port), app)
    http_server.serve_forever()
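
The `file_name` query parameter is concatenated into the `Content-Disposition` header as received. One possible hardening, sketched here under the assumption that only plain ASCII filenames need to be supported, is to keep the last path component, strip control characters and double quotes, and emit the name as a quoted string. The helper `content_disposition` is hypothetical and not part of this commit; non-ASCII names would additionally need the RFC 6266 `filename*` form, which this sketch does not cover.

```python
import re


def content_disposition(file_name, fallback="converted-document.pdf"):
    """Build an attachment Content-Disposition value from a user-supplied name."""
    # Keep only the last path component, whichever separator was used
    name = file_name.replace("\\", "/").rsplit("/", 1)[-1]
    # Drop control characters (including CR/LF) and double quotes
    name = re.sub(r'[\x00-\x1f\x7f"]', "", name).strip()
    # Fall back to the documented default when nothing usable remains
    return f'attachment; filename="{name or fallback}"'
```

In the handler this would replace the string concatenation, for example `response.headers.add("Content-Disposition", content_disposition(file_name))`.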
@@ -0,0 +1,15 @@
import argparse
import logging

import WeasyprintController

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Weasyprint service")
    parser.add_argument("--port", default=9080, type=int, required=False, help="Service port")
    args = parser.parse_args()

    logging.getLogger().setLevel(logging.INFO)
    logging.info("Weasyprint service listening on port: " + str(args.port))
    logging.getLogger().setLevel(logging.WARN)

    # start_server() blocks in serve_forever(), so there is no return value to keep
    WeasyprintController.start_server(args.port)
@@ -0,0 +1,5 @@
###### Requirements without Version Specifiers ######
flask
gevent
###### Requirements with Version Specifiers ######
weasyprint==61.2