Use libreoffice to generate pdf file from docx #1202

Merged
merged 29 commits into from
Apr 30, 2024
Changes from 28 commits
0b74a64
POC which generate pdf file from docx locally using libreoffice
kolok Feb 1, 2024
42dde49
update code to use the appimage stuff
etchegom Mar 21, 2024
86030e0
Refacto tasks
etchegom Apr 2, 2024
1d24354
fixing tests
etchegom Apr 4, 2024
410554b
convertapi cleanup
etchegom Apr 4, 2024
ab05f2d
update doc
etchegom Apr 4, 2024
ff647f4
add a command to make conversion locally
etchegom Apr 5, 2024
e8ff1f1
fix tests
etchegom Apr 5, 2024
c759b02
fix merge
kolok Apr 17, 2024
e03b48f
POC which generate pdf file from docx locally using libreoffice
kolok Feb 1, 2024
7a8aad7
update code to use the appimage stuff
etchegom Mar 21, 2024
b573e6c
Refacto tasks
etchegom Apr 2, 2024
ce7beb3
fixing tests
etchegom Apr 4, 2024
5454bff
convertapi cleanup
etchegom Apr 4, 2024
b6ffb1f
update doc
etchegom Apr 4, 2024
fe66d70
add a command to make conversion locally
etchegom Apr 5, 2024
aa43c88
fix tests
etchegom Apr 5, 2024
1727bd7
fix merge
kolok Apr 17, 2024
716723c
Merge branch 'poc_libreoffice' of github.com:MTES-MCT/apilos into poc…
etchegom Apr 23, 2024
42df7a6
fix type of doc -> docxtemplate
kolok Apr 23, 2024
2f27627
Merge branch 'poc_libreoffice' of github.com:MTES-MCT/apilos into poc…
etchegom Apr 23, 2024
e790f68
run ruff
etchegom Apr 23, 2024
625ebf0
new tests
etchegom Apr 23, 2024
d1ca2d1
fix docx templates
etchegom Apr 24, 2024
18abd7d
fix docx templates
etchegom Apr 24, 2024
0ab3a2f
fix docx templates
etchegom Apr 24, 2024
53eb877
fix docx templates
etchegom Apr 24, 2024
de057b9
Merge branch 'main' into poc_libreoffice
kolok Apr 25, 2024
c31b5a4
Merge branch 'main' into poc_libreoffice
kolok Apr 30, 2024
2 changes: 2 additions & 0 deletions .buildpacks
@@ -1,2 +1,4 @@
https://github.com/Scalingo/apt-buildpack
https://github.com/Scalingo/nodejs-buildpack.git
https://github.com/Scalingo/python-buildpack.git
https://github.com/BlueTeaLondon/heroku-buildpack-libreoffice-for-heroku-18.git
6 changes: 3 additions & 3 deletions .env.template
@@ -36,9 +36,6 @@ ALLOWED_HOSTS=local.beta.gouv.fr
# Comment sentry setting to deactivate it
SENTRY_URL=

# Comment convertapi setting to deactivate it
CONVERTAPI_SECRET=<CONVERTAPI_SECRET>

# INSEE API settings
# INSEE_API_KEY=<INSEE_API_KEY>
# INSEE_API_SECRET=<INSEE_API_SECRET>
@@ -71,3 +68,6 @@ TEST_DOT_ENV_FILE=.env.test.local
# CLAMAV_SERVICE_URL=http://localhost:3320
# CLAMAV_SERVICE_USER=app1
# CLAMAV_SERVICE_PASSWORD=letmein

# LibreOffice for MacOS
LIBREOFFICE_EXEC=/Applications/LibreOffice.app/Contents/MacOS/soffice
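
The `LIBREOFFICE_EXEC` value above is the macOS install path, and other hosts will usually need a different value. A minimal sketch of how a settings module might resolve it, assuming a hypothetical helper `resolve_libreoffice_exec` (not part of this PR) that prefers the environment variable and falls back to a `PATH` lookup of the common binary names:

```python
import os
import shutil


def resolve_libreoffice_exec(env=None):
    """Return the LibreOffice executable path, or None if it cannot be found.

    Hypothetical helper: prefers LIBREOFFICE_EXEC, then looks up the usual
    binary names on PATH.
    """
    env = os.environ if env is None else env
    configured = env.get("LIBREOFFICE_EXEC")
    if configured:
        return configured
    for name in ("soffice", "libreoffice"):
        found = shutil.which(name)
        if found:
            return found
    return None


# An explicit setting always wins over the PATH lookup.
macos_path = "/Applications/LibreOffice.app/Contents/MacOS/soffice"
assert resolve_libreoffice_exec({"LIBREOFFICE_EXEC": macos_path}) == macos_path
```

Returning `None` instead of raising lets the caller decide whether a missing LibreOffice install is fatal in a given environment.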
3 changes: 0 additions & 3 deletions .env.test
@@ -17,8 +17,5 @@ ALLOWED_HOSTS=local.beta.gouv.fr
# Comment sentry setting to deactivate it
SENTRY_URL=

# Comment convertapi setting to deactivate it
CONVERTAPI_SECRET=

# Used to test SIAP experience without link to SIAP API
USE_MOCKED_SIAP_CLIENT=False
3 changes: 2 additions & 1 deletion .gitignore
@@ -26,4 +26,5 @@ node_modules/
# celery beat schedule files
celerybeat-schedule.dat
celerybeat-schedule.dir
celerybeat-schedule.bak
celerybeat-schedule.bak
celerybeat-schedule
1 change: 1 addition & 0 deletions .slugignore
@@ -0,0 +1 @@
node_modules
8 changes: 8 additions & 0 deletions Aptfile
@@ -0,0 +1,8 @@
libsm6
libice6
libxinerama1
libdbus-glib-1-2
libharfbuzz0b
libharfbuzz-icu0
libx11-xcb1
libxcb1
2 changes: 1 addition & 1 deletion DEVELOPPEUR.md
@@ -25,7 +25,7 @@ APiLos est un outils permettant de générer un document contractuel de conventi

La génération de document de convention au format .docx est prise en charge par la librairie [python-docx-template](https://docxtpl.readthedocs.io/en/latest/) qui utilise le moteur de template Jinja2 pour modifier le template des documents de conventions APL. Les templates de docuements sont dans le dossier [./documents](./documents/)

Une fois la convention validée par les deux parties, celle-ci est envoyée au format pdf par email au bailleur. Le service ConvertAPI est utilisé pour générer une version pdf du document docx.
Une fois la convention validée par les deux parties, celle-ci est envoyée au format pdf par email au bailleur. L'application [Libreoffice](https://https://fr.libreoffice.org/discover/libreoffice/) est utilisée pour générer une version pdf du document docx.

##### Import de firchiers excel

1 change: 0 additions & 1 deletion README.md
@@ -31,7 +31,6 @@ Les solutions tierces utilisées par APiLos:
- Scalingo : hébergement de la solution (cloud souverain)
- Scaleway : stockage de fichiers compatible avec le protocole S3
- Brevo : envoi de courriels transactionnels
- ConvertAPI : conversion de document de convention au format PDF
- Sentry : monitoring logiciel
- Github : gestion des versions du code source d’ APiLos et chaîne de tests et de mise en production (CI/CD)

61 changes: 61 additions & 0 deletions conventions/management/commands/generate_pdf.py
@@ -0,0 +1,61 @@
import subprocess

from django.core.management.base import BaseCommand

from conventions.models import Convention
from conventions.services.convention_generator import (
    generate_convention_doc,
    get_tmp_local_path,
    run_pdf_convert_cmd,
)


class Command(BaseCommand):
    def add_arguments(self, parser):
        parser.add_argument(
            "--convention-uuid",
            help="Convention UUID",
            required=True,
        )

    def handle(self, *args, **options):
        convention_uuid = options["convention_uuid"]

        try:
            convention = Convention.objects.get(uuid=convention_uuid)
        except Convention.DoesNotExist:
            self.stdout.write(
                self.style.ERROR(
                    f"Convention with UUID {convention_uuid} does not exist"
                )
            )
            return

        local_path = get_tmp_local_path()
        local_docx_path = local_path / f"convention_{convention_uuid}.docx"
        local_pdf_path = local_path / f"convention_{convention_uuid}.pdf"

        doc = generate_convention_doc(convention=convention)
        doc.save(filename=local_docx_path)
        self.stdout.write(self.style.SUCCESS(f"Generated DOCX file: {local_docx_path}"))

        try:
            result = run_pdf_convert_cmd(
                src_docx_path=local_docx_path,
                dst_pdf_path=local_pdf_path,
            )

            if result.returncode != 0:
                self.stdout.write(
                    self.style.ERROR(
                        f"Error while converting DOCX to PDF: {result.stderr}"
                    )
                )
                return

            self.stdout.write(
                self.style.SUCCESS(f"Generated PDF file: {local_pdf_path}")
            )

        except subprocess.CalledProcessError as err:
            self.stdout.write(self.style.ERROR(f"Error: {err}"))
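
A note on the error handling in this command: `run_pdf_convert_cmd` calls `subprocess.run` with `check=True`, so a non-zero exit status raises `subprocess.CalledProcessError` instead of returning a result, and the `returncode != 0` branch is not expected to run. The captured stderr is then available on the exception. A small self-contained sketch of that behaviour, using the Python interpreter as a stand-in for LibreOffice (`run_checked` is a hypothetical name, not something in this PR):

```python
import subprocess
import sys


def run_checked(args):
    """Run a command with the same options as the PR: check=True, capture_output=True."""
    return subprocess.run(args, check=True, capture_output=True)


# Stand-in for a failing conversion: writes to stderr and exits with status 3.
failing_cmd = [
    sys.executable,
    "-c",
    "import sys; sys.stderr.write('conversion failed'); sys.exit(3)",
]

try:
    run_checked(failing_cmd)
    outcome = "returned"
except subprocess.CalledProcessError as err:
    # With check=True the failure surfaces here, not via result.returncode.
    outcome = "raised"
    failure_code = err.returncode
    failure_stderr = err.stderr.decode()
```

If the stderr text should appear in the command's error message, it would have to be read from `err.stderr` in the `except` branch.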
111 changes: 68 additions & 43 deletions conventions/services/convention_generator.py
@@ -3,8 +3,9 @@
import json
import math
import os
import subprocess
from pathlib import Path

import convertapi
import jinja2
from django.conf import settings
from django.core.files.storage import default_storage
@@ -86,20 +87,27 @@ def _compute_total_locaux_collectifs(convention):
)


def get_or_generate_convention_doc(convention: Convention, save_data=False):
def get_or_generate_convention_doc(
convention: Convention, save_data=False
) -> DocxTemplate:
if convention.fichier_override_cerfa and convention.fichier_override_cerfa != "{}":
files_dict = json.loads(convention.fichier_override_cerfa)
files = list(files_dict["files"].values())

if isinstance(files_dict["files"], dict):
files = list(files_dict["files"].values())
else:
files = []

if len(files) > 0:
file_dict = files[0]
uploaded_file = UploadedFile.objects.get(uuid=file_dict["uuid"])
return UploadService().get_file(
uploaded_file.filepath(str(convention.uuid))
)
filepath = uploaded_file.filepath(str(convention.uuid))
return DocxTemplate(default_storage.open(filepath, "rb"))

return generate_convention_doc(convention=convention, save_data=save_data)


def generate_convention_doc(convention: Convention, save_data=False):
def generate_convention_doc(convention: Convention, save_data=False) -> DocxTemplate:
Contributor

It seems better to me for generate_convention_doc to return a doc instance rather than bytes.
It also simplifies things further down, here, when writing to a local file.

Collaborator Author

👍

annexes = (
Annexe.objects.prefetch_related("logement")
.filter(logement__lot_id=convention.lot_id)
@@ -185,9 +193,6 @@ def generate_convention_doc(convention: Convention, save_data=False):
context.update(adresse)

doc.render(context, _get_jinja_env())
file_stream = io.BytesIO()
doc.save(file_stream)
file_stream.seek(0)

for local_path in list(set(local_pathes)):
os.remove(local_path)
@@ -201,7 +206,7 @@ def generate_convention_doc(convention: Convention, save_data=False):
logements_totale,
)

return file_stream
return doc


def typologie_label(typologie: str):
@@ -324,46 +329,66 @@ def _save_convention_donnees_validees(
convention.save()


def generate_pdf(file_stream: io.BytesIO, convention: Convention):
# save the convention docx locally
local_docx_path = str(settings.MEDIA_ROOT) + f"/convention_{convention.uuid}.docx"
def get_tmp_local_path() -> Path:
local_path = Path(settings.MEDIA_ROOT, "tmp")
local_path.mkdir(parents=True, exist_ok=True)
return local_path
Comment on lines +332 to +335
Contributor

Suggestion: write the temporary local files to a "tmp" directory rather than to the root.

Collaborator Author

👍


# get a pdf version
if settings.CONVERTAPI_SECRET:
with open(local_docx_path, "wb") as local_file:
local_file.write(file_stream.read())
local_file.close()

convertapi.api_secret = settings.CONVERTAPI_SECRET
result = convertapi.convert("pdf", {"File": local_docx_path})
class PDFConversionError(Exception):
pass

convention_dirpath = f"conventions/{convention.uuid}/convention_docs"
convention_filename = f"{convention.uuid}.pdf"
pdf_path = _save_io_as_file(
result.file.io, convention_dirpath, convention_filename
)

# remove docx version
os.remove(local_docx_path)
else:
convention_dirpath = f"conventions/{convention.uuid}/convention_docs"
convention_filename = f"{convention.uuid}.docx"
pdf_path = _save_io_as_file(
file_stream, convention_dirpath, convention_filename
)
def run_pdf_convert_cmd(
src_docx_path: Path, dst_pdf_path: Path
) -> subprocess.CompletedProcess:
return subprocess.run(
[
settings.LIBREOFFICE_EXEC,
"--headless",
"--convert-to",
"pdf:writer_pdf_Export",
"--outdir",
dst_pdf_path.parent,
src_docx_path,
],
check=True,
capture_output=True,
)
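
Note that `--outdir` only takes a directory: LibreOffice names the converted file after the source file's stem, so only `dst_pdf_path.parent` is used, and the file name in `dst_pdf_path` matches only because both paths share the `convention_<uuid>` stem. A sketch of that naming rule, with a hypothetical helper `expected_pdf_path` that is not part of this PR:

```python
from pathlib import Path


def expected_pdf_path(src_docx_path: Path, outdir: Path) -> Path:
    """Path where a `--convert-to pdf --outdir <outdir> <src>` run writes its output."""
    return outdir / src_docx_path.with_suffix(".pdf").name


tmp_dir = Path("media") / "tmp"
src = tmp_dir / "convention_1234.docx"
dst = tmp_dir / "convention_1234.pdf"

# The destination used in the PR lines up because the stems are identical.
assert expected_pdf_path(src, dst.parent) == dst
```

If the two stems ever diverged, the later existence check and upload of `local_pdf_path` would look for a file that LibreOffice never wrote.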

file_stream.seek(0)

# END PDF GENERATION
return pdf_path
def generate_pdf(doc: DocxTemplate, convention_uuid: str) -> None:
local_path = get_tmp_local_path()
local_docx_path = local_path / f"convention_{convention_uuid}.docx"
local_pdf_path = local_path / f"convention_{convention_uuid}.pdf"

# Save the convention docx locally
doc.save(filename=local_docx_path)

def _save_io_as_file(file_io, convention_dirpath, convention_filename):
upload_service = UploadService(
convention_dirpath=convention_dirpath, filename=convention_filename
)
upload_service.upload_file_io(file_io)
return f"{convention_dirpath}/{convention_filename}"
# Generate the pdf file from the docx file, and upload it to the storage
try:
result = run_pdf_convert_cmd(
src_docx_path=local_docx_path, dst_pdf_path=local_pdf_path
)
if result.returncode != 0:
raise PDFConversionError(
f"Error while converting the docx file to pdf: {result.stderr}"
)

UploadService(
convention_dirpath=f"conventions/{convention_uuid}/convention_docs",
filename=f"{convention_uuid}.pdf",
).copy_local_file(src_path=local_pdf_path)

except (subprocess.CalledProcessError, OSError) as err:
raise PDFConversionError from err
Comment on lines +383 to +384
Contributor

Note: there are surely other exceptions to catch here.


finally:
# Remove the local files
if local_docx_path.exists():
os.remove(local_docx_path)
if local_pdf_path.exists():
os.remove(local_pdf_path)
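
On the reviewer's note about other exceptions: `FileNotFoundError` (missing executable) and `PermissionError` are already covered by `OSError`, while `subprocess.TimeoutExpired` would only come into play if a `timeout` were passed to `subprocess.run`, which this PR does not do. A sketch under that assumption, where the `timeout` parameter and the `convert_or_raise` helper are hypothetical:

```python
import subprocess
import sys


class PDFConversionError(Exception):
    pass


def convert_or_raise(cmd, timeout=60):
    """Run a conversion command and wrap the expected failures in PDFConversionError."""
    try:
        return subprocess.run(cmd, check=True, capture_output=True, timeout=timeout)
    except subprocess.CalledProcessError as err:
        # Non-zero exit status: include the captured stderr in the message.
        stderr = err.stderr.decode(errors="replace")
        raise PDFConversionError(f"conversion failed: {stderr}") from err
    except subprocess.TimeoutExpired as err:
        # Only reachable because a timeout is passed above (an assumption).
        raise PDFConversionError(f"conversion timed out after {timeout}s") from err
    except OSError as err:
        # Covers FileNotFoundError and PermissionError for the executable.
        raise PDFConversionError(f"could not start converter: {err}") from err


# Usage with the Python interpreter standing in for LibreOffice.
result = convert_or_raise([sys.executable, "-c", "print('ok')"])
```

Keeping one exception type at the boundary means callers such as the Celery task only need to handle `PDFConversionError`.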


def _to_fr_float(value, d=2):
16 changes: 7 additions & 9 deletions conventions/services/recapitulatif.py
@@ -18,7 +18,7 @@
from conventions.services import utils
from conventions.services.conventions import ConventionService
from conventions.services.file import ConventionFileService
from conventions.tasks import generate_and_send
from conventions.tasks import task_generate_and_send
from core.services import EmailService, EmailTemplateID
from programmes.models import Annexe, Programme
from siap.exceptions import SIAPException
@@ -507,14 +507,12 @@ def convention_validate(request: HttpRequest, convention: Convention):
convention.valide_le = datetime.date.today()
convention.save()

generate_and_send.delay(
{
"convention_uuid": str(convention.uuid),
"convention_url": request.build_absolute_uri(
reverse("conventions:preview", args=[convention.uuid])
),
"convention_email_validator": request.user.email,
}
task_generate_and_send.delay(
convention_uuid=str(convention.uuid),
convention_url=request.build_absolute_uri(
reverse("conventions:preview", args=[convention.uuid])
),
convention_email_validator=request.user.email,
)

return {