[Bug]: The compression effect of compress-pdf is not good, and the memory it occupies is not released #2506

Open
1 task done
Vincentauyeung opened this issue Dec 20, 2024 · 2 comments
Labels
needs investigation: Issues that require further investigation

Comments

@Vincentauyeung

Installation Method

Docker

The Problem

In version 0.36.4, compressing a file makes little difference to its size, whereas the same file reached a compression ratio of 10% in version 0.33.1. In addition, the Docker container's memory usage in 0.36.4 is relatively high and is not released after a task completes, which can cause failures when many tasks are run.

Version of Stirling-PDF

0.36.4

Last Working Version of Stirling-PDF

0.33.1

Page Where the Problem Occurred

No response

Docker Configuration

No response

Relevant Log Output

Load average: 0.01 0.01 0.00 2/731 555
  PID  PPID USER     STAT   VSZ %VSZ CPU %CPU COMMAND
   10     7 stirling S    21.7g  87%  10   0% java -Dfile.encoding=UTF-8 -jar /app.jar
  545     0 root     S     2620   0%  14   0% /bin/bash
    7     1 root     S     2320   0%   2   0% {init.sh} /bin/bash /scripts/init.sh java -Dfile.encoding=UTF-8 -jar /app.jar
  555   545 root     R     1628   0%   3   0% top
    1     0 root     S      848   0%   0   0% tini -- /scripts/init.sh java -Dfile.encoding=UTF-8 -jar /app.jar
fe0157f9c2ab:/#

08:06:43.306 [qtp1281580764-83] INFO  s.s.SPDF.utils.ProcessExecutor - Running command: qpdf --normalize-content=y --optimize-images --recompress-flate --compression-level=5 --compress-streams=y --object-streams=generate /tmp/input_14486430207505780250.pdf /tmp/output_7797567749138520850.pdf
08:15:42.426 [qtp1281580764-119] INFO  s.s.SPDF.utils.ProcessExecutor - Running command: qpdf --normalize-content=y --optimize-images --recompress-flate --compression-level=5 --compress-streams=y --object-streams=generate /tmp/input_6075460368366413101.pdf /tmp/output_2754503219077468122.pdf
08:25:39.981 [qtp1281580764-85] INFO  s.s.SPDF.utils.ProcessExecutor - Running command: qpdf --normalize-content=y --optimize-images --recompress-flate --compression-level=5 --compress-streams=y --object-streams=generate /tmp/input_7063201442132836996.pdf /tmp/output_17154865287089274309.pdf
08:25:58.112 [qtp1281580764-85] WARN  s.s.S.c.api.misc.CompressController - Optimized file is larger than the original. Returning the original file instead.
08:27:39.299 [qtp1281580764-67] WARN  o.apache.pdfbox.pdmodel.PDDocument - You are overwriting the existing file input_2216131900555302994.pdf, this will produce a corrupted file if you're also reading from it
08:27:39.878 [qtp1281580764-67] INFO  s.s.SPDF.utils.ProcessExecutor - Running command: qpdf --normalize-content=y --optimize-images --recompress-flate --compression-level=8 --compress-streams=y --object-streams=generate /tmp/input_2216131900555302994.pdf /tmp/output_6414094566086155025.pdf
08:27:39.993 [Thread-75] INFO  s.s.SPDF.utils.ProcessExecutor - WARNING: /tmp/input_2216131900555302994.pdf: reported number of objects (4191) is not one plus the highest object number (4189)
08:27:57.431 [Thread-75] INFO  s.s.SPDF.utils.ProcessExecutor - qpdf: operation succeeded with warnings; resulting file may have some problems
08:27:57.577 [qtp1281580764-67] WARN  s.s.S.c.api.misc.CompressController - Optimized file is larger than the original. Returning the original file instead.
08:52:43.399 [qtp1281580764-83] WARN  o.apache.pdfbox.pdmodel.PDDocument - You are overwriting the existing file input_17373107469744773985.pdf, this will produce a corrupted file if you're also reading from it
08:52:43.779 [qtp1281580764-83] INFO  s.s.SPDF.utils.ProcessExecutor - Running command: qpdf --normalize-content=y --optimize-images --recompress-flate --compression-level=8 --compress-streams=y --object-streams=generate /tmp/input_17373107469744773985.pdf /tmp/output_1443993403229931841.pdf
08:52:43.797 [Thread-88] INFO  s.s.SPDF.utils.ProcessExecutor - WARNING: /tmp/input_17373107469744773985.pdf: reported number of objects (4191) is not one plus the highest object number (4189)
08:53:00.789 [Thread-88] INFO  s.s.SPDF.utils.ProcessExecutor - qpdf: operation succeeded with warnings; resulting file may have some problems
08:53:01.013 [qtp1281580764-83] WARN  s.s.S.c.api.misc.CompressController - Optimized file is larger than the original. Returning the original file instead.
Copying original files without overwriting existing files
Running Stirling PDF with DOCKER_ENABLE_SECURITY=false and VERSION_TAG=0.36.4
Setting permissions and ownership for necessary directories...
Picked up JAVA_TOOL_OPTIONS:  -XX:MaxRAMPercentage=75
 ____ _____ ___ ____  _     ___ _   _  ____       ____  ____  _____
/ ___|_   _|_ _|  _ \| |   |_ _| \ | |/ ___|     |  _ \|  _ \|  ___|
\___ \ | |  | || |_) | |    | ||  \| | |  _ _____| |_) | | | | |_
 ___) || |  | ||  _ <| |___ | || |\  | |_| |_____|  __/| |_| |  _|
|____/ |_| |___|_| \_\_____|___|_| \_|\____|     |_|   |____/|_|
Powered by Spring Boot 3.4.0
09:04:10.919 [main] INFO  s.software.SPDF.SPdfApplication - Starting SPdfApplication v0.36.4 using Java 21.0.5 with PID 10 (/app.jar started by stirlingpdfuser in /)
09:04:10.923 [main] INFO  s.software.SPDF.SPdfApplication - The following 1 profile is active: "default"
09:04:17.398 [main] INFO  s.software.SPDF.SPdfApplication - Running configs ApplicationProperties(legal=ApplicationProperties.Legal(termsAndConditions=https://www.stirlingpdf.com/terms-and-conditions, privacyPolicy=https://www.stirlingpdf.com/privacy-policy, accessibilityStatement=, cookiePolicy=, impressum=), security=ApplicationProperties.Security(enableLogin=false, csrfDisabled=true, initialLogin=ApplicationProperties.Security.InitialLogin(username=), oauth2=ApplicationProperties.Security.OAUTH2(enabled=false, issuer=, clientId=, autoCreateUser=false, blockRegistration=false, useAsUsername=email, scopes=[openid, profile, email], provider=google, client=ApplicationProperties.Security.OAUTH2.Client(google=Google [clientId=, clientSecret=NULL, scopes=[https://www.googleapis.com/auth/userinfo.email, https://www.googleapis.com/auth/userinfo.profile], useAsUsername=email], github=GitHub [clientId=, clientSecret=NULL, scopes=[read:user], useAsUsername=login], keycloak=Keycloak [issuer=, clientId=, clientSecret=NULL, scopes=[openid, profile, email], useAsUsername=preferred_username])), saml2=stirling.software.SPDF.model.ApplicationProperties$Security$SAML2@5fdb7394, loginAttemptCount=5, loginResetTimeMinutes=120, loginMethod=all, customGlobalAPIKey=null), system=ApplicationProperties.System(defaultLocale=zh_CN, googlevisibility=false, showUpdate=false, showUpdateOnlyAdmin=false, customHTMLFiles=false, tessdataDir=/usr/share/tessdata, enableAlphaFunctionality=false, enableAnalytics=false), ui=ApplicationProperties.Ui(appName=H-PDF, homeDescription=null, appNameNavbar=null), endpoints=ApplicationProperties.Endpoints(toRemove=[pipeline], groupsToRemove=[]), metrics=ApplicationProperties.Metrics(enabled=true), automaticallyGenerated=ApplicationProperties.AutomaticallyGenerated(UUID=831afeab-f32b-4994-87ba-93e0aa920a0f, appVersion=0.36.4), enterpriseEdition=ApplicationProperties.EnterpriseEdition(enabled=false, maxUsers=0, 
customMetadata=ApplicationProperties.EnterpriseEdition.CustomMetadata(autoUpdateMetadata=false, author=username, creator=Stirling-PDF, producer=Stirling-PDF)), autoPipeline=ApplicationProperties.AutoPipeline(outputFolder=null), processExecutor=ApplicationProperties.ProcessExecutor(sessionLimit=ApplicationProperties.ProcessExecutor.SessionLimit(libreOfficeSessionLimit=1, pdfToHtmlSessionLimit=1, pythonOpenCvSessionLimit=8, weasyPrintSessionLimit=16, installAppSessionLimit=1, calibreSessionLimit=1, qpdfSessionLimit=4, tesseractSessionLimit=1), timeoutMinutes=ApplicationProperties.ProcessExecutor.TimeoutMinutes(libreOfficeTimeoutMinutes=30, pdfToHtmlTimeoutMinutes=20, pythonOpenCvTimeoutMinutes=30, weasyPrintTimeoutMinutes=30, installAppTimeoutMinutes=60, calibreTimeoutMinutes=30, tesseractTimeoutMinutes=30, qpdfTimeoutMinutes=30)))
09:04:23.866 [main] INFO  s.s.S.config.EndpointConfiguration - Total disabled endpoints: 4. Disabled endpoints: book-to-pdf, pdf-to-book, pdf-to-pdfa, pipeline
Invalid APP_LOCALE environment variable value. Falling back to default Locale.UK.
09:04:28.990 [main] INFO  s.software.SPDF.SPdfApplication - Started SPdfApplication in 20.543 seconds (process running for 24.775)
09:04:29.014 [scheduling-1] WARN  s.software.SPDF.utils.FileMonitor - not monitoring any directory, even the root directory itself: ./pipeline/watchedFolders
09:04:29.016 [scheduling-1] INFO  s.software.SPDF.utils.FileMonitor - Registered directory: ./pipeline/watchedFolders
09:04:29.020 [main] INFO  s.software.SPDF.SPdfApplication - Stirling-PDF Started.
09:04:29.020 [main] INFO  s.software.SPDF.SPdfApplication - Navigate to http://localhost:8080
09:17:02.012 [qtp1402479907-51] INFO  s.s.SPDF.utils.ProcessExecutor - Running command: qpdf --normalize-content=y --optimize-images --recompress-flate --compression-level=5 --compress-streams=y --object-streams=generate /tmp/input_15453331872551591584.pdf /tmp/output_10424284958091912330.pdf
09:18:23.687 [qtp1402479907-75] WARN  o.e.jetty.io.AbstractConnection - Failed callback
java.io.IOException: Insufficient content written 1079345152 < 1153490315
        at org.eclipse.jetty.ee10.servlet.ServletChannel.sendErrorOrAbort(ServletChannel.java:615)
        at org.eclipse.jetty.ee10.servlet.ServletChannel.handle(ServletChannel.java:570)
        at org.eclipse.jetty.ee10.servlet.ServletHandler.handle(ServletHandler.java:464)
        at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:575)
        at org.eclipse.jetty.ee10.servlet.SessionHandler.handle(SessionHandler.java:717)
        at org.eclipse.jetty.server.handler.ContextHandler.handle(ContextHandler.java:1060)
        at org.eclipse.jetty.server.Handler$Wrapper.handle(Handler.java:740)
        at org.eclipse.jetty.server.handler.EventsHandler.handle(EventsHandler.java:81)
        at org.eclipse.jetty.server.Server.handle(Server.java:182)
        at org.eclipse.jetty.server.internal.HttpChannelState$HandlerInvoker.run(HttpChannelState.java:662)
        at org.eclipse.jetty.server.internal.HttpConnection.onFillable(HttpConnection.java:418)
        at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:322)
        at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:99)
        at org.eclipse.jetty.io.SelectableChannelEndPoint$1.run(SelectableChannelEndPoint.java:53)
        at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.runTask(AdaptiveExecutionStrategy.java:478)
        at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.consumeTask(AdaptiveExecutionStrategy.java:441)
        at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.tryProduce(AdaptiveExecutionStrategy.java:293)
        at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.run(AdaptiveExecutionStrategy.java:201)
        at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:311)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:979)
        at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.doRunJob(QueuedThreadPool.java:1209)
        at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1164)
        at java.base/java.lang.Thread.run(Thread.java:1583)
09:34:09.516 [qtp1402479907-84] INFO  s.s.SPDF.utils.ProcessExecutor - Running command: qpdf --normalize-content=y --optimize-images --recompress-flate --compression-level=5 --compress-streams=y --object-streams=generate /tmp/input_11759335196431308915.pdf /tmp/output_2497789041052486310.pdf
09:38:04.025 [qtp1402479907-84] WARN  s.s.S.c.api.misc.CompressController - Optimized file is larger than the original. Returning the original file instead.
09:51:18.831 [qtp1402479907-72] WARN  o.apache.pdfbox.pdmodel.PDDocument - You are overwriting the existing file input_9929772336105117985.pdf, this will produce a corrupted file if you're also reading from it
09:51:55.983 [qtp1402479907-72] INFO  s.s.SPDF.utils.ProcessExecutor - Running command: qpdf --normalize-content=y --optimize-images --recompress-flate --compression-level=9 --compress-streams=y --object-streams=generate /tmp/input_9929772336105117985.pdf /tmp/output_3871698498268098119.pdf
09:51:56.155 [Thread-46] INFO  s.s.SPDF.utils.ProcessExecutor - WARNING: /tmp/input_9929772336105117985.pdf: reported number of objects (158488) is not one plus the highest object number (158486)
09:59:43.868 [Thread-46] INFO  s.s.SPDF.utils.ProcessExecutor - qpdf: operation succeeded with warnings; resulting file may have some problems
09:59:47.817 [qtp1402479907-72] WARN  s.s.S.c.api.misc.CompressController - Optimized file is larger than the original. Returning the original file instead.
Copying original files without overwriting existing files
Running Stirling PDF with DOCKER_ENABLE_SECURITY=false and VERSION_TAG=0.36.4
Setting permissions and ownership for necessary directories...
Picked up JAVA_TOOL_OPTIONS:  -XX:MaxRAMPercentage=75
 ____ _____ ___ ____  _     ___ _   _  ____       ____  ____  _____
/ ___|_   _|_ _|  _ \| |   |_ _| \ | |/ ___|     |  _ \|  _ \|  ___|
\___ \ | |  | || |_) | |    | ||  \| | |  _ _____| |_) | | | | |_
 ___) || |  | ||  _ <| |___ | || |\  | |_| |_____|  __/| |_| |  _|
|____/ |_| |___|_| \_\_____|___|_| \_|\____|     |_|   |____/|_|
Powered by Spring Boot 3.4.0
01:03:59.670 [main] INFO  s.software.SPDF.SPdfApplication - Starting SPdfApplication v0.36.4 using Java 21.0.5 with PID 10 (/app.jar started by stirlingpdfuser in /)
01:03:59.705 [main] INFO  s.software.SPDF.SPdfApplication - The following 1 profile is active: "default"
01:04:05.640 [main] INFO  s.software.SPDF.SPdfApplication - Running configs ApplicationProperties(legal=ApplicationProperties.Legal(termsAndConditions=https://www.stirlingpdf.com/terms-and-conditions, privacyPolicy=https://www.stirlingpdf.com/privacy-policy, accessibilityStatement=, cookiePolicy=, impressum=), security=ApplicationProperties.Security(enableLogin=false, csrfDisabled=true, initialLogin=ApplicationProperties.Security.InitialLogin(username=), oauth2=ApplicationProperties.Security.OAUTH2(enabled=false, issuer=, clientId=, autoCreateUser=false, blockRegistration=false, useAsUsername=email, scopes=[openid, profile, email], provider=google, client=ApplicationProperties.Security.OAUTH2.Client(google=Google [clientId=, clientSecret=NULL, scopes=[https://www.googleapis.com/auth/userinfo.email, https://www.googleapis.com/auth/userinfo.profile], useAsUsername=email], github=GitHub [clientId=, clientSecret=NULL, scopes=[read:user], useAsUsername=login], keycloak=Keycloak [issuer=, clientId=, clientSecret=NULL, scopes=[openid, profile, email], useAsUsername=preferred_username])), saml2=stirling.software.SPDF.model.ApplicationProperties$Security$SAML2@6248cfab, loginAttemptCount=5, loginResetTimeMinutes=120, loginMethod=all, customGlobalAPIKey=null), system=ApplicationProperties.System(defaultLocale=zh_CN, googlevisibility=false, showUpdate=false, showUpdateOnlyAdmin=false, customHTMLFiles=false, tessdataDir=/usr/share/tessdata, enableAlphaFunctionality=false, enableAnalytics=false), ui=ApplicationProperties.Ui(appName=H-PDF, homeDescription=null, appNameNavbar=null), endpoints=ApplicationProperties.Endpoints(toRemove=[pipeline], groupsToRemove=[]), metrics=ApplicationProperties.Metrics(enabled=true), automaticallyGenerated=ApplicationProperties.AutomaticallyGenerated(UUID=831afeab-f32b-4994-87ba-93e0aa920a0f, appVersion=0.36.4), enterpriseEdition=ApplicationProperties.EnterpriseEdition(enabled=false, maxUsers=0, 
customMetadata=ApplicationProperties.EnterpriseEdition.CustomMetadata(autoUpdateMetadata=false, author=username, creator=Stirling-PDF, producer=Stirling-PDF)), autoPipeline=ApplicationProperties.AutoPipeline(outputFolder=null), processExecutor=ApplicationProperties.ProcessExecutor(sessionLimit=ApplicationProperties.ProcessExecutor.SessionLimit(libreOfficeSessionLimit=1, pdfToHtmlSessionLimit=1, pythonOpenCvSessionLimit=8, weasyPrintSessionLimit=16, installAppSessionLimit=1, calibreSessionLimit=1, qpdfSessionLimit=4, tesseractSessionLimit=1), timeoutMinutes=ApplicationProperties.ProcessExecutor.TimeoutMinutes(libreOfficeTimeoutMinutes=30, pdfToHtmlTimeoutMinutes=20, pythonOpenCvTimeoutMinutes=30, weasyPrintTimeoutMinutes=30, installAppTimeoutMinutes=60, calibreTimeoutMinutes=30, tesseractTimeoutMinutes=30, qpdfTimeoutMinutes=30)))
01:04:11.803 [main] INFO  s.s.S.config.EndpointConfiguration - Total disabled endpoints: 4. Disabled endpoints: book-to-pdf, pdf-to-book, pdf-to-pdfa, pipeline
Invalid APP_LOCALE environment variable value. Falling back to default Locale.UK.
01:04:16.773 [main] INFO  s.software.SPDF.SPdfApplication - Started SPdfApplication in 19.413 seconds (process running for 22.637)
01:04:16.804 [scheduling-1] WARN  s.software.SPDF.utils.FileMonitor - not monitoring any directory, even the root directory itself: ./pipeline/watchedFolders
01:04:16.806 [scheduling-1] INFO  s.software.SPDF.utils.FileMonitor - Registered directory: ./pipeline/watchedFolders
01:04:16.813 [main] INFO  s.software.SPDF.SPdfApplication - Stirling-PDF Started.
01:04:16.814 [main] INFO  s.software.SPDF.SPdfApplication - Navigate to http://localhost:8080
Copying original files without overwriting existing files
Running Stirling PDF with DOCKER_ENABLE_SECURITY=false and VERSION_TAG=0.36.4
Setting permissions and ownership for necessary directories...
Picked up JAVA_TOOL_OPTIONS:  -XX:MaxRAMPercentage=75
 ____ _____ ___ ____  _     ___ _   _  ____       ____  ____  _____
/ ___|_   _|_ _|  _ \| |   |_ _| \ | |/ ___|     |  _ \|  _ \|  ___|
\___ \ | |  | || |_) | |    | ||  \| | |  _ _____| |_) | | | | |_
 ___) || |  | ||  _ <| |___ | || |\  | |_| |_____|  __/| |_| |  _|
|____/ |_| |___|_| \_\_____|___|_| \_|\____|     |_|   |____/|_|
Powered by Spring Boot 3.4.0
01:13:27.117 [main] INFO  s.software.SPDF.SPdfApplication - Starting SPdfApplication v0.36.4 using Java 21.0.5 with PID 10 (/app.jar started by stirlingpdfuser in /)
01:13:27.122 [main] INFO  s.software.SPDF.SPdfApplication - The following 1 profile is active: "default"
01:13:32.105 [main] INFO  s.software.SPDF.SPdfApplication - Running configs ApplicationProperties(legal=ApplicationProperties.Legal(termsAndConditions=https://www.stirlingpdf.com/terms-and-conditions, privacyPolicy=https://www.stirlingpdf.com/privacy-policy, accessibilityStatement=, cookiePolicy=, impressum=), security=ApplicationProperties.Security(enableLogin=false, csrfDisabled=true, initialLogin=ApplicationProperties.Security.InitialLogin(username=), oauth2=ApplicationProperties.Security.OAUTH2(enabled=false, issuer=, clientId=, autoCreateUser=false, blockRegistration=false, useAsUsername=email, scopes=[openid, profile, email], provider=google, client=ApplicationProperties.Security.OAUTH2.Client(google=Google [clientId=, clientSecret=NULL, scopes=[https://www.googleapis.com/auth/userinfo.email, https://www.googleapis.com/auth/userinfo.profile], useAsUsername=email], github=GitHub [clientId=, clientSecret=NULL, scopes=[read:user], useAsUsername=login], keycloak=Keycloak [issuer=, clientId=, clientSecret=NULL, scopes=[openid, profile, email], useAsUsername=preferred_username])), saml2=stirling.software.SPDF.model.ApplicationProperties$Security$SAML2@5fdb7394, loginAttemptCount=5, loginResetTimeMinutes=120, loginMethod=all, customGlobalAPIKey=null), system=ApplicationProperties.System(defaultLocale=zh_CN, googlevisibility=false, showUpdate=false, showUpdateOnlyAdmin=false, customHTMLFiles=false, tessdataDir=/usr/share/tessdata, enableAlphaFunctionality=false, enableAnalytics=false), ui=ApplicationProperties.Ui(appName=H-PDF, homeDescription=null, appNameNavbar=null), endpoints=ApplicationProperties.Endpoints(toRemove=[pipeline], groupsToRemove=[]), metrics=ApplicationProperties.Metrics(enabled=true), automaticallyGenerated=ApplicationProperties.AutomaticallyGenerated(UUID=831afeab-f32b-4994-87ba-93e0aa920a0f, appVersion=0.36.4), enterpriseEdition=ApplicationProperties.EnterpriseEdition(enabled=false, maxUsers=0, 
customMetadata=ApplicationProperties.EnterpriseEdition.CustomMetadata(autoUpdateMetadata=false, author=username, creator=Stirling-PDF, producer=Stirling-PDF)), autoPipeline=ApplicationProperties.AutoPipeline(outputFolder=null), processExecutor=ApplicationProperties.ProcessExecutor(sessionLimit=ApplicationProperties.ProcessExecutor.SessionLimit(libreOfficeSessionLimit=1, pdfToHtmlSessionLimit=1, pythonOpenCvSessionLimit=8, weasyPrintSessionLimit=16, installAppSessionLimit=1, calibreSessionLimit=1, qpdfSessionLimit=4, tesseractSessionLimit=1), timeoutMinutes=ApplicationProperties.ProcessExecutor.TimeoutMinutes(libreOfficeTimeoutMinutes=30, pdfToHtmlTimeoutMinutes=20, pythonOpenCvTimeoutMinutes=30, weasyPrintTimeoutMinutes=30, installAppTimeoutMinutes=60, calibreTimeoutMinutes=30, tesseractTimeoutMinutes=30, qpdfTimeoutMinutes=30)))
01:13:32.679 [main] INFO  s.s.S.config.EndpointConfiguration - Total disabled endpoints: 4. Disabled endpoints: book-to-pdf, pdf-to-book, pdf-to-pdfa, pipeline
Invalid APP_LOCALE environment variable value. Falling back to default Locale.UK.
01:13:36.460 [main] INFO  s.software.SPDF.SPdfApplication - Started SPdfApplication in 10.903 seconds (process running for 12.847)
01:13:36.467 [scheduling-1] WARN  s.software.SPDF.utils.FileMonitor - not monitoring any directory, even the root directory itself: ./pipeline/watchedFolders
01:13:36.471 [scheduling-1] INFO  s.software.SPDF.utils.FileMonitor - Registered directory: ./pipeline/watchedFolders
01:13:36.481 [main] INFO  s.software.SPDF.SPdfApplication - Stirling-PDF Started.
01:13:36.482 [main] INFO  s.software.SPDF.SPdfApplication - Navigate to http://localhost:8080
02:18:19.248 [qtp1402479907-49] WARN  o.apache.pdfbox.pdmodel.PDDocument - You are overwriting the existing file input_1709220522492119263.pdf, this will produce a corrupted file if you're also reading from it
02:18:30.831 [qtp1402479907-49] INFO  s.s.SPDF.utils.ProcessExecutor - Running command: qpdf --normalize-content=y --optimize-images --recompress-flate --compression-level=8 --compress-streams=y --object-streams=generate /tmp/input_1709220522492119263.pdf /tmp/output_8417684886762377398.pdf
02:18:30.919 [Thread-29] INFO  s.s.SPDF.utils.ProcessExecutor - WARNING: /tmp/input_1709220522492119263.pdf: reported number of objects (4826) is not one plus the highest object number (4824)
02:18:46.578 [Thread-29] INFO  s.s.SPDF.utils.ProcessExecutor - qpdf: operation succeeded with warnings; resulting file may have some problems
03:12:35.565 [qtp1402479907-83] WARN  o.apache.pdfbox.pdmodel.PDDocument - You are overwriting the existing file input_18322339359892560310.pdf, this will produce a corrupted file if you're also reading from it
03:12:42.620 [qtp1402479907-83] INFO  s.s.SPDF.utils.ProcessExecutor - Running command: qpdf --normalize-content=y --optimize-images --recompress-flate --compression-level=9 --compress-streams=y --object-streams=generate /tmp/input_18322339359892560310.pdf /tmp/output_475909062257444646.pdf
03:12:43.179 [Thread-35] INFO  s.s.SPDF.utils.ProcessExecutor - WARNING: /tmp/input_18322339359892560310.pdf: reported number of objects (4826) is not one plus the highest object number (4824)
03:12:52.797 [Thread-35] INFO  s.s.SPDF.utils.ProcessExecutor - qpdf: operation succeeded with warnings; resulting file may have some problems
03:12:53.032 [qtp1402479907-83] INFO  s.s.S.c.api.misc.CompressController - Current compression ratio: 5.29
03:25:00.433 [qtp1402479907-83] WARN  o.apache.pdfbox.pdmodel.PDDocument - You are overwriting the existing file input_18322339359892560310.pdf, this will produce a corrupted file if you're also reading from it
03:25:02.694 [qtp1402479907-83] INFO  s.s.SPDF.utils.ProcessExecutor - Running command: qpdf --normalize-content=y --optimize-images --recompress-flate --compression-level=9 --compress-streams=y --object-streams=generate /tmp/input_18322339359892560310.pdf /tmp/output_475909062257444646.pdf
03:25:02.965 [Thread-37] INFO  s.s.SPDF.utils.ProcessExecutor - WARNING: /tmp/input_18322339359892560310.pdf: reported number of objects (5637) is not one plus the highest object number (5635)
03:25:06.790 [Thread-37] INFO  s.s.SPDF.utils.ProcessExecutor - qpdf: operation succeeded with warnings; resulting file may have some problems
03:25:06.834 [qtp1402479907-83] INFO  s.s.S.c.api.misc.CompressController - Current compression ratio: 1.40
03:28:21.709 [qtp1402479907-83] WARN  o.apache.pdfbox.pdmodel.PDDocument - You are overwriting the existing file input_18322339359892560310.pdf, this will produce a corrupted file if you're also reading from it
03:28:22.324 [qtp1402479907-83] INFO  s.s.SPDF.utils.ProcessExecutor - Running command: qpdf --normalize-content=y --optimize-images --recompress-flate --compression-level=9 --compress-streams=y --object-streams=generate /tmp/input_18322339359892560310.pdf /tmp/output_475909062257444646.pdf
03:28:22.371 [Thread-39] INFO  s.s.SPDF.utils.ProcessExecutor - WARNING: /tmp/input_18322339359892560310.pdf: reported number of objects (6448) is not one plus the highest object number (6446)
03:28:23.985 [Thread-39] INFO  s.s.SPDF.utils.ProcessExecutor - qpdf: operation succeeded with warnings; resulting file may have some problems
03:50:36.465 [qtp1402479907-54] INFO  s.s.SPDF.utils.ProcessExecutor - Running command: qpdf --normalize-content=y --optimize-images --recompress-flate --compression-level=2 --compress-streams=y --object-streams=generate /tmp/input_3818756845783608715.pdf /tmp/output_11630587244097675436.pdf
03:56:58.371 [qtp1402479907-66] INFO  s.s.SPDF.utils.ProcessExecutor - Running command: qpdf --normalize-content=y --optimize-images --recompress-flate --compression-level=2 --compress-streams=y --object-streams=generate /tmp/input_9519703529233364227.pdf /tmp/output_11004865621880217967.pdf
03:58:35.268 [qtp1402479907-70] WARN  o.e.jetty.io.AbstractConnection - Failed callback
java.io.IOException: Insufficient content written 1079803904 < 1153490315
        at org.eclipse.jetty.ee10.servlet.ServletChannel.sendErrorOrAbort(ServletChannel.java:615)
        at org.eclipse.jetty.ee10.servlet.ServletChannel.handle(ServletChannel.java:570)
        at org.eclipse.jetty.ee10.servlet.ServletHandler.handle(ServletHandler.java:464)
        at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:575)
        at org.eclipse.jetty.ee10.servlet.SessionHandler.handle(SessionHandler.java:717)
        at org.eclipse.jetty.server.handler.ContextHandler.handle(ContextHandler.java:1060)
        at org.eclipse.jetty.server.Handler$Wrapper.handle(Handler.java:740)
        at org.eclipse.jetty.server.handler.EventsHandler.handle(EventsHandler.java:81)
        at org.eclipse.jetty.server.Server.handle(Server.java:182)
        at org.eclipse.jetty.server.internal.HttpChannelState$HandlerInvoker.run(HttpChannelState.java:662)
        at org.eclipse.jetty.server.internal.HttpConnection.onFillable(HttpConnection.java:418)
        at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:322)
        at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:99)
        at org.eclipse.jetty.io.SelectableChannelEndPoint$1.run(SelectableChannelEndPoint.java:53)
        at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.runTask(AdaptiveExecutionStrategy.java:478)
        at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.consumeTask(AdaptiveExecutionStrategy.java:441)
        at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.tryProduce(AdaptiveExecutionStrategy.java:293)
        at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.run(AdaptiveExecutionStrategy.java:201)
        at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:311)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:979)
        at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.doRunJob(QueuedThreadPool.java:1209)
        at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1164)
        at java.base/java.lang.Thread.run(Thread.java:1583)

Additional Information

No response

Browsers Affected

No response

No Duplicate of the Issue

  • I have verified that there are no existing issues raised related to my problem.
dosubot bot added the "needs investigation" label (Issues that require further investigation) on Dec 20, 2024

dosubot bot commented Dec 20, 2024

@Vincentauyeung Hello there! I'm here to help you with any issues, answer questions, and guide you on how to become a contributor. If you need assistance, feel free to ask!

There is a similar open issue regarding high RAM usage in Stirling-PDF. The suggested solution involves configuring memory limits in Docker. For instance, setting a memory limit of 1GB in the docker-compose configuration can help manage RAM usage. Additionally, increasing the memory limit from 2GB to 4GB was suggested as a solution in a related issue [1].
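As a rough illustration of how a container memory limit interacts with the `-XX:MaxRAMPercentage=75` setting that appears in the startup log, the arithmetic can be sketched as below. The function name and the use of GiB units are assumptions made for the example; the JVM's own rounding may differ slightly, and the figure covers only the Java heap, not metaspace, native buffers, or the separate `qpdf` process that the log shows being launched.

```python
GIB = 1024 ** 3


def max_heap_bytes(container_limit_bytes: int, max_ram_percentage: float = 75.0) -> int:
    """Approximate heap ceiling for a given container memory limit.

    Hypothetical helper: models MaxRAMPercentage as a simple percentage
    of the container limit, which is an approximation of JVM behaviour.
    """
    return int(container_limit_bytes * max_ram_percentage / 100)


# Limits mentioned in this thread: 1 GB, 2 GB and 4 GB (treated as GiB here).
for limit_gib in (1, 2, 4):
    heap = max_heap_bytes(limit_gib * GIB)
    print(f"{limit_gib} GiB limit -> about {heap / GIB:.2f} GiB max heap")
```

Under these assumptions, a tighter container limit lowers the heap ceiling proportionally, so the limit and the percentage are worth choosing together.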

Regarding the compression process, the qpdf command is used to optimize PDF files by applying various compression techniques, such as normalizing content, optimizing images, and recompressing Flate streams. The command also adjusts the compression level dynamically based on the desired output size [2]. If the optimized file is larger than the original, Stirling-PDF returns the original file instead [2].
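The behaviour described above can be sketched roughly as follows. This is not Stirling-PDF's actual code: the function names, the `run` callback, the level sequence `(5, 8, 9)` (taken from levels that happen to appear in the log), and the handling of equal sizes are all assumptions for illustration. Only the `qpdf` flags are copied from the logged command line.

```python
from typing import Callable, List, Optional, Sequence, Tuple


def build_qpdf_command(level: int, src: str, dst: str) -> List[str]:
    """Mirror the qpdf invocation that appears in the log output."""
    return [
        "qpdf",
        "--normalize-content=y",
        "--optimize-images",
        "--recompress-flate",
        f"--compression-level={level}",
        "--compress-streams=y",
        "--object-streams=generate",
        src,
        dst,
    ]


def pick_result(original_size: int, optimized_size: int) -> str:
    """Serve the optimized file only if it is strictly smaller (assumed tie rule)."""
    return "optimized" if optimized_size < original_size else "original"


def compress_to_target(
    original_size: int,
    target_size: int,
    run: Callable[[int], int],
    levels: Sequence[int] = (5, 8, 9),
) -> Tuple[Optional[int], int]:
    """Try each level in turn; run(level) is a hypothetical callback
    that performs the compression and returns the output size in bytes.

    Returns (best_level, best_size); best_level is None when no attempt
    beat the original, matching the "return the original file" fallback.
    """
    best_level: Optional[int] = None
    best_size = original_size
    for level in levels:
        size = run(level)
        if size < best_size:
            best_level, best_size = level, size
        if best_size <= target_size:
            break  # target reached, no need to try higher levels
    return best_level, best_size
```

In a real run, `run` would execute the list from `build_qpdf_command` (for example via `subprocess.run`) and then stat the output file. Keeping it as a callback makes the retry logic easy to exercise without `qpdf` installed.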

For the memory not being released after tasks, you might want to ensure that temporary files are properly cleaned up. Stirling-PDF handles temporary files by deleting them after the process completes, which should help in managing memory usage [3][4].
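When checking whether memory is really held after a task, note that the `VSZ` column in the `top` output above is virtual address space, which is usually much larger than the memory actually resident in RAM. A minimal sketch for comparing the two, assuming a Linux container where `/proc/<pid>/status` is readable, is shown below. The helper name is hypothetical and the sample values are made up for illustration, not taken from this issue.

```python
import re
from typing import Dict


def parse_proc_status(text: str) -> Dict[str, int]:
    """Extract VmSize (virtual) and VmRSS (resident), in kB, from /proc/<pid>/status text."""
    fields: Dict[str, int] = {}
    for line in text.splitlines():
        match = re.match(r"^(VmSize|VmRSS):\s+(\d+)\s+kB", line)
        if match:
            fields[match.group(1)] = int(match.group(2))
    return fields


# Illustrative sample only; real values come from the running container.
sample = "Name:\tjava\nVmSize:\t22754099 kB\nVmRSS:\t 1835008 kB\n"
usage = parse_proc_status(sample)
print(f"virtual: {usage['VmSize']} kB, resident: {usage['VmRSS']} kB")
```

Inside the container, the text could be read with `open("/proc/10/status").read()`, where 10 is the PID of the `java` process in the `top` output. Comparing `VmRSS` before and after a compression task gives a clearer picture of retained memory than `VSZ` alone.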


@Frooodle
Member

I see you submitted multiple PDFs and that some of them errored.
Are you able to share any of the PDFs that hit this error, or what caused it according to your logs?

Additionally, did all of the PDFs you tried compress badly, even at compression level 10?
