Releases: bottomless-archive-project/library-of-alexandria
1.7.1
1.7.0
For a full changelog between 1.7.0 and 1.6.3 please check the changelogs of the release candidates and milestones.
The documentation for this release is available here.
Please update the Elasticsearch version used to 8.1.x if you index the archived documents and are updating from 1.6.x!
Features
- #378 The debug screen trims the document id (removes leading and trailing spaces) before querying the Web Application for the document information.
Bugfixes
- #380 Increased the timeouts in the vault client. This fixes the regression in 1.7.0-rc1 that caused frequent timeout issues in the Web Application.
- #382 Release 1.7.0-rc1 introduced a regression that disabled parallelism in the Indexer Application, so the application indexed only one document at a time. Parallelism has been restored, and its level can now be set with the loa.indexer.parallelism parameter.
- #379 When one PDF document was already open in the UI and the user opened a second one, the first was reloaded for no reason. When the user had many documents open, reloading all of them put significant extra strain on the Web Application and on the browser. This was fixed.
- #377 The NetworkAddressCalculator only checked the first interface for network addresses when registering the application with the Conductor Application. In certain cases, this was not sufficient. Now the application checks all of the network interfaces to figure out the IP address.
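To illustrate what a configurable level of parallelism (#382) means in practice, here is a minimal sketch, not the Indexer Application's actual code. The class and method names are hypothetical, and reading loa.indexer.parallelism as a JVM system property is an assumption made only to keep the demo self-contained.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical sketch: not the Indexer Application's real implementation.
final class BoundedIndexerSketch {

    // Runs the indexer for every document id on at most `parallelism` threads.
    static void indexAll(List<String> documentIds, int parallelism, Consumer<String> indexer) {
        if (parallelism < 1) {
            throw new IllegalArgumentException("parallelism must be at least 1");
        }
        ExecutorService pool = Executors.newFixedThreadPool(parallelism);
        for (String id : documentIds) {
            pool.execute(() -> indexer.accept(id));
        }
        pool.shutdown(); // accept no new tasks; already queued tasks still run
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                pool.shutdownNow();
            }
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        // Assumption: the value is supplied as -Dloa.indexer.parallelism=N; 4 is an arbitrary demo fallback.
        int parallelism = Integer.getInteger("loa.indexer.parallelism", 4);
        indexAll(List.of("doc-1", "doc-2", "doc-3"), parallelism,
                id -> System.out.println(Thread.currentThread().getName() + " indexed " + id));
    }
}
```

With a parallelism of 1 this degrades to the one-document-at-a-time behavior described for the regression; larger values let several documents be processed concurrently.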
Documentation
- #385 Reviewed and updated the documentation. Many chapters were revised and a couple of new ones were added as well.
1.7.0-release-candidate.1
This is a pre-release. Please don't use it in production yet! Certain features might not work as intended, data loss could occur, etc.
The documentation for this release is available here.
Please update the Elasticsearch version used to 8.1.x if you index the archived documents!
Features
- #371 Rewrote the applications from a reactive stack to a good ole' imperative one. This took up the majority of the development time and might look like a step backward. We found that, although the reactive stack performs slightly better, it made development significantly harder and introduced many bugs that could otherwise have been avoided easily. Also, virtual threads from Project Loom will hopefully arrive in Java soon, so we expect to regain the lost performance anyway.
- #372 Upgraded Elasticsearch to 8.1.2. This forced us to rewrite almost all of our logic that used Elasticsearch to use the low-level client. This was a breaking change that we had to integrate to keep up with the newest changes and performance improvements in Elasticsearch.
- #354 Added a new application called the Conductor Application. It is responsible for providing service discovery for the rest of the suite. It should be started first, so other apps can connect to it to figure out where they can reach the database, each other, etc. From now on, this is the only application where you should configure the MongoDB and Elasticsearch connection properties.
- #358 The pool size for the queue can now be set via configuration. The default value (10) should be sufficient for the majority of setups.
- #371 Upgraded the Web Application to show the availability of the other applications. This info is queried from the Conductor Application.
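As a rough illustration of the service discovery idea behind the Conductor Application (#354), the sketch below shows an in-memory registry where applications register an address and others look it up. It is a conceptual example only: the class, record, and method names are hypothetical and do not reflect the Conductor Application's real API or storage.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical sketch of a service registry; not the Conductor Application's code.
final class ServiceRegistrySketch {

    // Where a registered application instance can be reached.
    record ServiceInstance(String host, int port) {}

    private final Map<String, List<ServiceInstance>> instances = new ConcurrentHashMap<>();

    // Called by an application on startup to announce itself.
    void register(String applicationType, ServiceInstance instance) {
        instances.computeIfAbsent(applicationType, key -> new CopyOnWriteArrayList<>()).add(instance);
    }

    // Called by other applications to find out where a given type is running.
    List<ServiceInstance> lookup(String applicationType) {
        return List.copyOf(instances.getOrDefault(applicationType, List.of()));
    }

    public static void main(String[] args) {
        ServiceRegistrySketch registry = new ServiceRegistrySketch();
        registry.register("vault", new ServiceInstance("10.0.0.5", 8080)); // illustrative address
        System.out.println(registry.lookup("vault"));
        System.out.println(registry.lookup("indexer")); // nothing registered yet
    }
}
```

A lookup that returns an empty list is also the kind of information a status page could use to show an application as unavailable.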
Maintenance
- #347 The application suite uses immutable Java records to read the runtime configuration.
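For readers unfamiliar with records, a configuration holder written as a record might look like the sketch below. The record name, its fields, and the sample values are hypothetical, and the validation in the compact constructor is an illustrative choice rather than something the suite is known to do.

```java
// Hypothetical configuration record; the name and fields are illustrative only.
record QueueConfiguration(String host, int port, int poolSize) {

    // Compact constructor: validates the values once, at construction time.
    QueueConfiguration {
        if (poolSize < 1) {
            throw new IllegalArgumentException("poolSize must be at least 1");
        }
    }
}

final class QueueConfigurationDemo {

    public static void main(String[] args) {
        QueueConfiguration config = new QueueConfiguration("localhost", 61616, 10);
        // Records expose accessor methods but no setters, so the values cannot change later.
        System.out.println(config.host() + ":" + config.port() + " pool=" + config.poolSize());
    }
}
```

Because a record's components are final, a configuration object read at startup cannot be modified by accident while the application runs.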
1.6.3
1.6.2
1.6.1
1.6.0
The documentation for this release is available here.
Features
- #279 The document's original location is saved into the database. If duplicates are found for the document, then the new locations are saved as well.
- #338 Added a command into the Administrator Application that is able to recover a corrupt document if the document is still available at any of the saved source locations.
- #343 Added a flag to the Downloader Application that lets the application delete the files from the folder when the download source type is folder.
Bugfixes
- #346 Fixed an error that happened randomly within the Vault Application (ActiveMQLargeMessageInterruptedException).
Maintenance
- #345 Updated Spring Boot to version 2.6.0.
1.5.1
The documentation for this release is available here.
Bugfixes
- #348 The URLs that were generated by the Generator Application were occasionally double encoded using URL encoding.
- #342 The Web Application initialized Lingua but did not use it. This increased memory usage significantly.
- #341 The Elasticsearch instance was only accessible from localhost or 127.0.0.1.
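The double encoding problem from #348 is easy to reproduce in isolation. The sketch below uses the JDK's URLEncoder (which applies form-style encoding) purely as an illustration; the class name and the sample file name are hypothetical, and this is not the Generator Application's code.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch showing how encoding an already encoded value corrupts it.
final class DoubleEncodingSketch {

    static String encodeOnce(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String raw = "annual report.pdf";
        String once = encodeOnce(raw);   // the space becomes '+'
        String twice = encodeOnce(once); // the '+' itself is escaped as %2B

        System.out.println(once);
        System.out.println(twice);

        // Decoding the double encoded value once does not give back the original name.
        String decoded = URLDecoder.decode(twice, StandardCharsets.UTF_8);
        System.out.println(decoded.equals(raw)); // false
    }
}
```

A server that decodes the request only once would look for the wrong resource, which is why a value should be encoded exactly once on its way into a URL.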
1.5.0
The documentation for this release is available here.
Features
- #324 Added a way to set the result size on the search page. The default is still 10, but options exist for 25, 50, and 100 results per page as well.
- #323 Removed the Common Crawl based URL generation. Added hints to the documentation to use the url-collector or the document-location-database projects as a replacement.
Bugfixes
- #337 Fixed that the "in vault" and "in index" parameters on the debug screen were occasionally missing or invisible.
- #307 Fixed that the search bar searched with one-character words and didn't wait for typing to be completed.
- #335 Fixed that the .pptx file type was missing from the search page.
- #334 Fixed that changing the document type didn't reset the page count on the search page.
- #333 Fixed that the display images of some PDFs were rendered incorrectly and JPEG errors showed up in the logs.
- #331 Fixed that the language was shown incorrectly on the search page when the document had no title or subtitle.
Maintenance
- #336 Updated Angular to version 12.
- #296 Updated Twitter Bootstrap to version 5.
- #328 Updated Elasticsearch to version 7.15.
- #330 Updated Apache Artemis to version 2.19.0.
- #329 Updated the Spring Boot version to 2.5.6 and updated other minor library versions too.
- #327 Renamed the loa-service module to loa-library. The project is also moved into a new package.
1.4.0
The documentation for this release is available here.
The Java version should be upgraded to 17 for this release to run.
Because of the removal of the INDEXING_FAILURE status, if you started collecting documents before 1.2.0-milestone.1 then you MUST use the cleanup command in the Administrator Application (v1.3.0) to remove the documents that have this status from the database.
Features
- #316 Added a pre-rendered image of the first page of PDF documents to the search page in the Web Application. At the moment these images are shown for PDF documents only because of technical limitations.
- #310 Updated the debug endpoint to report whether the document is available in the filesystem of the vault where it resides and, if the document is indexed, whether it is available in the Elasticsearch index.
Maintenance
- #320 Updated Java to 17.
- #302 Updated the supported MongoDB version to 5.0. The 4.4 version should work as well, but it will not be officially tested in the future.
- #314 Removed the INDEXING_FAILURE status.
- #317 #305 Updated Apache Tika to 2.1. This is a major version upgrade. It significantly lowers the file size for the Downloader Application and Indexer Application.
- #312 Updated Apache Artemis to 2.18.0.