Releases: solrmarc/solrmarc_the_next_generation

Recognize XML files better

27 Sep 14:52

Older versions of SolrMarc relied on the file suffix to determine whether an input file was XML. This newer version instead inspects the contents at the start of the file, but the tests it used were too narrow. This patch checks a few more patterns at the start of the file to decide whether an input file is XML.
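
As a rough illustration of content-based detection, the sketch below checks the leading bytes of a file against a handful of prefixes. The class name `XmlSniffer`, the method `looksLikeXml`, and the specific prefixes tested are assumptions made for this example; they are not taken from SolrMarc's source, and the patterns the patch actually checks may differ.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for illustration; not SolrMarc's actual detection code. */
final class XmlSniffer {
    // Assumed prefixes: an XML declaration, a comment, or a MARCXML-style root element.
    private static final String[] XML_PREFIXES = {
        "<?xml", "<!--", "<collection", "<record", "<marc:"
    };

    private XmlSniffer() {}

    /** Returns true if the leading bytes of a file look like the start of an XML document. */
    static boolean looksLikeXml(byte[] head) {
        // ISO-8859-1 maps each byte to exactly one char, so binary input never fails to decode.
        String s = new String(head, StandardCharsets.ISO_8859_1);

        // A UTF-8 byte order mark shows up as these three chars under ISO-8859-1.
        if (s.startsWith("\u00EF\u00BB\u00BF")) {
            s = s.substring(3);
        }

        // Skip leading whitespace before the first markup character.
        int start = 0;
        while (start < s.length() && Character.isWhitespace(s.charAt(start))) {
            start++;
        }

        for (String prefix : XML_PREFIXES) {
            if (s.startsWith(prefix, start)) {
                return true;
            }
        }
        return false;
    }
}
```

A binary MARC record begins with a numeric record length rather than markup, so a leader such as `00714cam a2200205 a 4500` would not match any of these prefixes.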

Note: the required jars haven't changed since the original v3.0.0.

The simple_install_package consists of the solrmarc_core_X.X.X.jar plus all of the required jars, plus a few mostly empty directories set up in the way that SolrMarc expects.

First bug-patch release of SolrMarc

14 Sep 21:15

A minor bug-patch release, fixing an issue with calling custom methods from within a custom script.

First Release of completely re-written SolrMarc

13 Sep 20:58

This is the First Release of a completely re-written SolrMarc. It was started from code written by Oliver Obenland, who devised the design and wrote the initial code around which this re-write was built.

The goal of the design is a program that operates much the same as the earlier versions of SolrMarc, including being able to process index specifications that worked with previous versions and produce substantially the same Solr records. Further goals are to operate much faster and to support a richer superset of features in the index specification language.

Some of the speed-up achieved is due to "compiling" the index specification into an internal format once, and then applying each of those "compiled" Indexer objects to the MARC records being processed.
The basic architecture of a "compiled" Indexer object consists of an Extractor to extract data from the MARC record being processed, zero or more Map objects to transform and filter that data, and a Collector object to control how the data should be added to the Solr Document.
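
The Extractor, Map, and Collector arrangement described above might be sketched roughly as follows. Every name here (`IndexerSketch`, `ValueMapping`, `titleIndexer`, the `title_display` field) is hypothetical, and a plain tag-to-values map stands in for a MARC record; SolrMarc's real classes and signatures are not reproduced.

```java
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

/** Hypothetical sketch of a "compiled" indexer; not SolrMarc's actual classes. */
final class IndexerSketch {
    /** Pulls raw values out of a record (a tag -> values map stands in for a MARC record). */
    interface Extractor { List<String> extract(Map<String, List<String>> record); }

    /** Transforms or filters the extracted values. */
    interface ValueMapping { List<String> apply(List<String> values); }

    /** Decides how the final values are added to the output document. */
    interface Collector { Collection<String> collect(List<String> values); }

    private final String solrField;
    private final Extractor extractor;
    private final List<ValueMapping> mappings;
    private final Collector collector;

    IndexerSketch(String solrField, Extractor extractor,
                  List<ValueMapping> mappings, Collector collector) {
        this.solrField = solrField;
        this.extractor = extractor;
        this.mappings = mappings;
        this.collector = collector;
    }

    /** Applies the pre-built pipeline to one record and stores the result in doc. */
    void index(Map<String, List<String>> record, Map<String, Collection<String>> doc) {
        List<String> values = extractor.extract(record);
        for (ValueMapping mapping : mappings) {
            values = mapping.apply(values);
        }
        doc.put(solrField, collector.collect(values));
    }

    /** Example pipeline: built once, then reused for every record. */
    static IndexerSketch titleIndexer() {
        Extractor fromTag245 = rec -> rec.getOrDefault("245", List.of());
        ValueMapping trim = vals ->
            vals.stream().map(String::trim).collect(Collectors.toList());
        ValueMapping dropEmpty = vals ->
            vals.stream().filter(v -> !v.isEmpty()).collect(Collectors.toList());
        Collector unique = vals -> new LinkedHashSet<>(vals); // keep first occurrence only
        return new IndexerSketch("title_display", fromTag245, List.of(trim, dropEmpty), unique);
    }
}
```

The point of the structure is that the specification is parsed and assembled once; per record, only `index` runs.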

Among the many changes in this new version are:

  • Much richer index specification language
  • Ability to compile user-supplied custom methods at runtime
  • Ability to run multiple indexers in threads in parallel
  • Ability to gather records and submit them to Solr in large chunks
  • Ability to run against multiple different versions of Solr without needing to be re-compiled

With all these changes some of the features of previous versions are no longer available/no longer needed:

  • The release is no longer combined into a One-Jar file. There is now just a single normal jar file, plus a small list of dependencies.
  • The release no longer supports writing directly to Solr index files. It merely builds SolrInputDocument objects and sends them to an already running Solr server.
  • The project no longer includes "example" builds. They will exist, but as separate GitHub repositories.