Releases: OHNLP/Backbone

Release v1.0.11

16 Feb 14:25

  • Updated the Beam version to support Spark 3.x

Release v1.0.10

04 Feb 16:47

  • Improved the local run script: the configuration to use can now be selected interactively at runtime instead of by editing the script
    • Non-interactive runs are still possible by supplying the configuration name as an argument to the script, e.g. ./run_pipeline_local.sh your_config_name.json
  • The script now auto-packages the Flink jarfile on new runs

Release v1.0.9

03 Feb 21:24

Note: This release has significant changes to the pipeline packaging and execution. Please redownload in full and copy only the "configs", "modules", and "resources" folders over from your previous setup and rerun packaging.

Changes:

  • The Local Direct Runner has been removed due to performance issues and replaced with an embedded Flink cluster. 'run_pipeline_local.sh' has been updated accordingly to set up and use this embedded Flink cluster
    • Please follow the instructions given during the install process, which runs automatically the first time 'run_pipeline_local.sh' is called.
  • The example configuration has been updated for the PASC/RECOVER task. Please update your NLP run configurations accordingly

Release v1.0.8

03 Feb 18:49

Transient release. Use v1.0.9 instead.

Release v1.0.7

11 Jan 18:34

Pre-release

Draft Release Autonomously Generated By CI

Release v1.0.6

10 Jan 05:25

Note: This release has significant changes to the pipeline packaging and execution. Please redownload in full and copy only the "configs", "modules", and "resources" folders over from your previous setup and rerun packaging.

Changes:

  • JDBC reads are now done in parallel using OFFSET/FETCH (or the equivalent, depending on SQL dialect)
  • Pipeline options for tuning parallel reads have been added to JDBCExtract. Please refer to the new example configs for reference
    • Of particular note: each parallel query must be sorted on the SQL server side (this is required to ensure result consistency across parallel queries), so set batch_size as large as memory reasonably permits. A reasonable starting point for textual narratives is a batch_size of 10000
    • The identifier_col parameter is highly recommended. The column should be unique and, if possible, numeric (although this is not required). It is even better if the column is indexed on the SQL side, as it will be used for sorting optimization. If the parameter is not provided, Backbone defaults to sorting on all columns in column declaration order for result consistency, which may be slow.
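
To illustrate why batch_size and identifier_col matter, here is a minimal sketch of how paged OFFSET/FETCH queries over a deterministic ordering might be generated. This is not Backbone's actual implementation: the function name paged_queries, the notes table, the total_rows argument, and the SQL Server-style OFFSET/FETCH syntax are all assumptions made for illustration only.

```python
from typing import Iterator, Optional, Sequence


def paged_queries(
    base_query: str,
    total_rows: int,
    batch_size: int,
    columns: Sequence[str],
    identifier_col: Optional[str] = None,
) -> Iterator[str]:
    """Yield one OFFSET/FETCH query per batch (hypothetical helper)."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    # Sort on identifier_col when given; otherwise fall back to every column
    # in declaration order, which is the slower default the notes describe.
    order_by = identifier_col if identifier_col else ", ".join(columns)
    for offset in range(0, total_rows, batch_size):
        # Every batch repeats the same ORDER BY so that pages do not overlap.
        yield (
            f"SELECT * FROM ({base_query}) AS src "
            f"ORDER BY {order_by} "
            f"OFFSET {offset} ROWS FETCH NEXT {batch_size} ROWS ONLY"
        )


# Hypothetical table and column names.
queries = list(
    paged_queries(
        "SELECT note_id, note_text FROM notes",
        total_rows=25000,
        batch_size=10000,
        columns=["note_id", "note_text"],
        identifier_col="note_id",
    )
)
for q in queries:
    print(q)
```

Because each batch query carries its own ORDER BY, a smaller batch_size means more server-side sorts, and a sort on a single indexed column is typically cheaper than a sort across all columns.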

Release v1.0.5

10 Jan 05:19

Pre-release

Transitional/draft release. Do not use.

Release v1.0.4

09 Jan 18:32

Pre-release

Transitional/draft release. Do not use.

Release v1.0.3

10 Jun 20:01

Explicitly exit after the main thread unblocks on pipeline completion, to clean up any other hanging threads.

Release v1.0.2

25 Feb 06:01

  • Platform-specific builds for:
    • Direct Local (Debugging)
    • Apache Flink
    • Google Cloud Platform Dataflow
    • Apache Spark
      • Standalone/Local Mode (v2.4.7)
      • Cluster Mode (v2.x)
  • Updated example configs