Releases: allenai/mmda

v0.5.0

01 Jun 21:11
33bbd81

New predictor for word prediction using an SVM

0.4.8

11 May 01:38
6ab9004

What's Changed

  • E2E eval nb to ai2_internal/evaluation_notebooks by @geli-gel in #240
  • Adding notebook with concurrent profiling calls to the service by @comorado in #241
  • Egork/egork/figure table/fix list to extend by @comorado in #245

Full Changelog: 0.4.7...0.4.8

0.3.0

09 Mar 23:16
213837c
  1. Added Recipes as a way to combine multiple Predictors and test how they stitch together. So far there is one, CoreRecipe; the docs and tests demonstrate how to use it.
  2. Added Grobid integration as a way of augmenting an existing Document from a Parser (e.g. PDFPlumber) with Grobid annotations.
  3. Laid groundwork for a future dataclass called Relation.

0.2.82

06 Mar 19:16
3f90e9c
  1. Added font info to PDFPlumber
  2. Added Section Nesting Predictor, which relies on font info
  3. Fixed bugs in the BibPredictor and BibDetector predictors
  4. Capped the pdfplumber dependency at < 0.8.0 since they moved a module to a different place

0.2.7

09 Dec 00:48
da75fb5
  1. Add section header predictor (0.2.5)
  2. Add new logic to DictionaryWordPredictor to handle cases with singleton symbols per row (e.g. in a table) (0.2.6)
  3. Bugfixes to improve DictionaryWordPredictor (DWP) handling (0.2.7)

0.2.4

23 Nov 22:37
56b715d
  • Added Metadata as a type that can exist at the Document level
  • Added utility for obtaining OutlineMetadata from a PDF
  • Fixes to citation_linker because of sklearn deprecation
  • Add WhiteSpaceTokenizer
  • Fixes to DictionaryWordPredictor because of change to how tokenization happens in PDFPlumberParser
  • Change how fieldnames are defined in types.names
  • Move off setup.py into pyproject.toml

0.1.0

21 Oct 18:54
  • Changes to Annotation class to remove uuid, require id, change Metadata default behavior
  • Changes to JSON serialization schema for Box
  • Bugfix in MentionDetector that was changing Document.tokens accidentally due to lack of deepcopy
  • Add new predictor for Table/Figure Captions
  • Hotfix in PDFPlumberParser that avoids injection of new whitespace in Document.symbols
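
The MentionDetector fix above comes down to a common Python pitfall: mutating a nested structure that was only shallow-copied. A minimal sketch of that pitfall follows. `ToyDocument`, `lowercase_tokens_buggy`, and `lowercase_tokens_fixed` are hypothetical stand-ins for illustration, not mmda's classes or the actual patch.

```python
import copy
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToyDocument:
    # Hypothetical stand-in for a document that owns mutable token records.
    tokens: List[dict] = field(default_factory=list)


def lowercase_tokens_buggy(doc: ToyDocument) -> List[dict]:
    # list(...) copies only the outer list; the dicts inside are still shared,
    # so editing them also edits doc.tokens.
    tokens = list(doc.tokens)
    for token in tokens:
        token["text"] = token["text"].lower()
    return tokens


def lowercase_tokens_fixed(doc: ToyDocument) -> List[dict]:
    # deepcopy clones the dicts as well, so doc.tokens stays untouched.
    tokens = copy.deepcopy(doc.tokens)
    for token in tokens:
        token["text"] = token["text"].lower()
    return tokens


doc = ToyDocument(tokens=[{"text": "BERT"}, {"text": "Model"}])

fixed = lowercase_tokens_fixed(doc)
original_after_fixed = [t["text"] for t in doc.tokens]

buggy = lowercase_tokens_buggy(doc)
original_after_buggy = [t["text"] for t in doc.tokens]
```

After the fixed call the document's tokens keep their original casing; after the buggy call they have been lowercased in place, even though the function never assigned to `doc.tokens`.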

0.0.44

05 Oct 01:30
8ffcc29
Add attributes to API data classes (#150)

* redesigned apis to account for metadata

* bumped version

* switched to attributes

* simplified code with extra=Extra.ignore

Explicitly removing `id`, `text`, and `type` is no longer required because
they are automatically ignored.

* bumped version, suggestion from @cmwilhelm

0.0.29

31 Aug 00:38
c7cb515

What's Changed

  • is_overwrite was ignored in Document.annotate by @soldni in #128
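
For readers unfamiliar with this kind of bug, here is a rough sketch of an annotate-style method that honors an overwrite flag. It assumes the flag is meant to control whether an existing field may be replaced; `ToyDocument` and its `fields` attribute are hypothetical and do not reproduce mmda's `Document` implementation.

```python
from typing import Any, Dict, List


class ToyDocument:
    """Hypothetical stand-in; not mmda's Document class."""

    def __init__(self) -> None:
        self.fields: Dict[str, List[Any]] = {}

    def annotate(self, is_overwrite: bool = False, **kwargs: List[Any]) -> None:
        for name, annotations in kwargs.items():
            # The flag has to be consulted here; a version that skips this
            # check (or always raises) effectively ignores is_overwrite.
            if name in self.fields and not is_overwrite:
                raise ValueError(
                    f"Field '{name}' already exists; "
                    "pass is_overwrite=True to replace it."
                )
            self.fields[name] = list(annotations)


doc = ToyDocument()
doc.annotate(tokens=["a", "b"])

try:
    doc.annotate(tokens=["c"])
    second_call_raised = False
except ValueError:
    second_call_raised = True

doc.annotate(is_overwrite=True, tokens=["c"])
```

In this sketch the second call is rejected because the field already exists, and the third call replaces it because the flag is set.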

Full Changelog: 0.0.28...0.0.29

0.0.28

29 Aug 22:53
ea03e45

What's Changed

  • Fixed issue with failing initialization when metadata is provided to span group by @soldni in #126
  • Angelez/bibentries by @geli-gel in #113

Full Changelog: 0.0.26.1...0.0.28