1.1.0

@doulikecookiedough released this 02 Oct 00:34
c11d5bc

HashStore 1.1.0 🎉

Release date: 2024-10-01

Release Notes

This minor HashStore release refactors the storage of data objects from persistent identifier-based to content identifier-based hashes with a tagging system, while optimizing thread safety, synchronization, readability, and logging.

Overview of Major Changes ⚙️

  • Data objects are now stored under their content identifier and are managed with reference files that establish the relationship between each object and its authority-based or persistent identifiers (pids) I-73
  • Clients can now store a data object without an identifier. They are then expected to call tag_object separately to create the connection between the data object and its identifier. We also recommend calling delete_if_invalid_object afterwards, which removes a data object that is determined to be invalid
  • Refactored delete_object to also remove all metadata associated with a given identifier, and improved the atomicity of the process by renaming the files first and then deleting them I-87
  • Metadata documents (e.g. sysmeta, annotations) are now stored under a document name formed from the hash of the pid+format_id, inside a hashstore directory formed from the hash of the pid I-99
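
The metadata layout described above can be sketched as follows. This is a minimal illustration, not HashStore's implementation: the function name metadata_path, the "metadata" subdirectory, and the choice of SHA-256 are assumptions made for the example, and the real store may use a different algorithm or split the directory hash into nested folders.

```python
import hashlib
from pathlib import Path


def metadata_path(store_root: Path, pid: str, format_id: str) -> Path:
    """Return a hypothetical location for one metadata document.

    Assumptions (not taken from the release notes): SHA-256 hex digests
    and a flat "metadata/<dir>/<doc>" layout under the store root.
    """
    # The directory is derived from the pid alone, so every metadata
    # document for one pid ends up in the same directory.
    pid_dir = hashlib.sha256(pid.encode("utf-8")).hexdigest()
    # The document name is derived from pid + format_id, so each
    # metadata format gets its own file inside that directory.
    doc_name = hashlib.sha256((pid + format_id).encode("utf-8")).hexdigest()
    return store_root / "metadata" / pid_dir / doc_name


# Two formats for the same (made-up) pid share a directory but not a file name.
sysmeta = metadata_path(Path("/var/hashstore"), "doi:10.0000/example", "sysmeta-format")
notes = metadata_path(Path("/var/hashstore"), "doi:10.0000/example", "annotation-format")
```

Because the directory depends only on the pid, removing all metadata for an identifier amounts to clearing a single directory, which fits the delete_object change noted above.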

New Features & Enhancements 🛠️

  • New Public API methods: tag_object, delete_if_invalid_object and supporting methods & processes. tag_object creates reference files linking an identifier (e.g. pid) to its content identifier. I-124, I-75, I-76, I-81, I-97, I-101, I-109, I-111, I-113, I-114, I-122
  • The keys and values in the hashstore.yaml config file are now written with a YAML library to ensure the content is written reliably I-138
  • Misc. improvements to the hashstore client, along with a script entry point, set up as part of the poetry install process, that simplifies client syntax and usage I-92, I-82, I-94, I-106
  • Enhanced thread/process synchronization with dedicated threading and multiprocessing locks to address race conditions (improved pytest time to less than 2s!) I-98
  • Added thread safety to all public API calls when working with metadata objects I-99
  • Refactored ObjectMetadata to be a @dataclass I-126
  • Various bug fixes and optimizations across the codebase to improve readability and clarity and to resolve linting warnings I-139, I-136, I-72, I-112, I-119, I-121, I-125, I-85
  • Revised Python docstrings into reStructuredText (sphinx autodocumentation compatible format) and added type hints I-70, I-137
  • Cleaned up logging statements, which now utilize a logger object I-93, I-140, I-90
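
To illustrate the kind of per-identifier synchronization mentioned in the list above, here is a minimal sketch for threads only. It is a generic pattern, not HashStore's code: the class name IdentifierLocks and its hold method are hypothetical, and a multiprocessing variant would need the equivalent primitives from the multiprocessing module.

```python
import threading
from contextlib import contextmanager


class IdentifierLocks:
    """Hypothetical helper: let only one thread work on an identifier at a time."""

    def __init__(self):
        self._condition = threading.Condition()
        self._in_use = set()  # identifiers currently being worked on

    @contextmanager
    def hold(self, identifier):
        with self._condition:
            # Wait until no other thread holds this identifier.
            while identifier in self._in_use:
                self._condition.wait()
            self._in_use.add(identifier)
        try:
            yield
        finally:
            with self._condition:
                # Release the identifier and wake any waiting threads.
                self._in_use.discard(identifier)
                self._condition.notify_all()


def run_demo(workers=4, rounds=1000):
    """Increment a shared counter for one made-up pid from several threads."""
    locks = IdentifierLocks()
    counts = {"pid-1": 0}

    def worker():
        for _ in range(rounds):
            with locks.hold("pid-1"):
                counts["pid-1"] += 1

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counts["pid-1"], locks
```

Threads working on different identifiers do not block each other in this design, since only a matching entry in the in-use set causes a wait.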