HashStore 1.1.0 🎉
Release date: 2024-10-01
Release Notes
This minor HashStore release refactors the storage of data objects from persistent identifier-based to content identifier-based hashes with a tagging system, while optimizing thread safety, synchronization, readability, and logging.
Overview of Major Changes ⚙️
- Data objects are now stored with their content identifier, and are managed with reference files that establish the relationship between a data object and its authority-based or persistent identifiers (pids) I-73
- Clients can now store a data object without an identifier. They are then expected to call `tag_object` separately to create the connection between a data object and its identifier. Additionally, we recommend calling `delete_if_invalid_object` afterwards, which will remove a data object that is determined to be invalid
- Refactored `delete_object` to also remove all associated metadata for a given identifier, and improved the atomicity of the process by first renaming the files before proceeding to delete them I-87
- Metadata (ex. sysmeta, annotations) are now stored with a document name formed by the hash of the `pid+format_id`, in a hashstore directory formed with the hash of the `pid` I-99
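The layout described above can be illustrated with a minimal sketch. This is not the HashStore implementation: `ToyStore`, its directory names, its method signatures, and the choice of SHA-256 are all assumptions made for illustration. It only shows the idea of storing by content identifier, tagging with reference files afterwards, and deriving a metadata path from the hashes of the `pid` and `pid+format_id`.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_hex(data: bytes) -> str:
    """Return the hex-encoded SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


class ToyStore:
    """Hypothetical, simplified store used only to illustrate the layout."""

    def __init__(self, root: Path) -> None:
        self.root = root
        # Directory names are assumptions, not HashStore's actual layout.
        for name in ("objects", "refs/pids", "refs/cids", "metadata"):
            (root / name).mkdir(parents=True, exist_ok=True)

    def store_object(self, data: bytes) -> str:
        """Store data under its content identifier (cid); no pid is needed."""
        cid = sha256_hex(data)
        (self.root / "objects" / cid).write_bytes(data)
        return cid

    def tag_object(self, pid: str, cid: str) -> None:
        """Write reference files that link a pid to a cid."""
        # The pid reference file holds the cid the pid points to.
        pid_ref = self.root / "refs/pids" / sha256_hex(pid.encode())
        pid_ref.write_text(cid)
        # The cid reference file lists every pid that refers to the object.
        with (self.root / "refs/cids" / cid).open("a") as cid_ref:
            cid_ref.write(pid + "\n")

    def find_cid(self, pid: str) -> str:
        """Look up the cid for a pid through its reference file."""
        return (self.root / "refs/pids" / sha256_hex(pid.encode())).read_text()

    def delete_if_invalid(self, cid: str, expected_checksum: str) -> bool:
        """Delete the object if its checksum does not match; True if deleted."""
        path = self.root / "objects" / cid
        if sha256_hex(path.read_bytes()) != expected_checksum:
            path.unlink()
            return True
        return False

    def metadata_path(self, pid: str, format_id: str) -> Path:
        """Directory from hash(pid), document name from hash(pid+format_id)."""
        directory = sha256_hex(pid.encode())
        document = sha256_hex((pid + format_id).encode())
        return self.root / "metadata" / directory / document


# Usage: store first, tag afterwards, then validate against a known checksum.
with tempfile.TemporaryDirectory() as tmp:
    store = ToyStore(Path(tmp))
    payload = b"example bytes"
    cid = store.store_object(payload)
    store.tag_object("example-pid", cid)
    deleted = store.delete_if_invalid(cid, sha256_hex(payload))
    print(store.find_cid("example-pid") == cid, deleted)
```

Because the object's location depends only on its content, several identifiers can point at the same stored bytes, and the reference files are the only place where the identifier-to-content relationship is recorded.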
New Features & Enhancements 🛠️
- New Public API methods: `tag_object`, `delete_if_invalid_object` and supporting methods & processes. `tag_object` creates reference files linking an identifier (ex. pid) to its content identifier. I-124, I-75, I-76, I-81, I-97, I-101, I-109, I-111, I-113, I-114, I-122
- The keys and values in the `hashstore.yaml` config file are now written with a `.yaml` library to ensure the reliability of the written content I-138
- Misc. improvements to the hashstore client, along with a script entry point that is installed as part of the `poetry install` process and simplifies the client syntax/usage I-92, I-82, I-94, I-106
- Enhanced the thread/process synchronization process with specific threading and multiprocessing locks to address race conditions (improved pytest time to less than 2s!) I-98
- Added thread safety to all public API calls when working with metadata objects I-99
- Refactored `ObjectMetadata` to be a `@dataclass` I-126
- Various bug fixes and optimizations to the overall codebase to improve readability and clarity and to resolve linting warnings I-139, I-136, I-72, I-112, I-119, I-121, I-125, I-85
- Revised python docstrings into reStructuredText (`sphinx` autodocumentation compatible format) and added type hints I-70, I-137
- Cleaned up logging statements, which now utilize a logger object I-93, I-140, I-90
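As a rough illustration of the kind of per-identifier synchronization mentioned above, here is a minimal sketch that serializes work on the same identifier while letting different identifiers proceed in parallel. The `IdentifierLocks` class and its `hold` method are hypothetical names, not part of HashStore, and the sketch covers threads only; a multiprocessing variant would need the equivalent `multiprocessing` primitives.

```python
import threading
from contextlib import contextmanager


class IdentifierLocks:
    """Hypothetical helper: at most one thread works on an identifier at a time."""

    def __init__(self) -> None:
        self._condition = threading.Condition()
        self._in_use = set()  # identifiers currently being worked on

    @contextmanager
    def hold(self, identifier: str):
        with self._condition:
            # Wait until no other thread is working on this identifier.
            while identifier in self._in_use:
                self._condition.wait()
            self._in_use.add(identifier)
        try:
            yield
        finally:
            with self._condition:
                # Release the identifier and wake any waiting threads.
                self._in_use.discard(identifier)
                self._condition.notify_all()


# Usage: several threads update per-identifier state without racing.
locks = IdentifierLocks()
counts = {"pid-a": 0, "pid-b": 0}


def worker(pid: str) -> None:
    for _ in range(1000):
        with locks.hold(pid):
            counts[pid] += 1  # read-modify-write guarded per identifier


threads = [
    threading.Thread(target=worker, args=(pid,))
    for pid in ("pid-a", "pid-b")
    for _ in range(4)
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
```

Tracking in-use identifiers in a single set guarded by one condition variable avoids creating a lock object per identifier, at the cost of waking all waiters whenever any identifier is released.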