ArunaStorage/aruna


Aruna Data Orchestration Engine

Aruna is a geo-redundant data orchestration engine that manages scientific data and a rich set of associated metadata according to FAIR principles.

It supports multiple data storage backends (e.g. S3, File, ...) via data proxies that expose an S3-compatible interface. The main server handles metadata as well as user and resource hierarchies, while the data proxies handle the data itself. Data proxies can communicate with each other in a peer-to-peer-like network and share data.

This repository is split into two components, the server and the data proxy.

Features

  • FAIR, geo-redundant data storage for multiple scientific domains
  • A decentralized data storage system with a global catalog and authorization functionality provided by servers and a fully distributed network of data locations that sovereignly manage access.
  • Dedicated rule system to enforce custom policies for your project using Common Expression Language (CEL)
  • Data catalog for listing, searching, and viewing all available (meta)data
  • Unified access to data backends via the S3-compliant interface provided by data proxies
  • Proxies enforce sovereign domain- and location-specific rules and policies, leaving the final decision of who gets access to data to the data owners
  • Ingest existing data and automatically register it in the distributed catalog
  • Organize data as objects and group them into projects, collections, and datasets; link internal and external data in a sophisticated relationship graph
  • Flexible metadata annotation, independent of file format, data structure, and ontology, via labels and dedicated metadata files (e.g. schema.org)
  • Full transparency via notification streams for all performed actions
  • Compatible with multiple (existing) backend data storage architectures (S3, File, ...)
  • S3-compatible API for pre-authenticated upload and download URLs
  • REST API and dedicated client libraries for Python, Rust, Go and Java
  • Hook system to integrate external workflows for data validation, transformation, and processing

Getting started

Aruna is built to run either as a managed service for our scientific partners or as a self-deployed, open-source collection of components for your own needs. Visit aruna-storage.org to learn more about our managed data management service.

How to run a local test instance

To get started with a local instance for testing, you first need docker/podman and docker-compose/podman-compose installed.

Start the needed containers:

curl -L https://demo.aruna-storage.org | docker compose -f - up

or download the local compose_deploy.yaml and run it yourself.

Caution

This deployment contains hard-coded credentials and is therefore NOT suitable for production or public use. Use it only for local testing.

This will run the following prerequisite components at these ports:

In addition, this will run the following Aruna-specific components:

  • Aruna Server :50051
  • Aruna Web :3000
  • Aruna REST Gateway :8080
  • Aruna Dataproxy_1 :50052 / :1337
  • Aruna Dataproxy_2 :50055 / :1338
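
Once the containers are up, it can be handy to confirm that the Aruna components are actually listening. The following is a minimal sketch, not part of Aruna itself: it uses only the Python standard library and assumes the compose deployment publishes the ports listed above on localhost. The names ARUNA_PORTS, port_is_open and check_components are hypothetical helpers introduced for this example.

```python
import socket

# Ports copied from the component list above. The host ("localhost") is an
# assumption about a default local compose deployment.
ARUNA_PORTS = {
    "Aruna Server": [50051],
    "Aruna Web": [3000],
    "Aruna REST Gateway": [8080],
    "Aruna Dataproxy_1": [50052, 1337],
    "Aruna Dataproxy_2": [50055, 1338],
}


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


def check_components(host: str = "localhost", components=None, timeout: float = 1.0) -> dict:
    """Map each 'component :port' label to whether the port accepts connections."""
    components = ARUNA_PORTS if components is None else components
    return {
        f"{name} :{port}": port_is_open(host, port, timeout)
        for name, ports in components.items()
        for port in ports
    }


if __name__ == "__main__":
    for label, is_open in check_components().items():
        print(f"{label:<28} {'open' if is_open else 'closed'}")
```

A successful TCP connection only shows that something is listening on the port; it does not verify that the service behind it is healthy or fully initialized.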

Interacting with Aruna

Test tokens and website credentials for the local test deployment can be found here.

A detailed user guide can be found in the Documentation. For language-specific details, please see the respective documentation:

TLDR:

License

The API is licensed under either of

at your option. Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in Aruna by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.

Feedback & Contributions

If you have any ideas, suggestions, or issues, please don't hesitate to open an issue and/or PR. Contributions to this project are always welcome! We appreciate your help in making this project better. Please have a look at our Contributor Guidelines as well as our Code of Conduct for more information.