
Discovery service

Part of the value-added services suite for Topio

The discovery service is an application for dataset discovery with three components:

  1. It exposes a series of services via a REST API.
  2. It automatically ingests newly added datasets using a scheduler implemented with Celery. The scheduler can be configured via the DATA_INGESTION_INTERVAL variable in .env-default; the default value is 60 seconds (see the sketch after this list).
  3. It provides services for Jupyter Notebook via a dedicated plugin: https://github.com/Archer6621/jupyterlab-daisy
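
For illustration, this is roughly how such a periodic task is typically wired up with Celery beat. The app name, broker URL, and task name below are assumptions for the sketch; the actual configuration lives in the project's src folder.

```python
import os
from datetime import timedelta

from celery import Celery

# Hypothetical app/broker; the real values come from the project's config.
app = Celery("discovery", broker="amqp://guest@rabbitmq:5672//")

app.conf.beat_schedule = {
    "auto-ingest": {
        # Hypothetical task name standing in for the real ingestion task.
        "task": "tasks.ingest_new_datasets",
        # Period taken from DATA_INGESTION_INTERVAL, defaulting to 60 seconds.
        "schedule": timedelta(seconds=int(os.getenv("DATA_INGESTION_INTERVAL", "60"))),
    },
}
```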

Requirements

The entire project is containerised, so the only requirement is Docker.

API Documentation

You can browse the full OpenAPI documentation.

How to run/use

The discovery service can be run in both development and production mode.

Environment variables

The environment variables can be found in .env-default

Always delete the auto-generated .env file after changing something in .env-default

  • DAISY_PRODUCTION - TRUE to run in production mode, FALSE to run in development mode. Defaults to FALSE.
  • DATA_INGESTION_INTERVAL - The time interval, in seconds, at which the auto-ingest pipeline starts. It should reflect how often new data is uploaded/received. Defaults to 60.
  • DATA_ROOT_PATH - The location of the datasets.
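
As a rough illustration of how these variables might be consumed inside the service (the fallback for DATA_ROOT_PATH below is an assumption; the real defaults live in .env-default):

```python
import os

# Illustrative sketch of reading the configuration; not the service's actual code.
production = os.getenv("DAISY_PRODUCTION", "FALSE").upper() == "TRUE"
ingestion_interval = int(os.getenv("DATA_INGESTION_INTERVAL", "60"))  # seconds
data_root = os.getenv("DATA_ROOT_PATH", "./data")  # "./data" is an assumed default
```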

Running

Run docker_start.sh to start the containers. Based on the DAISY_PRODUCTION variable, it automatically uses the appropriate docker-compose file.

Once the application is up, visit the API Documentation at localhost:443.

Steps to ingest data (see the sketch after this list):

  1. Call the /ingest-data endpoint.
    1. The data should be in the data folder and must follow this structure: {id}/resources/{file-name}.csv
    2. This endpoint can take a while to run; the more data there is to process, the longer it takes.
  2. Call /filter-connections to remove redundant edges.
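
A minimal client sketch for these two steps, assuming the endpoints accept plain GET requests on the default port (check the OpenAPI documentation for the exact methods and parameters):

```python
import requests

BASE_URL = "http://localhost:443"  # adjust scheme/host/port to your deployment

# Step 1: ingest everything under the data folder ({id}/resources/{file-name}.csv).
# This can run for a long time on large datasets, so no timeout is set.
requests.get(f"{BASE_URL}/ingest-data", timeout=None).raise_for_status()

# Step 2: prune redundant edges from the connection graph.
requests.get(f"{BASE_URL}/filter-connections", timeout=None).raise_for_status()
```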

To remove all the data:

  1. Call /purge. This removes all the data from neo4j and redis.
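
Equivalently, as a one-liner under the same assumptions as the sketch above:

```python
import requests

# Irreversibly wipes all ingested data from neo4j and redis.
requests.get("http://localhost:443/purge").raise_for_status()
```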

Using the discovery service:

  1. Get joinable tables - get all assets that share a column (key) with the specified asset: /get-joinable with input asset_id.
  2. Get related assets - given a source and a target, show whether and how the assets are connected: /get-related with two input variables, from_asset_id and to_asset_id.
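
A sketch of querying both endpoints, again assuming GET requests with query parameters (the parameter style may differ; see the OpenAPI documentation):

```python
import requests

BASE_URL = "http://localhost:443"

# All assets sharing a column (key) with the given asset.
joinable = requests.get(f"{BASE_URL}/get-joinable",
                        params={"asset_id": "my-asset"}).json()

# Whether, and via which connections, two assets are related.
related = requests.get(f"{BASE_URL}/get-related",
                       params={"from_asset_id": "asset-a",
                               "to_asset_id": "asset-b"}).json()
```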

Monitoring

(Development) The following admin panels are exposed for inspecting the services:

  • RabbitMQ: localhost:15672
  • Neo4j: localhost:7474
  • Celery Flower: localhost:5555
  • Redis: localhost:8001

Development

You can edit any Python file in the src folder with your favorite text editor and it will live-update while the container is running (and, in the case of the API, restart/reload automatically).

File sharing error

If you get an error about file sharing on Windows, visit this thread.
