The scystream project is an open-source data-science pipeline toolkit containing all the tools needed to create and carry out data-science workflows. With an easy-to-use frontend, you can schedule and deploy custom workflows containing different data processing tasks.
It's recommended to use Docker and Docker Compose.
To set up all services, run the following command in the root directory:
docker compose up -d
If your compute blocks depend on private Git repositories or Docker images, ensure proper authentication is in place.
Make sure that the user executing docker compose up has SSH access to the required Git repositories.
The host's ssh-agent is mounted into the core container.
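Because the core container relies on the host's ssh-agent, it can help to confirm that an agent is reachable and has a key loaded before starting the stack. The following is a minimal sketch, not part of scystream: the function name check_ssh_agent and its output strings are hypothetical, and it only inspects the agent in the current shell. It does not verify access to any particular repository.

```shell
#!/bin/sh
# Hypothetical helper (not part of scystream): reports whether an ssh-agent
# with at least one loaded key is reachable from the current shell.
check_ssh_agent() {
    # SSH_AUTH_SOCK holds the socket path that ssh clients use to reach the agent.
    if [ -z "${SSH_AUTH_SOCK:-}" ]; then
        echo "no-agent"
        return 1
    fi
    # ssh-add -l exits 0 when keys are listed, 1 when the agent holds no keys,
    # and 2 when the agent cannot be contacted.
    status=0
    ssh-add -l >/dev/null 2>&1 || status=$?
    case "$status" in
        0) echo "ok" ;;
        1) echo "no-keys"; return 1 ;;
        *) echo "no-agent"; return 1 ;;
    esac
}
```

One possible use is to gate the startup command on the check, for example: check_ssh_agent && docker compose up -d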
For private Docker registries (e.g., GitLab):
- Run docker login registry.gitlab.com to create ~/.docker/config.json.
The .docker directory is mounted into the airflow containers.
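To check that the login step left an entry for the registry in the Docker client configuration, a small helper along these lines may be useful. This is a sketch under assumptions: the function name has_registry_login is hypothetical, and it only does a plain string match for the registry host in the config file. It does not validate the stored credential.

```shell
#!/bin/sh
# Hypothetical helper (not part of scystream): succeeds if the Docker client
# config file mentions the given registry host.
has_registry_login() {
    registry="$1"
    # Default to the config file that docker login writes for the current user.
    config="${2:-$HOME/.docker/config.json}"
    [ -f "$config" ] || return 1
    # -F matches a fixed string, -q suppresses output; the host is expected
    # to appear as a quoted JSON key.
    grep -qF "\"$registry\"" "$config"
}
```

For example: has_registry_login registry.gitlab.com || echo "run docker login registry.gitlab.com first"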
You can find the development READMEs in the corresponding directories.