initial commit
BerndDoser committed Sep 17, 2024
0 parents commit 3e8876f
Showing 14 changed files with 159 additions and 0 deletions.
28 changes: 28 additions & 0 deletions .github/workflows/quarto.yml
@@ -0,0 +1,28 @@
on:
  push:
    branches: main

name: Render

jobs:
  build-deploy:
    runs-on: ubuntu-latest
    steps:
      - name: Check out repository
        uses: actions/checkout@v3

      - name: Setup Quarto
        uses: quarto-dev/quarto-actions/setup@v2
        with:
          tinytex: true

      - name: Quarto version
        run: |
          quarto --version
      - name: Publish to GitHub Pages (and render)
        uses: quarto-dev/quarto-actions/publish@v2
        with:
          target: gh-pages
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
3 changes: 3 additions & 0 deletions .gitignore
@@ -0,0 +1,3 @@
.quarto/
index_files/
index.html
Empty file added .nojekyll
Empty file.
1 change: 1 addition & 0 deletions README.md
@@ -0,0 +1 @@
# Machine Learning Workflow Orchestration
3 changes: 3 additions & 0 deletions _quarto.yml
@@ -0,0 +1,3 @@
project:
  render:
    - index.qmd
20 changes: 20 additions & 0 deletions code/flyte_example.py
@@ -0,0 +1,20 @@
from flytekit import task, workflow


# Define a task that produces the string "Hello, World!"
# by using the `@task` decorator to annotate the Python function
@task
def say_hello() -> str:
    return "Hello, World!"


# Handle the output of a task like that of a regular Python function.
@workflow
def hello_world_wf() -> str:
    res = say_hello()
    return res


# Run the workflow locally by calling it like a Python function
if __name__ == "__main__":
    print(f"Running hello_world_wf() {hello_world_wf()}")
Binary file added images/HITS_RGB_eng.jpg
Binary file added images/ai_workflow1.png
Binary file added images/ai_workflow2.png
Binary file added images/flyte-ui_mnist-workflow.png
Binary file added images/streamflow-model.png
Binary file added images/union_tasks.png
101 changes: 101 additions & 0 deletions index.qmd
@@ -0,0 +1,101 @@
---
title: Machine Learning Workflow Orchestration
subtitle: Flyte & StreamFlow
author: Bernd Doser (HITS)
date: 2024/10/07
date-format: "MMMM YYYY"
institute: "[HITS gGmbH](https://h-its.org)"
format:
  revealjs:
    logo: images/HITS_RGB_eng.jpg
    footer: "ML Workflow Orchestration (Bernd Doser, HITS)"
    slide-number: true
    highlight-style: a11y
    width: 1300
---

## General Requirements

- Easy deployment of the workflow infrastructure
- Flexible pipeline definition
- Monitoring status, execution times, and results
- Fast pipeline for big data (S3 / Filesystem)
- Integration of Containers
- Integration of HPC clusters (SLURM / Apptainer)


## Example Workflow

![](images/flyte-ui_mnist-workflow.png)


## Flyte vs StreamFlow

:::: {.columns}

::: {.column width="50%"}
### Flyte
- Kubernetes and S3 storage
+ Flexible workflow using python function decorators `@task` and `@workflow`
- Add HPC resources to Kubernetes

**Internal use**
:::

::: {.column width="50%"}
### StreamFlow
+ Common workflow language
+ No special environment needed
+ SLURM plugin available
+ Developed by Alpha UniTO (SPACE partner)

**Workflow Distribution**
:::

::::

## Flyte

- [Flyte](https://flyte.org/) evaluation setup
  - [k3s](https://k3s.io/): Lightweight Kubernetes
  - [MinIO](https://min.io/): High-performance, S3-compatible object store
- Parallelization: Automatically managed by task dependencies
- Versioning, caching, dynamic workflows available
- Integration of HPC resources
  - Custom Flyte agent using SLURM REST API (not available yet)
  - Kubernetes agent: GPU node with 4x NVIDIA A40 cards
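
The "Parallelization" bullet above can be illustrated without Flyte. The following is a minimal plain-Python sketch, not Flyte's scheduler or API: it assumes a hypothetical task graph (`DEPENDENCIES`) and a placeholder `run_task` function, and uses only the standard library to run every task whose dependencies are already satisfied concurrently.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter

# Hypothetical task graph (an assumption for illustration):
# each key depends on the tasks listed in its set.
DEPENDENCIES = {
    "load": set(),
    "preprocess": {"load"},
    "train": {"preprocess"},
    "evaluate": {"preprocess"},
    "report": {"train", "evaluate"},
}


def run_task(name: str) -> str:
    """Placeholder for real work; returns a marker string."""
    return f"{name} done"


def run_graph(dependencies: dict[str, set[str]]) -> list[list[str]]:
    """Run tasks in waves; all tasks in a wave have their dependencies met."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    waves = []
    with ThreadPoolExecutor() as pool:
        while sorter.is_active():
            ready = sorted(sorter.get_ready())
            # Tasks in the same wave are independent, so run them concurrently.
            list(pool.map(run_task, ready))
            sorter.done(*ready)
            waves.append(ready)
    return waves


waves = run_graph(DEPENDENCIES)
print(waves)
```

Here `train` and `evaluate` land in the same wave because both depend only on `preprocess`; no explicit parallel construct is needed beyond declaring the dependencies.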


## Spherinator Workflow with Flyte

```python
{{< include code/flyte_example.py >}}
```


## StreamFlow

- [Common Workflow Language (CWL)](https://www.commonwl.org/) as open standard
- [BioExcel CoE Building Blocks](https://bioexcel.eu/biobb-new/) uses CWL for interoperable and reproducible biomolecular simulation workflows
- StreamFlow connects CWL with HPC
- Supports SLURM, Singularity and Containers
- [Web service for visualization](https://view.commonwl.org/)

<!-- - VSCode extension: [benten-cwl](https://marketplace.visualstudio.com/items?itemName=sbg-rabix.benten-cwl) -->


## StreamFlow Architecture

![](images/streamflow-model.png){height="500"}


## Supported Features

- [Parallelization](https://www.commonwl.org/features/#parallelization-and-scale-with-cwl)
- [Caching](https://www.commonwl.org/user_guide/topics/troubleshooting.html#run-cwltool-with-cachedir)
(vs [Flyte caching](https://docs.flyte.org/en/latest/user_guide/development_lifecycle/caching.html#caching))
- Nested workflows
- Looping / scattering tasks
- Conditional workflows
- VSCode extension: [benten-cwl](https://marketplace.visualstudio.com/items?itemName=sbg-rabix.benten-cwl)
- [Web service for visualization](https://view.commonwl.org/)
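
The caching feature listed above rests on a general idea: reuse a stored result when the same step is run again with the same inputs. The sketch below illustrates that idea in plain Python under stated assumptions. It is not the implementation used by cwltool or Flyte, and `cache_key`, `cached_call`, and the `cache_version` argument are hypothetical names chosen for this example.

```python
import hashlib
import json

# In-memory cache for illustration; a real system would persist results.
_cache: dict[str, object] = {}
calls = {"count": 0}


def cache_key(task_name: str, cache_version: str, inputs: dict) -> str:
    """Derive a stable key from the task name, a version tag, and its inputs."""
    payload = json.dumps([task_name, cache_version, inputs], sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()


def cached_call(func, cache_version: str, **inputs):
    """Run func only if no result is stored for this name, version, and inputs."""
    key = cache_key(func.__name__, cache_version, inputs)
    if key not in _cache:
        _cache[key] = func(**inputs)
    return _cache[key]


def square(x: int) -> int:
    calls["count"] += 1  # count real executions to show cache hits
    return x * x


first = cached_call(square, "1.0", x=4)   # computed
second = cached_call(square, "1.0", x=4)  # served from the cache
third = cached_call(square, "2.0", x=4)   # new version tag forces a rerun
print(first, second, third, calls["count"])
```

Including a version tag in the key gives a simple way to invalidate stale results after the step's logic changes, without clearing the whole cache.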
3 changes: 3 additions & 0 deletions notes.md
@@ -0,0 +1,3 @@
## Containerizing your project

https://docs.flyte.org/en/latest/flyte_fundamentals/registering_workflows.html#containerizing-your-project
