add storage and streamflow example
BerndDoser committed Sep 18, 2024
1 parent d708edb commit 7f2ae48
Showing 4 changed files with 58 additions and 12 deletions.
12 changes: 3 additions & 9 deletions code/flyte_example.py
@@ -1,20 +1,14 @@
 from flytekit import task, workflow

-
-# Define a task that produces the string "Hello, World!"
-# by using the `@task` decorator to annotate the Python function
 @task
 def say_hello() -> str:
     return "Hello, World!"

-
-# Handle the output of a task like that of a regular Python function.
 @workflow
-def hello_world_wf() -> str:
+def wf() -> str:
     res = say_hello()
     return res

-
-# Run the workflow locally by calling it like a Python function
+# Local execution
 if __name__ == "__main__":
-    print(f"Running hello_world_wf() {hello_world_wf()}")
+    print(f"Running wf() {wf()}")
17 changes: 17 additions & 0 deletions code/streamflow_example.cwl
@@ -0,0 +1,17 @@
cwlVersion: v1.2

# What type of CWL process we have in this document.
class: CommandLineTool
# This CommandLineTool executes the linux "echo" command-line tool.
baseCommand: echo

# The inputs for this process.
inputs:
  message:
    type: string
    # A default value that can be overridden, e.g. --message "Hola mundo"
    default: "Hello World"
    # Bind this message value as an argument to "echo".
    inputBinding:
      position: 1
outputs: []
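
For readers less familiar with CWL, the following Python sketch illustrates roughly what the tool description above asks a runner to do: start with `baseCommand`, then append each input that has an `inputBinding`, ordered by `position`, falling back to `default` when no value is supplied. It is a simplified illustration under those assumptions, not how cwltool or StreamFlow are implemented; the `TOOL` dictionary and the `build_command` helper are hypothetical names introduced here.

```python
from __future__ import annotations

import shlex

# Hypothetical, simplified stand-in for the CWL document above.
# Real runners also handle type checking, file staging, containers, and more.
TOOL = {
    "baseCommand": "echo",
    "inputs": {
        "message": {
            "type": "string",
            "default": "Hello World",
            "inputBinding": {"position": 1},
        },
    },
}


def build_command(tool: dict, job: dict) -> list[str]:
    """Assemble argv: baseCommand first, then bound inputs sorted by position."""
    bound = []
    for name, spec in tool["inputs"].items():
        binding = spec.get("inputBinding")
        if binding is None:
            continue  # inputs without a binding are not placed on the command line
        value = job.get(name, spec.get("default"))
        if value is None:
            raise ValueError(f"missing required input: {name}")
        bound.append((binding.get("position", 0), str(value)))
    bound.sort(key=lambda item: item[0])
    return [tool["baseCommand"], *(value for _, value in bound)]


default_cmd = build_command(TOOL, {})
override_cmd = build_command(TOOL, {"message": "Hola mundo"})
print(shlex.join(default_cmd))
print(shlex.join(override_cmd))
```

Passing `{"message": "Hola mundo"}` mirrors the `--message "Hola mundo"` override mentioned in the comment of the tool description.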
36 changes: 33 additions & 3 deletions index.qmd
@@ -1,10 +1,10 @@
 ---
 title: Machine Learning Workflow Orchestration
 subtitle: Flyte & StreamFlow
-author: Bernd Doser (HITS)
+author: Bernd Doser
 date: 2024/10/07
 date-format: "MMMM YYYY"
-institute: "[HITS gGmbH](https://h-its.org)"
+institute: "[HITS](https://h-its.org)"
 format:
   revealjs:
     logo: images/HITS_RGB_eng.jpg
@@ -48,6 +48,29 @@ format:
![](images/flyte-ui_mnist-workflow.png)


## File, Block, and Object Storage

- Distinguish between storage type and storage access type (file, block, object)
- Throughput vs Latency
- Traditionally, object storage is throughput-optimized, whereas file and block storage are latency-optimized
- Object storage uses a RESTful API (cloud-native)
- File storage uses POSIX; block storage uses FC/SCSI
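
The difference between the two access paths can be sketched in Python. The snippet is an illustration only: the file part uses POSIX-style calls from the standard library on a temporary file, while the object part merely builds (and never sends) an HTTP request. The endpoint `objects.example.org`, the bucket, and the key are hypothetical, and whether a given object store honors the `Range` header is an assumption.

```python
import os
import tempfile
import urllib.request

# File access: POSIX-style calls on a path inside a directory hierarchy.
with tempfile.TemporaryDirectory() as root:
    path = os.path.join(root, "dataset", "sample.bin")
    os.makedirs(os.path.dirname(path))
    with open(path, "wb") as handle:
        handle.write(b"0123456789")

    fd = os.open(path, os.O_RDONLY)
    try:
        os.lseek(fd, 4, os.SEEK_SET)  # move to byte offset 4
        chunk = os.read(fd, 3)        # read the next 3 bytes
    finally:
        os.close(fd)

# Object access: an HTTP request addressed to a key.
# Hypothetical endpoint, bucket, and key; the request is built but not sent.
request = urllib.request.Request(
    "https://objects.example.org/my-bucket/dataset/sample.bin",
    method="GET",
    headers={"Range": "bytes=4-6"},
)

print(chunk)
print(request.get_method(), request.full_url)
```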

## Block Storage

- Data is divided into uniformly sized blocks
- Blocks can be stored across different storage environments
- Blocks do not contain any information about the content
- local SSD
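
A minimal sketch of the block idea, assuming a toy block size of 4 bytes and two dictionaries standing in for separate storage devices (both hypothetical; real block devices typically use sizes such as 512 bytes or 4 KiB). It shows that blocks are addressed by index only and carry no information about the content they belong to.

```python
from __future__ import annotations

BLOCK_SIZE = 4  # bytes; hypothetical toy value


def split_into_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> list[bytes]:
    """Cut data into fixed-size blocks, zero-padding the last one."""
    blocks = []
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        blocks.append(block.ljust(block_size, b"\x00"))
    return blocks


payload = b"Hello, World!"
blocks = split_into_blocks(payload)

# Hypothetical placement: spread the blocks round-robin over two "devices".
# Each device only maps a block index to raw bytes.
devices: dict[int, dict[int, bytes]] = {0: {}, 1: {}}
for index, block in enumerate(blocks):
    devices[index % 2][index] = block

# Reassembly needs external bookkeeping: the block order and the original length.
restored = b"".join(
    devices[index % 2][index] for index in range(len(blocks))
)[:len(payload)]

print(len(blocks), restored)
```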

## File Storage

- Hierarchical structure of files and directories
- NFS

## Feed the Beast


## Flyte vs StreamFlow

:::: {.columns}
@@ -80,7 +103,7 @@ format:
- Kubernetes agent: GPU node with 4x NVIDIA A40 cards


-## Spherinator Workflow with Flyte
+## Flyte: Example

```python
{{< include code/flyte_example.py >}}
@@ -113,3 +136,10 @@ format:
- Conditional workflows
- VSCode extension: [benten-cwl](https://marketplace.visualstudio.com/items?itemName=sbg-rabix.benten-cwl)
- [Web service for visualization](https://view.commonwl.org/) -->


## StreamFlow: Example

```yaml
{{< include code/streamflow_example.cwl >}}
```
5 changes: 5 additions & 0 deletions notes.md
@@ -1,3 +1,8 @@
## Storage

https://cloud.google.com/architecture/ai-ml/storage-for-ai-ml?hl=de#storage-options


## Containerizing your project

https://docs.flyte.org/en/latest/flyte_fundamentals/registering_workflows.html#containerizing-your-project
