Clustering example #1

Merged: 2 commits merged into main on Dec 13, 2024

DanInci
Contributor

@DanInci DanInci commented Oct 29, 2024

TODO list for example clustering:

  • Create benchmark YAML
  • Implement method modules in custom repositories
  • Update software backend
  • Set up storage
  • Update documentation on website

Note for reviewer:

  • Please also review the example module repositories for clustering, since the analysis I wrote there is pretty basic. They are available here.

Topology plot:

---
title: clustering_benchmark
---
flowchart LR
	classDef param fill:#f96
	subgraph data
		iris
		penguins
	end
	subgraph distances
		D1
		iris --> D1
		penguins --> D1
	end
	subgraph methods
		kmeans
		D1 --> kmeans
		ward
		D1 --> ward
	end
	subgraph metrics
		ari
		kmeans --> ari
		ward --> ari
		accuracy
		kmeans --> accuracy
		ward --> accuracy
	end
	subgraph params_D1
		8948341830810125333['--measure', 'cosine']
		-6750798201292827140['--measure', 'euclidean']
		-5289070520883795694['--measure', 'manhattan']
		1540997255918402294['--measure', 'chebyshev']
	end
	params_D1:::param --o D1
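
As a rough illustration of what the topology implies, the sketch below enumerates the paths from dataset to metric. It assumes that every dataset is combined with every D1 parameter set, every method and every metric (a full Cartesian product); whether the benchmark runner expands the stages exactly this way is an assumption, not something stated in this thread.

```python
from itertools import product

# Stage contents copied from the topology plot above.
datasets = ["iris", "penguins"]
measures = ["cosine", "euclidean", "manhattan", "chebyshev"]  # params_D1
methods = ["kmeans", "ward"]
metrics = ["ari", "accuracy"]

# Assumption: the stages are expanded as a full Cartesian product.
paths = [
    f"{data} -> D1(--measure {measure}) -> {method} -> {metric}"
    for data, measure, method, metric in product(datasets, measures, methods, metrics)
]

print(len(paths))  # 2 datasets * 4 measures * 2 methods * 2 metrics
print(paths[0])
```

Under that assumption the plot corresponds to 32 metric outputs in total.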

Computational plot:
clustering

@DanInci DanInci self-assigned this Oct 29, 2024
@DanInci DanInci requested a review from imallona October 29, 2024 19:57
@retogerber
Contributor

Added the possibility to use remote storage with play.min.io.
Usage (currently only works with the io_tests branch, https://github.com/omnibenchmark/omnibenchmark/tree/io_tests):
OB_STORAGE_S3_CONFIG=.play_minio.json ob run benchmark -b Clustering.yaml
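
Before launching a run, it can help to confirm that the variable really points to a readable file. The following is a minimal pre-flight sketch, not how omnibenchmark itself reads the configuration; `load_s3_config` is a hypothetical helper, and the only assumption about the file is that it is JSON (suggested by its extension). No particular keys are assumed.

```python
import json
import os
from pathlib import Path


def load_s3_config(env_var: str = "OB_STORAGE_S3_CONFIG") -> dict:
    """Return the JSON document that the environment variable points to.

    Hypothetical helper for a pre-flight check; it does not validate keys.
    """
    config_path = os.environ.get(env_var)
    if not config_path:
        raise RuntimeError(f"{env_var} is not set")
    path = Path(config_path)
    if not path.is_file():
        raise FileNotFoundError(f"{env_var} points to a missing file: {path}")
    with path.open() as handle:
        return json.load(handle)
```

Calling `load_s3_config()` in the same shell session as the `ob run benchmark` command fails early with a clear message if the variable is unset or the path is wrong.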

@imallona
Member

Cross-posting omnibenchmark/omnibenchmark#35

@imallona
Member

imallona commented Nov 1, 2024

omnibenchmark/omnibenchmark#36 patches the run using envmodules, e.g. when running snakemake -s Snakefile --use-envmodules once the Snakefile has been generated by ob run benchmark -b Clustering.yaml --local (see note), with omnibenchmark/omnibenchmark@e6dab70 and Clustering.yaml f636fe5. Currently, the validator checks whether the envmodules exist as files. There are (lua) files defining the envmodules somewhere, but I think it is easier to check via module avail (see the drafted PR).
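
To make the module avail idea concrete, here is a small sketch of such a check. It is not the code from the drafted PR: `module_names` and `envmodule_available` are hypothetical helpers, and the parsing assumes the terse listing (`module -t avail`) prints one entry per line, with directory headers ending in a colon and an optional `(default)` suffix. It also assumes a login shell defines the `module` function, which depends on the system.

```python
import subprocess


def module_names(terse_avail_output: str) -> set[str]:
    """Collect module names from terse `module -t avail` output.

    Assumed format: one entry per line, directory headers end with ':'.
    """
    names = set()
    for line in terse_avail_output.splitlines():
        entry = line.strip()
        if not entry or entry.endswith(":"):
            continue  # skip blank lines and modulepath headers
        entry = entry.split("(")[0].rstrip("/")  # drop "(default)" and trailing "/"
        names.add(entry)                 # full "name/version"
        names.add(entry.split("/")[0])   # bare "name"
    return names


def envmodule_available(name: str) -> bool:
    """Ask the module system whether `name` is listed (hypothetical helper)."""
    # `module` is a shell function, so it has to run inside a shell;
    # the listing is often written to stderr, hence the redirect.
    result = subprocess.run(
        ["bash", "-lc", "module -t avail 2>&1"],
        capture_output=True, text=True, check=False,
    )
    return name in module_names(result.stdout)
```

Keeping the parsing separate from the subprocess call means the check can be unit-tested on machines without a module system.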

Note: ob run benchmark with envmodules does not work directly, but that is an independent issue, probably caused by snakemake not propagating the MODULEPATH environment variable to child processes; opening a PR.
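
The propagation problem suspected in the note can be illustrated in isolation. The sketch below is not the fix from the PR and does not involve snakemake; it only shows a parent process passing MODULEPATH explicitly to a child. `run_with_modulepath` and the path `/hypothetical/modulefiles` are made up for illustration.

```python
import os
import subprocess
import sys


def run_with_modulepath(command: list[str], modulepath: str) -> subprocess.CompletedProcess:
    """Run `command` with MODULEPATH set explicitly in the child's environment."""
    child_env = os.environ.copy()          # start from the parent's environment
    child_env["MODULEPATH"] = modulepath   # make sure the child sees the variable
    return subprocess.run(
        command, env=child_env, capture_output=True, text=True, check=True
    )


# The child is a tiny Python process that echoes the variable it received.
result = run_with_modulepath(
    [sys.executable, "-c", "import os; print(os.environ['MODULEPATH'])"],
    "/hypothetical/modulefiles",  # hypothetical path, for illustration only
)
print(result.stdout.strip())
```

If a child launched this way sees the variable while one launched through the workflow does not, that points at the launcher dropping or overriding the environment.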

@DanInci
Contributor Author

DanInci commented Nov 4, 2024

@retogerber Aren't these sensitive tokens, or are they just part of a free sandbox environment?
Would it make sense to store them as GitHub secrets and only use the remote storage examples in the CI jobs?

@retogerber
Contributor

@DanInci those are not sensitive; they come from the official MinIO sandbox, see e.g. https://min.io/docs/minio/linux/developers/python/minio-py.html

@DanInci
Contributor Author

DanInci commented Nov 5, 2024

Alright great, just double checking.

@imallona imallona merged commit d3c392c into main Dec 13, 2024