Added roadmap for containerization

automl · Nov 19, 2021 · 8ab4d16 · 8ab4d16
1 parent 18a6810
commit 8ab4d16
Showing 2 changed files with 72 additions and 0 deletions.
diff --git a/dacbench/container/Container Roadmap.md b/dacbench/container/Container Roadmap.md
@@ -0,0 +1,72 @@
+# Container Roadmap
+
+This document describes how we want to use containerization and what needs to be implemented.
+
+There is also a project in the repo called [Containerization](https://github.com/automl/DACBench/projects/2),
+containing more fine-grained tasks and descriptions. This document serves as an overview of the project.
+
+## Purpose / Requirements
+We want to use containers, more precisely [Singularity containers](https://singularity.hpcng.org/), in order to:
+
+1. Make the experiments (more) reproducible: reduce the dependency on external tools such as compilers and interpreters, and on hardware
+2. Make execution easier: no need to install everything manually; just download DACBench, and it will automatically install the container on request
+3. Ensure the same versions of dependencies and DACBench for the same experiments: publish container versions for each experiment / publication
+4. Enable the coexistence of benchmarks with conflicting dependencies: through separate containers
+
+This includes:
+
+* The benchmarks 
+* The baselines
+
+Additional requirements are: 
+* The user should not have to deal with the container directly (except installing the container system)
+* No need for `root` to run the container (rules out Docker)
+* Low overhead
+
+## Architecture
+To fulfill these requirements we adapt the architecture introduced in [HPOBench](https://github.com/automl/HPOBench). 
+
+For questions and support ask: 
+* Philipp Mueller ([email protected])
+* Katharina Eggensperger ([email protected])
+who kindly offered their help.
+
+The main idea is to run the components that either have complicated dependencies or must be reproducible inside a container, together with a server that exposes their objects to the outside via HTTP / sockets. On the client side, a wrapper for these objects automatically retrieves and starts the relevant container and acts as a proxy, so that the user does not notice that they are communicating with a component inside a container.
+
+![architecture overview](architecture.png)
+
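The server-plus-proxy idea described above can be illustrated with a small, self-contained sketch. It is not the DACBench or HPOBench implementation: `ToyEnvironment`, `start_server`, `EnvironmentProxy` and `run_episode` are hypothetical names, the "container" is just a background thread, and the transport is Python's standard-library XML-RPC over HTTP, chosen only because it needs no extra dependencies.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


class ToyEnvironment:
    """Hypothetical stand-in for a benchmark environment inside the container."""

    def __init__(self):
        self.steps = 0

    def reset(self):
        self.steps = 0
        return [0.0]

    def step(self, action):
        self.steps += 1
        state = [float(self.steps)]
        reward = -abs(action)
        done = self.steps >= 3
        return state, reward, done


def start_server(env):
    """Expose the environment's public methods over HTTP on a free local port."""
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False, allow_none=True)
    server.register_instance(env)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


class EnvironmentProxy:
    """Client-side wrapper: forwards calls so the caller sees a normal environment."""

    def __init__(self, url):
        self._remote = ServerProxy(url, allow_none=True)

    def reset(self):
        return self._remote.reset()

    def step(self, action):
        # XML-RPC returns the tuple as a list; unpack it back into three values.
        state, reward, done = self._remote.step(action)
        return state, reward, done


def run_episode():
    """Run one episode through the proxy and return the visited states."""
    server = start_server(ToyEnvironment())
    host, port = server.server_address
    env = EnvironmentProxy(f"http://{host}:{port}")
    trajectory = [env.reset()]
    done = False
    while not done:
        state, reward, done = env.step(1)
        trajectory.append(state)
    server.shutdown()
    server.server_close()
    return trajectory
```

The calling code only touches `EnvironmentProxy`, which mirrors the goal that the user should not have to deal with the container directly. In a real setup the server would run inside the container and the wrapper would also be responsible for fetching and starting it.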
+Workflow of remote benchmark execution:
+```python
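+# the import path for RemoteRunner is an assumption based on the class list below
+from dacbench.benchmarks import SigmoidBenchmark
+from dacbench.container import RemoteRunner
+
+benchmark = SigmoidBenchmark()
+# adapt the default config or load it from a file
+benchmark.set_seed(42)
+
+# gets and starts the container for the benchmark version of a specific experiment;
+# this also defines what is logged and which wrappers are used
+# (may be improved / made configurable later)
+remote_runner = RemoteRunner(benchmark, experiement_identifier="exp:0.01")
+
+# set up an agent; for the baselines we also need a containerized version (todo)
+# agent_creation_function is a placeholder for user code
+agent = agent_creation_function(remote_runner.get_environment())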
+
+# run the experiment for n episodes
+# logs are written to local file and are retrievable afterwards
+remote_runner.run(agent, number_of_episodes=10)
+```
+
+Classes (todo):
+* `dacbench.container.RemoteRunner` 
+* `dacbench.container.RemoteRunnerServer`
+* `dacbench.container.RemoteEnvironmentClient`
+* `dacbench.container.RemoteEnvironmentServer`
+
+Todos:
+* [ ] Implement container setup and download for benchmarks
+* [ ] Unify the way serialization is handled (currently in the benchmark and in the environment)
+* [ ] Switch communication to sockets (currently via HTTP)
+* [ ] Set up a container registry
+* [ ] Make dependencies separately installable for each benchmark and remove all benchmark dependencies from the default installation, since the default is to run in a container? (or add a container extra)
+* [ ] Add a command line interface for the remote runner / integrate with `dacbench.runner.run()`. Proposed solution: add a common base class for Runner and RemoteRunner that handles argument parsing and defines the interface for the method `run()`
+* [ ] Improve experiment setup (currently only one experiment is hardcoded in `RemoteRunnerServer.get_environment()`)
+* [ ] Measure performance of the containerized version vs. the non-containerized version
+* [ ] Add a guide on how to build your own containers (also useful for internal usage)
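
One of the todos proposes a common base class for Runner and RemoteRunner that handles argument parsing and defines the interface for `run()`. The following is a minimal sketch of what such a class could look like. All names (`BaseRunner`, `LocalRunner`, `ContainerRunner`) and the command line flags are hypothetical and do not reflect the actual DACBench API, and the `run()` bodies are placeholders.

```python
import argparse
from abc import ABC, abstractmethod


class BaseRunner(ABC):
    """Hypothetical shared base class: parses CLI arguments and defines run()."""

    @staticmethod
    def parse_args(argv=None):
        # The flags below are illustrative assumptions, not an existing CLI.
        parser = argparse.ArgumentParser(description="Run a benchmark experiment")
        parser.add_argument("--benchmark", required=True)
        parser.add_argument("--episodes", type=int, default=10)
        parser.add_argument("--seed", type=int, default=42)
        return parser.parse_args(argv)

    @abstractmethod
    def run(self, agent, number_of_episodes):
        """Run the agent for the given number of episodes."""


class LocalRunner(BaseRunner):
    """Placeholder for a runner that executes in the current process."""

    def run(self, agent, number_of_episodes):
        return [f"local episode {i}" for i in range(number_of_episodes)]


class ContainerRunner(BaseRunner):
    """Placeholder for a runner that would delegate to a container."""

    def run(self, agent, number_of_episodes):
        return [f"remote episode {i}" for i in range(number_of_episodes)]
```

With this layout, a command line entry point could parse its arguments once via `BaseRunner.parse_args()` and then choose either subclass, since both share the same `run(agent, number_of_episodes)` signature.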
diff --git a/dacbench/container/architecture.png b/dacbench/container/architecture.png