Merge pull request #71 from automl/master

Get development up to date with master
automl · Jan 22, 2024 · 8223638
2 parents 1ff61a0 + 9ed23f6

Showing 7 changed files with 151 additions and 148 deletions.
diff --git a/.github/workflows/docs.yml b/.github/workflows/docs.yml
@@ -22,10 +22,21 @@ jobs:
       - uses: actions/checkout@v3
       - uses: actions/setup-python@v4
         with:
-          python-version: 3.x
+          python-version: 3.10.x
       - uses: actions/cache@v3
         with:
           key: ${{ github.ref }}
           path: .cache
-      - run: pip install "mkdocs-material" "mkdocstrings[python]"
-      - run: mkdocs gh-deploy --force
+      - uses: SebRollen/[email protected]
+        id: read_toml
+        with:
+          file: 'pyproject.toml'
+          field: 'project.version'
+      - run: pip install ".[dev]"
+      - name: Configure Git user
+        run: |
+          git config --local user.email "github-actions[bot]@users.noreply.github.com"
+          git config --local user.name "github-actions[bot]"
+      - run: git fetch origin gh-pages --depth=1
+      - run: mike deploy --push --update-aliases ${{ steps.read_toml.outputs.value }} latest
+      - run: mike set-default --push latest
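
Reviewer note on the `read_toml` step above: the sketch below shows, in plain Python, the kind of lookup that step appears to perform, namely resolving the dotted field `project.version` in `pyproject.toml` and substituting the result into the `mike deploy` command. The helper name `read_field` and the version string `0.0.0` are hypothetical placeholders, and this is not the action's actual implementation.

```python
try:
    import tomllib  # standard library from Python 3.11 onwards
except ModuleNotFoundError:
    import tomli as tomllib  # backport; assumed to be installed on older Pythons


def read_field(toml_text: str, dotted_field: str):
    """Walk a parsed TOML document along a dotted path such as 'project.version'."""
    data = tomllib.loads(toml_text)
    for key in dotted_field.split("."):
        data = data[key]
    return data


# Hypothetical pyproject.toml content; the real version number is not shown in this diff.
EXAMPLE_PYPROJECT = """
[project]
name = "dehb"
version = "0.0.0"
"""

version = read_field(EXAMPLE_PYPROJECT, "project.version")
# Mirrors how the workflow interpolates the step output into the deploy command.
print(f"mike deploy --push --update-aliases {version} latest")
```

Under this reading, each docs deployment is published under the package version and the `latest` alias is moved to point at it.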
diff --git a/README.md b/README.md
@@ -5,6 +5,7 @@
 [![Coverage Status](https://coveralls.io/repos/github/automl/DEHB/badge.svg)](https://coveralls.io/github/automl/DEHB)
 [![PyPI](https://img.shields.io/pypi/v/dehb)](https://pypi.org/project/dehb/)
 [![Static Badge](https://img.shields.io/badge/python-3.8%20%7C%203.9%20%7C%203.10%20%7C%203.11%20-blue)](https://pypi.org/project/dehb/)
+[![arXiv](https://img.shields.io/badge/arXiv-2105.09821-b31b1b.svg)](https://arxiv.org/abs/2105.09821)
 ### Installation
 ```bash
 # from pypi
@@ -142,10 +143,3 @@ represents the *mutation* strategy while `bin` represents the *binomial crossove
   editor    = {Z. Zhou},
   year      = {2021}
 }
-
-@online{Awad-arXiv-2023,
-title       = {MO-DEHB: Evolutionary-based Hyperband for Multi-Objective Optimization},
-author      = {Noor Awad and Ayushi Sharma and Frank Hutter},
-year        = {2023},
-keywords    = {}
-}
diff --git a/docs/getting_started/parallel.md b/docs/getting_started/parallel.md
@@ -0,0 +1,72 @@
+### Running DEHB in a parallel setting
+
+DEHB has been designed to interface a [Dask client](https://distributed.dask.org/en/latest/api.html#distributed.Client).
+DEHB can either create a Dask client during instantiation and close/kill it during garbage collection, 
+or accept a client passed as an argument during instantiation.
+
+* Setting `n_workers` during instantiation \
+    If set to `1` (default) then the entire process is a sequential run without invoking Dask. \
+    If set to `>1` then a Dask Client is initialized with as many workers as `n_workers`. \
+    This parameter is ignored if `client` is not None.
+* Setting `client` during instantiation \
+    When `None` (default), a Dask client is created using `n_workers` specified. \
+    Else, any custom-configured Dask Client can be created and passed as the `client` argument to DEHB.
+
+#### Using GPUs in a parallel run
+
+Certain target function evaluations (especially for Deep Learning) require computations to be 
+carried out on GPUs. The GPU devices are often ordered by device ID and if not configured, all 
+spawned worker processes access these devices in the same order and can either run out of memory or
+not exhibit parallelism.
+
+For `n_workers>1` on a single node (or a local machine), the `single_node_with_gpus` argument can be 
+passed to DEHB's `run()` call. Setting it to `False` (default) leaves the machine's default device ordering 
+untouched. Setting it to `True` reorders the GPU device IDs dynamically by setting the environment 
+variable `CUDA_VISIBLE_DEVICES` for each worker process executing a target function evaluation. The devices 
+are reordered so that the first-priority device is the one with the fewest active jobs assigned 
+to it by that DEHB run.
+
+To run the PyTorch MNIST example on a single node using 2 workers:  
+```bash
+python examples/03_pytorch_mnist_hpo.py \
+    --min_budget 1 \
+    --max_budget 3 \
+    --runtime 60 \
+    --n_workers 2 \
+    --single_node_with_gpus \
+    --verbose
+```
+
+#### Multi-node runs
+
+Multi-node parallelism depends on the setup of the cluster DEHB is deployed on. Dask provides useful 
+frameworks for interfacing with various cluster designs. As long as the `client` passed to DEHB during 
+instantiation is of type `dask.distributed.Client`, DEHB can interact with this client and 
+distribute its optimization process in a parallel manner. 
+
+For instance, `Dask-CLI` can be used to create a `dask-scheduler` that dumps its connection 
+details to a file on a cluster node accessible to all processes. Multiple `dask-worker` processes can then be
+created and attached to the `dask-scheduler` using the connection details read from that file. Each
+`dask-worker` can be started on any remote machine and configured as required, 
+including mapping to specific GPU devices. 
+
+Some helper scripts in the [`utils` folder](https://github.com/automl/DEHB/tree/master/utils) can be used as a reference for running DEHB in a multi-node 
+manner on clusters managed by SLURM (*not expected to work off-the-shelf*).
+
+To run the PyTorch MNIST example on a multi-node setup using 4 workers:
+```bash
+# The scheduler file passed via -f is how the workers will be discovered by DEHB
+bash utils/run_dask_setup.sh \
+    -n 4 \
+    -f dask_dump/scheduler.json \
+    -e env_name
+
+# Make sure to sleep to allow the workers to set up properly
+sleep 5
+
+python examples/03_pytorch_mnist_hpo.py \
+    --min_budget 1 \
+    --max_budget 3 \
+    --runtime 60 \
+    --scheduler_file dask_dump/scheduler.json \
+    --verbose
+```
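
Reviewer note on the GPU section of `parallel.md` above: the reordering rule it describes (the least-loaded device comes first in `CUDA_VISIBLE_DEVICES`) can be sketched as follows. This is a minimal illustration under stated assumptions, not DEHB's actual implementation: the function names and the `active_jobs` bookkeeping dictionary are hypothetical, and ties are broken by device ID purely for determinism.

```python
from typing import Dict, List


def order_gpus_by_load(active_jobs: Dict[int, int]) -> List[int]:
    """Return device IDs sorted by ascending number of active jobs (ties by ID)."""
    return sorted(active_jobs, key=lambda device: (active_jobs[device], device))


def cuda_visible_devices(active_jobs: Dict[int, int]) -> str:
    """Build the comma-separated value a worker would export as CUDA_VISIBLE_DEVICES."""
    return ",".join(str(device) for device in order_gpus_by_load(active_jobs))


# Hypothetical bookkeeping: device ID -> jobs currently assigned by this run.
active_jobs = {0: 2, 1: 0, 2: 1}

# Environment for the next worker; GPU 1 is idle, so it is listed first.
worker_env = {"CUDA_VISIBLE_DEVICES": cuda_visible_devices(active_jobs)}
print(worker_env)
```

Because CUDA frameworks typically default to the first visible device, listing the least-loaded GPU first spreads concurrent evaluations across devices without changing the target function.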
diff --git a/docs/getting_started/single_worker.md b/docs/getting_started/single_worker.md
@@ -0,0 +1,52 @@
+### Basic single worker setup
+A basic optimization setup looks as follows. Please note that this example only shows a simple setup of `dehb`. More in-depth examples can be found in the [examples folder](https://github.com/automl/DEHB/tree/master/examples). First, we need to set up a `ConfigurationSpace`, from which configurations will be sampled:
+
+```python exec="true" source="material-block" result="python" title="Configuration Space" session="someid"
+from ConfigSpace import ConfigurationSpace, Configuration
+
+cs = ConfigurationSpace({"x0": (3.0, 10.0), "x1": ["red", "green"]})
+print(cs)
+```
+
+Next, we need an `objective_function`, which we are aiming to optimize:
+```python exec="true" source="material-block" result="python" title="Objective Function" session="someid"
+import numpy as np
+
+def objective_function(x: Configuration, budget: float, **kwargs):
+    # Replace this with your actual objective value (y) and cost.
+    cost = (10 if x["x1"] == "red" else 100) + budget
+    y = x["x0"] + np.random.uniform()
+    return {"fitness": y, "cost": cost}
+
+sample_config = cs.sample_configuration()
+print(sample_config)
+
+result = objective_function(sample_config, budget=10)
+print(result)
+```
+
+Finally, we can set up our optimizer and run DEHB:
+
+```python exec="true" source="material-block" result="python" title="Running DEHB" session="someid"
+from dehb import DEHB
+
+dim = len(cs.get_hyperparameters())
+optimizer = DEHB(
+    f=objective_function,
+    cs=cs,
+    dimensions=dim,
+    min_budget=3,
+    max_budget=27,
+    eta=3,
+    n_workers=1,
+    output_path="./logs",
+)
+
+# Run optimization for 1 bracket. Output files will be saved to ./logs
+traj, runtime, history = optimizer.run(brackets=1, verbose=True)
+config, fitness, runtime, budget, _ = history[0]
+print("config", config)
+print("fitness", fitness)
+print("runtime", runtime)
+print("budget", budget)
+```
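
Reviewer note on the optimizer arguments in `single_worker.md` above: the sketch below shows how `min_budget=3`, `max_budget=27` and `eta=3` could translate into the budget levels evaluated within a bracket. It assumes the usual Hyperband-style geometric spacing, where successive budgets differ by a factor of `eta`; this diff does not confirm that `dehb` computes them exactly this way, and `budget_levels` is a hypothetical helper, not part of the package.

```python
import math
from typing import List


def budget_levels(min_budget: float, max_budget: float, eta: float) -> List[float]:
    """Geometrically spaced budgets from roughly min_budget up to max_budget."""
    # Number of levels: how many times max_budget can be divided by eta
    # before dropping below min_budget, plus the top level itself.
    ratio = math.log(max_budget / min_budget) / math.log(eta)
    n_levels = int(math.floor(ratio + 1e-9)) + 1  # small epsilon guards float error
    return [max_budget / eta**i for i in reversed(range(n_levels))]


print(budget_levels(min_budget=3, max_budget=27, eta=3))
```

Under this assumption, a larger `eta` yields fewer, more widely spaced budget levels, while a smaller `min_budget` adds cheaper low-budget evaluations at the bottom.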
diff --git a/docs/index.md b/docs/index.md
@@ -4,7 +4,7 @@
 
 `dehb` is a python package implementing the [DEHB](https://arxiv.org/abs/2105.09821) algorithm. It offers an intuitive interface to optimize user-defined problems using DEHB.
 
-This documentation explains how to use `dehb` and demonstrates its features. In the following section you will be guided how to install the `dehb` package and how to use it in your own projects. Examples with more hands-on material can be found in the [examples folder](../examples/).
+This documentation explains how to use `dehb` and demonstrates its features. In the following section you will be guided how to install the `dehb` package and how to use it in your own projects. Examples with more hands-on material can be found in the [examples folder](https://github.com/automl/DEHB/tree/master/examples).
 
 ## Installation
 
@@ -24,135 +24,8 @@ pip install dehb
     pip install -e DEHB  # -e stands for editable, lets you modify the code and rerun things
     ```
 
-## Getting Started
-
-In the following sections we provide some basic examplatory setup for running DEHB with a single worker or in a multi-worker setup.
-
-### Basic single worker setup
-A basic setup for optimizing can be done as follows. Please note, that this is example should solely show a simple setup of `dehb`. More in-depth examples can be found in the [examples folder](../examples/). First we need to setup a `ConfigurationSpace`, from which Configurations will be sampled:
-
-```python exec="true" source="material-block" result="python" title="Configuration Space" session="someid"
-from ConfigSpace import ConfigurationSpace, Configuration
-
-cs = ConfigurationSpace({"x0": (3.0, 10.0), "x1": ["red", "green"]})
-print(cs)
-```
-
-Next, we need an `object_function`, which we are aiming to optimize:
-```python exec="true" source="material-block" result="python" title="Configuration Space" session="someid"
-import numpy as np
-
-def objective_function(x: Configuration, fidelity: float, **kwargs):
-    # Replace this with your actual objective value (y) and cost.
-    cost = (10 if x["x1"] == "red" else 100) + fidelity
-    y = x["x0"] + np.random.uniform()
-    return {"fitness": y, "cost": x["x0"]}
-
-sample_config = cs.sample_configuration()
-print(sample_config)
-
-result = objective_function(sample_config, fidelity=10)
-print(result)
-```
-
-Finally, we can setup our optimizer and run DEHB:
-
-```python exec="true" source="material-block" result="python" title="Configuration Space" session="someid"
-from dehb import DEHB
-
-dim = len(cs.get_hyperparameters())
-optimizer = DEHB(
-    f=objective_function,
-    cs=cs,
-    dimensions=dim,
-    min_fidelity=3,
-    max_fidelity=27,
-    eta=3,
-    n_workers=1,
-    output_path="./logs",
-)
-
-# Run optimization for 1 bracket. Output files will be saved to ./logs
-traj, runtime, history = optimizer.run(brackets=1, verbose=True)
-config, fitness, runtime, fidelity, _ = history[0]
-print("config", config)
-print("fitness", fitness)
-print("runtime", runtime)
-print("fidelity", fidelity)
-```
-
-### Running DEHB in a parallel setting
-
-DEHB has been designed to interface a [Dask client](https://distributed.dask.org/en/latest/api.html#distributed.Client).
-DEHB can either create a Dask client during instantiation and close/kill the client during garbage collection. 
-Or a client can be passed as an argument during instantiation.
-
-* Setting `n_workers` during instantiation \
-    If set to `1` (default) then the entire process is a sequential run without invoking Dask. \
-    If set to `>1` then a Dask Client is initialized with as many workers as `n_workers`. \
-    This parameter is ignored if `client` is not None.
-* Setting `client` during instantiation \
-    When `None` (default), a Dask client is created using `n_workers` specified. \
-    Else, any custom-configured Dask Client can be created and passed as the `client` argument to DEHB.
-
-#### Using GPUs in a parallel run
-
-Certain target function evaluations (especially for Deep Learning) require computations to be 
-carried out on GPUs. The GPU devices are often ordered by device ID and if not configured, all 
-spawned worker processes access these devices in the same order and can either run out of memory or
-not exhibit parallelism.
-
-For `n_workers>1` and when running on a single node (or local), the `single_node_with_gpus` can be 
-passed to the `run()` call to DEHB. Setting it to `False` (default) has no effect on the default setup 
-of the machine. Setting it to `True` will reorder the GPU device IDs dynamically by setting the environment 
-variable `CUDA_VISIBLE_DEVICES` for each worker process executing a target function evaluation. The re-ordering 
-is done in a manner that the first priority device is the one with the least number of active jobs assigned 
-to it by that DEHB run.
-
-To run the PyTorch MNIST example on a single node using 2 workers:  
-```bash
-python examples/03_pytorch_mnist_hpo.py \
-    --min_fidelity 1 \
-    --max_fidelity 3 \
-    --runtime 60 \
-    --n_workers 2 \
-    --single_node_with_gpus \
-    --verbose
-```
-
-#### Multi-node runs
-
-Multi-node parallelism is often contingent on the cluster setup to be deployed on. Dask provides useful 
-frameworks to interface various cluster designs. As long as the `client` passed to DEHB during 
-instantiation is of type `dask.distributed.Client`, DEHB can interact with this client and 
-distribute its optimization process in a parallel manner. 
-
-For instance, `Dask-CLI` can be used to create a `dask-scheduler` which can dump its connection 
-details to a file on a cluster node accessible to all processes. Multiple `dask-worker` can then be
-created to interface the `dask-scheduler` by connecting to the details read from the file dumped. Each
-dask-worker can be triggered on any remote machine. Each worker can be configured as required, 
-including mapping to specific GPU devices. 
-
-Some helper scripts can be found [here](../utils/), that can be used as a reference to run DEHB in a multi-node 
-manner on clusters managed by SLURM. (*not expected to work off-the-shelf*)
-
-To run the PyTorch MNIST example on a multi-node setup using 4 workers:
-```bash
-bash utils/run_dask_setup.sh \
-    -n 4 \
-    -f dask_dump/scheduler.json \   # This is how the workers will be discovered by DEHB
-    -e env_name
-
-# Make sure to sleep to allow the workers to setup properly
-sleep 5
-
-python examples/03_pytorch_mnist_hpo.py \
-    --min_fidelity 1 \
-    --max_fidelity 3 \
-    --runtime 60 \
-    --scheduler_file dask_dump/scheduler.json \
-    --verbose
-```
+## Contributing
+Please have a look at our [contributing guidelines](https://github.com/automl/DEHB/blob/master/CONTRIBUTING.md).
 
 ## To cite the paper or code
 If you use DEHB in one of your research projects, please cite our paper(s):
@@ -167,11 +40,4 @@ If you use DEHB in one of your research projects, please cite our paper(s):
   editor    = {Z. Zhou},
   year      = {2021}
 }
-
-@online{Awad-arXiv-2023,
-    title       = {MO-DEHB: Evolutionary-based Hyperband for Multi-Objective Optimization},
-    author      = {Noor Awad and Ayushi Sharma and Frank Hutter},
-    year        = {2023},
-    keywords    = {}
-}
 ```
diff --git a/mkdocs.yml b/mkdocs.yml
@@ -2,6 +2,9 @@ site_name: DEHB
 
 nav:
   - Home: index.md
+  - Getting Started:
+    - Single Worker: getting_started/single_worker.md
+    - Parallel: getting_started/parallel.md
   - Code Reference:
     - DEHB: references/dehb.md
     - DE: references/de.md
@@ -96,4 +99,8 @@ plugins:
             members_order: "source"
             show_signature: true
             separate_signature: false
-            show_signature_annotations: false
+            show_signature_annotations: false
+
+extra:
+  version:
+    provider: mike
diff --git a/pyproject.toml b/pyproject.toml
@@ -46,6 +46,7 @@ dev = [
   "mkdocs-material",
   "mkdocstrings[python]",
   "markdown-exec[ansi]",
+  "mike",
   # Others
   "ruff",
   "black",