9 changes: 9 additions & 0 deletions Ironwood/configs/host_device/host_device_single_chip.yaml
@@ -0,0 +1,9 @@
benchmarks:
- benchmark_name: host_device
  num_runs: 20
  benchmark_sweep_params:
  # Single Chip (1 Chip, 2 Devices)
  - {mesh_shape: "1x2", data_size_mb_list: [1, 16, 128, 256, 512, 1024, 2048, 4096, 8192, 16384, 32768]}

  csv_path: "../microbenchmarks/host_device/single_chip"
  trace_dir: "../microbenchmarks/host_device/single_chip/trace"
30 changes: 30 additions & 0 deletions Ironwood/guides/host_device/host_device.md
@@ -0,0 +1,30 @@
# Host Device Microbenchmarks on tpu7x-2x2x1

This guide provides instructions for running Host Device (Host-to-Device and Device-to-Host) microbenchmarks on tpu7x-2x2x1 Google Kubernetes Engine (GKE) clusters. It covers creating a node pool, running the benchmarks, and viewing the output.

> [!NOTE]
> This benchmark is currently a Work In Progress (WIP). Expected bandwidth numbers are not yet finalized.

## Create Node Pools

Follow the [Setup section](../../Ironwood_Microbenchmarks_readme.md#setup) to create a GKE cluster with one 2x2x1 node pool.
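
Before applying the benchmark pod, it can help to confirm that the node pool exposes the labels the pod's `nodeSelector` expects. The following is a minimal sketch, assuming `kubectl` is already configured for the cluster. The `KUBECTL` variable and the `list_tpu_nodes` function are hypothetical names introduced here for illustration; they are not part of the repository.

```bash
# KUBECTL is a hypothetical override; it defaults to the real kubectl binary.
KUBECTL="${KUBECTL:-kubectl}"

# List nodes that carry both labels used by the benchmark pod's nodeSelector.
list_tpu_nodes() {
  "$KUBECTL" get nodes \
    -l "cloud.google.com/gke-tpu-accelerator=tpu7x,cloud.google.com/gke-tpu-topology=2x2x1"
}
```

If `list_tpu_nodes` returns no nodes, the benchmark pod would likely stay in the `Pending` state, because no node satisfies its selector.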

## Run Host Device Microbenchmarks

To run the microbenchmarks, apply the following Kubernetes configuration:
```bash
kubectl apply -f tpu7x-host-device-benchmark.yaml
```

To view the benchmark logs, use `kubectl logs`:
```bash
kubectl logs tpu7x-host-device-benchmark
```
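
While the benchmark is still running, the pod phase and a live log stream are often more useful than a one-off `kubectl logs` call. A small sketch under the same assumptions as the commands above; `KUBECTL`, `POD`, `pod_phase`, and `follow_logs` are hypothetical names used only for this example.

```bash
# KUBECTL is a hypothetical override; it defaults to the real kubectl binary.
KUBECTL="${KUBECTL:-kubectl}"
POD="tpu7x-host-device-benchmark"

# Print the pod phase (for example Pending, Running, Succeeded, or Failed).
pod_phase() {
  "$KUBECTL" get pod "$POD" -o jsonpath='{.status.phase}'
}

# Stream the benchmark logs until the container exits.
follow_logs() {
  "$KUBECTL" logs -f "$POD"
}
```

Calling `pod_phase` first shows whether the pod has been scheduled; `follow_logs` then streams output as each config runs.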

Once the benchmark completes, you should see logs reporting bandwidth statistics.

To retrieve the complete results, including the trace and CSV output files, keep the pod running after the benchmark completes, because `kubectl cp` can only copy files from a running container. To do this, add a `sleep` command after the benchmark script invocation in the `tpu7x-host-device-benchmark.yaml` file. You can then use `kubectl cp` to copy the output from the pod:

```bash
kubectl cp tpu7x-host-device-benchmark:/microbenchmarks/host_device host_device
```
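
One way to add the `sleep` is sketched below. Because the pod command runs under `set -e`, a plain `sleep` placed after the benchmark line would be skipped if the benchmark fails, so this sketch records the exit status first and sleeps either way. The `run_then_hold` function and the hold duration are hypothetical and are not part of the repository.

```bash
# Hypothetical helper: run a command, then keep the shell (and thus the pod)
# alive for a while so output files can still be copied with kubectl cp.
run_then_hold() {
  local hold_seconds="$1"
  shift
  local status=0
  # "|| status=$?" keeps a failing command from aborting the shell under set -e.
  "$@" || status=$?
  echo "Command exited with status ${status}; holding for ${hold_seconds}s"
  sleep "${hold_seconds}"
  return "${status}"
}
```

In the pod's `command` block, the last line could then become, for example, `run_then_hold 3600 bash ./Ironwood/scripts/run_host_device_benchmark.sh`, with the function defined earlier in the same block.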
33 changes: 33 additions & 0 deletions Ironwood/guides/host_device/tpu7x-host-device-benchmark.yaml
@@ -0,0 +1,33 @@
apiVersion: v1
kind: Pod
metadata:
  name: tpu7x-host-device-benchmark
spec:
  restartPolicy: Never
  nodeSelector:
    cloud.google.com/gke-tpu-accelerator: tpu7x
    cloud.google.com/gke-tpu-topology: 2x2x1
  containers:
  - name: tpu-job
    image: python:3.12
    ports:
    - containerPort: 8431
    securityContext:
      privileged: false
    command:
    - bash
    - -c
    - |
      set -ex

      git clone https://github.com/AI-Hypercomputer/accelerator-microbenchmarks.git
      cd accelerator-microbenchmarks
      pip install -r requirements.txt

      bash ./Ironwood/scripts/run_host_device_benchmark.sh

    resources:
      requests:
        google.com/tpu: 4
      limits:
        google.com/tpu: 4
66 changes: 66 additions & 0 deletions Ironwood/scripts/run_host_device_benchmark.sh
@@ -0,0 +1,66 @@
#!/bin/bash

# Default values
CONFIG_DIR="Ironwood/configs/host_device"
SPECIFIC_CONFIG=""
INTERLEAVED=false

# Helper function for usage
usage() {
  echo "Usage: $0 [OPTIONS]"
  echo "Options:"
  echo "  --config <path>   Path to specific config file (optional)"
  echo "  --interleaved     Run with numactl --interleave=all"
  echo "  --help            Show this help message"
  exit 1
}

# Parse arguments
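while [[ "$#" -gt 0 ]]; do
  case "$1" in
    --config)
      # Require a value after --config before consuming it.
      if [[ -z "${2:-}" ]]; then
        echo "Error: --config requires a path argument"
        usage
      fi
      SPECIFIC_CONFIG="$2"
      shift
      ;;
    --interleaved) INTERLEAVED=true ;;
    --help) usage ;;
    *) echo "Unknown parameter passed: $1"; usage ;;
  esac
  shift
done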

echo "--- Starting Host-Device Transfer Benchmark (H2D/D2H) ---"
echo "Note: This benchmark is work in progress"
echo "Interleaved: $INTERLEAVED"

if [ -n "$SPECIFIC_CONFIG" ]; then
  CONFIGS=("$SPECIFIC_CONFIG")
else
  # Use nullglob to handle case where no files match (though unlikely here)
  shopt -s nullglob
  CONFIGS=("$CONFIG_DIR"/*.yaml)
  shopt -u nullglob
fi

if [ ${#CONFIGS[@]} -eq 0 ]; then
  echo "No configuration files found!"
  exit 1
fi

for CONFIG_FILE in "${CONFIGS[@]}"; do
  echo "--- Running Config: $CONFIG_FILE ---"
  # Build the command as an array so config paths containing spaces are not word-split.
  CMD=(python Ironwood/src/run_benchmark.py "--config=${CONFIG_FILE}")

  if [ "$INTERLEAVED" = true ]; then
    if command -v numactl &> /dev/null; then
      echo "Running with numactl --interleave=all"
      numactl --interleave=all "${CMD[@]}"
    else
      echo "Warning: numactl not found. Running without interleaving."
      "${CMD[@]}"
    fi
  else
    "${CMD[@]}"
  fi
  echo "--- Finished Config: $CONFIG_FILE ---"
  echo ""
done

echo "--- All Benchmarks Finished ---"
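
The script turns on `nullglob` before expanding `"$CONFIG_DIR"/*.yaml` so that an empty config directory produces an empty array, which the `${#CONFIGS[@]} -eq 0` check can then catch. The standalone sketch below illustrates that behavior with a throwaway temporary directory. It assumes Bash with default shell options (in particular, `failglob` off); the variable names are chosen for this example only.

```bash
#!/bin/bash
# Create an empty directory so the *.yaml pattern matches nothing.
empty_dir="$(mktemp -d)"

# Without nullglob, an unmatched pattern is kept as a literal string,
# so the array ends up with one bogus element.
shopt -u nullglob
without=("$empty_dir"/*.yaml)

# With nullglob, an unmatched pattern expands to nothing.
shopt -s nullglob
with=("$empty_dir"/*.yaml)
shopt -u nullglob

echo "without nullglob: ${#without[@]} element(s)"  # prints 1
echo "with nullglob: ${#with[@]} element(s)"        # prints 0

rmdir "$empty_dir"
```

Without `nullglob`, the run loop would pass the literal pattern to `run_benchmark.py` as if it were a config file instead of reporting that no configuration files were found.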