mines-opt-ml/jfbr

JFBR

Configuration & Trials

Configuration lives under cfg/<mode>/... where <mode> is one of:

  • sweep – hyperparameter exploration (multiple non-seed dimensions allowed; seed must be scalar)
  • experiment – focused evaluation (only the seed may vary; other hyperparameters must be scalar)

Directory pattern for a leaf config:

cfg/<mode>/<dataset>/<model>/<trainer>/cfg.yaml

Each leaf cfg.yaml is merged with its ancestor cfg.yaml files (walking upward until the cfg/ root) to form a base configuration. Merging is shallow: child keys overwrite parent keys.
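The shallow-merge semantics can be sketched as follows. This is an illustrative helper, not the project's actual implementation; note that because the merge is shallow, a child that overrides a nested dict replaces it wholesale rather than merging it recursively.

```python
def shallow_merge(parent: dict, child: dict) -> dict:
    """Shallow merge: child keys overwrite parent keys wholesale."""
    merged = dict(parent)
    merged.update(child)
    return merged

# Example (hypothetical keys): the leaf replaces the whole "optimizer"
# dict, so the parent's "lr" entry does not survive the merge.
root_cfg = {"optimizer": {"name": "sgd", "lr": 0.1}, "epochs": 10}
leaf_cfg = {"optimizer": {"name": "adam"}}
merged = shallow_merge(root_cfg, leaf_cfg)
```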

Expansion writes resolved per‑trial configurations to:

out/log/<mode>/<dataset>/<model>/<trainer>/trial_###/cfg.yaml

Trials are the Cartesian product of list‑valued hyperparameters excluding those on a structural allowlist (currently only batch_metrics). Special handling:

  • seed: list -> defines a sweep over seeds (each trial receives one scalar value)
  • seed: scalar -> required when no seed sweep is desired
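The Cartesian-product expansion above can be sketched like this (a simplified illustration, not the code in src.run; only batch_metrics is on the structural allowlist, matching the text):

```python
from itertools import product

STRUCTURAL_ALLOWLIST = {"batch_metrics"}  # list-valued but never a sweep axis

def expand_trials(cfg: dict) -> list[dict]:
    """Expand list-valued hyperparameters into per-trial scalar configs."""
    # Sweep axes are the list-valued keys outside the structural allowlist.
    axes = {k: v for k, v in cfg.items()
            if isinstance(v, list) and k not in STRUCTURAL_ALLOWLIST}
    fixed = {k: v for k, v in cfg.items() if k not in axes}
    if not axes:
        return [dict(cfg)]
    keys = list(axes)
    # One trial per element of the Cartesian product of all axes.
    return [{**fixed, **dict(zip(keys, combo))}
            for combo in product(*(axes[k] for k in keys))]
```

For example, a config with `lr: [0.1, 0.01]` and a scalar seed expands to two trials, each with a single scalar lr; a list under batch_metrics passes through untouched.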

Mode rules enforced during expansion:

  1. sweep mode: seed must be scalar; any other list hyperparameters are allowed.
  2. experiment mode: only seed may be list-valued; all other hyperparameters must be scalar (aside from structural allowlist like batch_metrics).
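The two mode rules amount to a simple validation pass. A minimal sketch (function name and error messages are hypothetical):

```python
STRUCTURAL_ALLOWLIST = {"batch_metrics"}

def validate_mode(cfg: dict, mode: str) -> None:
    """Raise ValueError if cfg violates the mode's list/scalar rules."""
    list_keys = {k for k, v in cfg.items()
                 if isinstance(v, list) and k not in STRUCTURAL_ALLOWLIST}
    if mode == "sweep":
        # Sweep: any hyperparameter may be a list, but seed must be scalar.
        if "seed" in list_keys:
            raise ValueError("sweep mode: seed must be scalar")
    elif mode == "experiment":
        # Experiment: only seed may be list-valued.
        extra = list_keys - {"seed"}
        if extra:
            raise ValueError(f"experiment mode: only seed may be a list, got {sorted(extra)}")
    else:
        raise ValueError(f"unknown mode: {mode}")
```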

Re‑running expansion is idempotent: resolved trial cfg.yaml files are regenerated (so parent edits propagate). A trial is considered complete when batch_log.csv exists in the same trial_### directory.
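The completion criterion is just a file-existence check; a sketch (helper name is hypothetical):

```python
from pathlib import Path

def is_trial_complete(trial_dir: Path) -> bool:
    """A trial counts as complete once batch_log.csv exists in its directory."""
    return (trial_dir / "batch_log.csv").exists()
```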

Summary:

  1. Hierarchical inheritance (leaf overrides parents).
  2. Lists -> sweep axes (except structural allowlist).
  3. A list-form seed defines a seed axis; else scalar seed required.
  4. Mode-specific constraints (see above) validated at expansion time.
  5. Resolved configs live only under out/log/... and drive all training & analysis.

Programmatic expansion + execution:

from src.run import run_trials
# All modes (sweep + experiment) across all datasets
run_trials()
# Only sweep mode for mnist
run_trials(datasets=["mnist"], modes=["sweep"])

During execution each trial directory accumulates logs (e.g. batch_log.csv). The dashboard and analysis scripts read directly from out/log.

Execution (uv)

This project is managed with uv. Always invoke Python modules and tests via uv run -m so the correct environment and dependency resolution are used.

Examples:

# Run sweep
uv run -m run.sweep

# Run experiment
uv run -m run.experiment

# Run full test suite
uv run -m pytest

Invoking python directly or executing files as scripts is discouraged; prefer the module form above so that imports resolve consistently.

Dashboard

The dashboard provides fast, interactive plots of training metrics across trials. Launch it with:

uv run -m run.dashboard

It serves dashboard/index.html and reads logs directly from out/log/.../trial_###/.

  • Epoch aggregation: If epoch_log.csv is missing for a trial, the server synthesizes it on-demand from batch_log.csv with one row per (epoch, mode). Reductions:

    • loss, acc, and other numeric metrics: mean across batches (unweighted)
    • lr, iter_budget: last value within the epoch
    • grad_norm, error_min/med/max, iter_min/med/max: median within the epoch
    • time: end-of-epoch time (max of batch times)
    • n_batches is included for diagnostics
  • Caching & freshness: The generated epoch_log.csv is reused for subsequent views. If batch_log.csv is newer, the server regenerates epoch_log.csv automatically. Writes are atomic to avoid partial files.
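The per-epoch reductions above can be sketched with pandas (an illustration of the described rules, not the server's actual code; column names follow the log fields mentioned in this README):

```python
import pandas as pd

def synthesize_epoch_log(batch_df: pd.DataFrame) -> pd.DataFrame:
    """Aggregate a batch log into one row per (epoch, mode)."""
    g = batch_df.groupby(["epoch", "mode"])
    out = pd.DataFrame({
        "loss": g["loss"].mean(),            # unweighted mean across batches
        "acc": g["acc"].mean(),
        "lr": g["lr"].last(),                # last value within the epoch
        "grad_norm": g["grad_norm"].median(),# median within the epoch
        "time": g["time"].max(),             # end-of-epoch time
        "n_batches": g.size(),               # diagnostics
    })
    return out.reset_index()
```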

The UI prefers epoch_log.csv (small, fast) and falls back to batch_log.csv if needed. No manual preprocessing required.
