SpaceCast is a repository for graph-based neural space weather forecasting. The code uses PyTorch Lightning for modeling and Weights & Biases for logging. It builds on Neural-LAM and uses mllam-data-prep (MDP) for data preparation, which lowers the bar for adapting progress in limited area modeling (LAM) to space weather.
The repository contains LAM versions of:
- The graph-based model from Keisler (2022).
- GraphCast, by Lam et al. (2023).
- The hierarchical model from Oskarsson et al. (2024).
Use Python 3.10 / 3.11 and the following key packages:
torch==2.5.1
pytorch-lightning==2.4.0
torch_geometric==2.6.1
mllam-data-prep==0.6.1
The complete list of packages can be installed with pip install -r requirements.txt.
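For example, in a fresh virtual environment (standard Python tooling, nothing repository-specific):
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt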
To create a training-ready dataset with mllam-data-prep, run:
mllam_data_prep data/vlasiator_mdp.yaml
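mllam-data-prep writes the training-ready dataset as a zarr archive. As a quick sanity check you can open it with xarray (pulled in as a dependency of mllam-data-prep); the path below is a placeholder, so substitute wherever the zarr ended up on your system:
python -c "import xarray as xr; print(xr.open_zarr('data/vlasiator.zarr'))"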
Simple, multiscale, and hierarchical graphs are created and stored in .pt
format using the following commands:
python -m neural_lam.create_graph --config_path data/vlasiator_config.yaml --name simple --levels 1 --plot
python -m neural_lam.create_graph --config_path data/vlasiator_config.yaml --name multiscale --levels 3 --plot
python -m neural_lam.create_graph --config_path data/vlasiator_config.yaml --name hierarchical --hierarchical --levels 3 --plot
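The graphs can also be inspected programmatically with torch.load. The directory layout and file name below follow the Neural-LAM convention of one .pt file per edge set under graphs/<name>/ and are an assumption; adjust them to wherever create_graph writes on your setup:
python -c "import torch; ei = torch.load('graphs/multiscale/m2m_edge_index.pt'); print([e.shape for e in ei] if isinstance(ei, list) else ei.shape)"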
To plot the graphs and store them as .html files, run:
python -m neural_lam.plot_graph --datastore_config_path data/vlasiator_config.yaml --graph ...
where --graph is one of simple, multiscale or hierarchical, and --save sets the name of the output file.
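For example, to render the multiscale graph (the --save value is just an illustrative output name):
python -m neural_lam.plot_graph --datastore_config_path data/vlasiator_config.yaml --graph multiscale --save multiscale_graph.html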
If you'd like to log in and use W&B, run:
wandb login
If you prefer to just log things locally, run:
wandb off
See the W&B docs for more details.
The first stage of a probabilistic model can be trained like this (in later stages you turn on kl_beta and crps_weight; see the sketch after the command):
python -m neural_lam.train_model \
--config_path data/vlasiator_config.yaml \
--num_workers 2 \
--precision bf16-mixed \
--model graph_efm \
--graph multiscale \
--hidden_dim 64 \
--processor_layers 4 \
--ensemble_size 5 \
--batch_size 1 \
--lr 0.001 \
--kl_beta 0 \
--crps_weight 0 \
--ar_steps_train 1 \
--epochs 500 \
--val_interval 50 \
--ar_steps_eval 4 \
--val_steps_to_log 1 2 3
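A later stage can then resume from the first-stage checkpoint with the KL and CRPS loss terms switched on. The loss weights, learning rate, ar_steps_train and checkpoint path below are illustrative placeholders, not tuned recommendations:
python -m neural_lam.train_model \
    --config_path data/vlasiator_config.yaml \
    --num_workers 2 \
    --precision bf16-mixed \
    --model graph_efm \
    --graph multiscale \
    --hidden_dim 64 \
    --processor_layers 4 \
    --ensemble_size 5 \
    --batch_size 1 \
    --lr 0.0001 \
    --kl_beta 1 \
    --crps_weight 1 \
    --ar_steps_train 4 \
    --epochs 200 \
    --val_interval 50 \
    --ar_steps_eval 4 \
    --val_steps_to_log 1 2 3 \
    --load path/to/stage1.ckpt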
Distributed data parallel training is supported. Specify the number of nodes with the --num_nodes argument. For a full list of training options see python -m neural_lam.train_model --help.
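For example, a two-node run looks like this (PyTorch Lightning sets up DDP; how the script is launched on each node depends on your scheduler and is not covered here):
python -m neural_lam.train_model \
    --config_path data/vlasiator_config.yaml \
    --model graph_efm \
    --graph multiscale \
    --num_nodes 2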
Inference uses the same script as training, with the same choice of parameters plus a few evaluation-specific ones: --eval test, --ar_steps_eval 30 and --n_example_pred 1 evaluate 30-second forecasts on the test set, with one example forecast plotted.
python -m neural_lam.train_model \
--config_path data/vlasiator_config.yaml \
--model graph_efm \
--graph hierarchical \
--num_nodes 1 \
--num_workers 2 \
--batch_size 1 \
--hidden_dim 64 \
--processor_layers 2 \
--ensemble_size 5 \
--ar_steps_eval 30 \
--precision bf16-mixed \
--n_example_pred 1 \
--eval test \
--load ckpt_path
where --load is given the path to a model checkpoint in .ckpt format.