Code used for all models and experiments in the Master's thesis of Olav Finne Præsteng Larsen, NTNU/SINTEF 2025.

The project contains a `pyproject.toml` file that specifies the dependencies and the build system. The project uses Poetry for dependency management and packaging. To install the SynTraj package locally, clone the repository and install the dependencies using Poetry:

```bash
git clone [email protected]:SynthAIr/DM_FM_Trajectories.git
cd DM_FM_Trajectories
poetry install
cd src
```
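Since the training and evaluation commands below all take a `--cuda` GPU index, it can be worth confirming that the environment sees a GPU first. A minimal sketch, assuming the project uses PyTorch (suggested by the `--cuda` index flag, but not confirmed here):

```python
import torch

# Report CUDA availability and the visible devices; the index printed here
# is what the --cuda argument of the scripts below selects.
print("CUDA available:", torch.cuda.is_available())
for i in range(torch.cuda.device_count()):
    print(f"  cuda:{i} -> {torch.cuda.get_device_name(i)}")
```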
The project is organized into the following main directories:

- `configs`: Contains configuration files for the different models and datasets.
- `data`: Contains the raw and processed data files used for training and evaluation.
- `evaluation`: Metrics and scripts for evaluating the models.
- `models`: Contains the implementation of the models used for trajectory generation.
- `utils`: Utility functions and classes used throughout the project.
### Train

The `train.py` script is used to train a flight trajectory generation model. Trained models are saved in a folder named after the model; the folder contains the best model and the model configuration file.
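To see what a finished run saved, you can list the artifact directory. A minimal sketch (the exact file names inside each model folder depend on the run):

```python
from pathlib import Path

# Walk the artifact directory: one folder per model name, each holding the
# best checkpoint and the configuration file used for that run.
for path in sorted(Path("./artifacts").rglob("*")):
    print(path)
```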
| Argument | Type | Default | Description |
|---|---|---|---|
| `--config_file` | str | `./configs/config.yaml` | Path to the main model configuration file. |
| `--dataset_config` | str | `./configs/dataset_landing_transfer.yaml` | Path to the dataset-specific configuration file. |
| `--data_path` | str | `./data/resampled/combined_traffic_resampled_200.pkl` | Path to the training dataset file. |
| `--artifact_path` / `--artifact_location` | str | `./artifacts` | Directory where training artifacts will be saved. |
| `--cuda` | int | `0` | GPU index to use for training. |
Train a model on GPU 0:

```bash
python train.py \
    --config_file ./configs/config.yaml \
    --dataset_config ./configs/dataset_landing_transfer.yaml \
    --data_path ./data/resampled/combined_traffic_resampled_200.pkl \
    --artifact_path ./artifacts \
    --cuda 0
```
### Evaluate

The `evaluate.py` script is used to evaluate the flight trajectory generation model.
| Argument | Type | Default | Description |
|---|---|---|---|
| `--model_name` | str | `AirDiffTraj` | Name of the model to run. |
| `--data_path` | str | `./data/resampled/combined_traffic_resampled_200.pkl` | Path to the input data file. |
| `--dataset_config` | str | `./configs/dataset_landing_transfer.yaml` | Path to the dataset configuration file. |
| `--artifact_path` / `--artifact_location` | str | `./artifacts` | Directory to store artifacts such as outputs or logs. |
| `--cuda` | int | `0` | GPU index to use. |
| `--perturb` | flag | `False` | Run evaluation for the perturbation model. |
Run the model in evaluation mode on GPU 0:

```bash
python evaluate.py \
    --model_name AirDiffTraj \
    --data_path ./data/resampled/combined_traffic_resampled_200.pkl \
    --dataset_config ./configs/dataset_landing_transfer.yaml \
    --artifact_path ./artifacts \
    --cuda 0
```
### Train and Evaluate Pipeline

The `train_evaluate_pipeline.py` script combines training and evaluation for the flight trajectory generation model.
| Argument | Type | Default | Description |
|---|---|---|---|
| `--config_file` | str | `./configs/config.yaml` | Path to the main training configuration file. |
| `--artifact_path` / `--artifact_location` | str | `./artifacts` | Directory to save artifacts (models, logs, etc.). |
| `--data_path` | str | `./data/resampled/combined_traffic_resampled_200.pkl` | Path to the input data file. |
| `--dataset_config` | str | `./configs/dataset_landing_transfer.yaml` | Path to the dataset configuration file. |
| `--model_name` | str | `None` | Optional name for the model (e.g., `AirDiffTraj`). |
| `--perturb` | flag | `False` | Apply perturbations to input data for robustness testing. |
| `--cuda` | int | `0` | GPU index to use. |
Train and evaluate a model on GPU 0:

```bash
python train_evaluate_pipeline.py \
    --config_file ./configs/config.yaml \
    --data_path ./data/resampled/combined_traffic_resampled_200.pkl \
    --artifact_path ./artifacts \
    --model_name AirDiffTraj \
    --dataset_config ./configs/dataset_landing_transfer.yaml \
    --cuda 0
```
### Transfer Learning Pipeline

The `pipeline_transfer_learning.py` script is designed for transfer learning experiments with the flight trajectory generation model. It can run training, a single evaluation, or batch evaluations across dataset splits, and supports GPU selection. It trains and evaluates the pretrained model on the different data splits.
| Argument | Type | Default | Description |
|---|---|---|---|
| `--model_name` | str | `AirDiffTraj` | Name of the model to use. |
| `--data_path` | str | `./data/resampled/combined_traffic_resampled_landing_EIDW_200.pkl` | Path to the dataset used for training/evaluation. |
| `--dataset_config` | str | `./configs/dataset_landing_transfer.yaml` | Configuration file for the dataset. |
| `--artifact_location` | str | `./data/experiments/transfer_learning_EIDW/artifacts` | Directory where experiment artifacts will be saved. |
| `--experiment` | str | (required) | Name or identifier of the experiment to run. |
| `--cuda` | int | `0` | GPU index to use. |
| `--eval` | flag | `False` | Run a single evaluation using the specified model and data. |
| `--eval_all` | flag | `False` | Run evaluations over all predefined data splits. |
Note: The data is automatically split using the following proportions: `[0.0, 0.05, 0.2, 0.5, 1.0]`.
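As a sketch of how such proportions translate into split sizes (assuming each value is the fraction of the target dataset made available for fine-tuning, which is not spelled out here):

```python
# Hypothetical illustration: map split fractions to sample counts.
splits = [0.0, 0.05, 0.2, 0.5, 1.0]
n_samples = 1000  # hypothetical dataset size
for frac in splits:
    print(f"{frac:>5.0%} of the data -> {int(frac * n_samples)} samples")
```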
Run a transfer learning experiment using GPU 0:

```bash
python pipeline_transfer_learning.py \
    --model_name AirDiffTraj_5 \
    --data_path ./data/resampled/combined_traffic_resampled_landing_EIDW_200.pkl \
    --dataset_config ./configs/dataset_landing_transfer.yaml \
    --artifact_location ./data/experiments/transfer_learning_EIDW/artifacts \
    --experiment my_experiment \
    --cuda 0
```
### Train Optuna

Script to optimize the hyperparameters of the model using the `optuna` library. It is designed to be run from the command line.

### Example

Run Optuna for hyperparameter optimization on GPU 0:

```bash
python train_optuna.py
```
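The search space itself is defined in `train_optuna.py`; as a general illustration of the `optuna` pattern the script builds on (hypothetical objective and parameter names, not the thesis code):

```python
import optuna

def objective(trial: optuna.Trial) -> float:
    # Hypothetical search space; the real one lives in train_optuna.py.
    lr = trial.suggest_float("learning_rate", 1e-5, 1e-2, log=True)
    hidden_dim = trial.suggest_categorical("hidden_dim", [128, 256, 512])
    # Placeholder objective: the real script would train the model with these
    # hyperparameters and return a validation metric instead.
    return (lr - 1e-3) ** 2 + 1.0 / hidden_dim

study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=50)
print("Best parameters:", study.best_params)
```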
### Preprocess

The `preprocess.py` script is used to prepare raw air traffic data for model training. It supports different data sources (e.g., OpenSky, Eurocontrol) and includes a mode for preprocessing landing trajectories specifically.
| Argument | Type | Default | Description |
|---|---|---|---|
| `--ADEP` | str | `EHAM` | Departure airport ICAO code. |
| `--ADES` | str | `LSZH` | Arrival airport ICAO code. |
| `--data_dir` | str | `./data/Opensky` | Directory containing the raw input data. |
| `--data_source` | str | `OpenSky` | Data source identifier: either `"OpenSky"` or `"Eurocontrol"`. |
| `--landing` | flag | `False` | Preprocess the data for landing trajectories only (overrides the data source to `"landing"`). |
Preprocess landing trajectories for arrivals at LSZH from OpenSky data:

```bash
python preprocess.py \
    --ADES LSZH \
    --data_dir ./data/landing \
    --data_source OpenSky \
    --landing
```
### Resample and Combine

The `resample_combine.py` script is used to process, interpolate, and combine all air traffic trajectory data into a single `.pkl` file for further use in training or evaluation. It resamples trajectories to a uniform length and supports configurations for specific airports or datasets.
| Argument | Type | Default | Description |
|---|---|---|---|
| `--length` | int | `200` | Target length for resampling all trajectories. |
| `--data_dir` | str | `./data` | Directory containing input raw data files. |
⚠️ Note: The `--data_source` argument is overwritten by the script and is not user-configurable from the command line. Output is saved to `./data/combined_traffic_resampled_landing_200.pkl`.
Run the resampling and combining process with a trajectory length of 200:

```bash
python resample_combine.py \
    --length 200 \
    --data_dir ./data
```
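To sanity-check the combined file afterwards, a minimal sketch (assumes the `.pkl` is a standard pickle; the stored type, e.g. a DataFrame or a `traffic` object, depends on the pipeline):

```python
import pickle

# Load the combined, resampled output written by resample_combine.py.
with open("./data/combined_traffic_resampled_landing_200.pkl", "rb") as f:
    data = pickle.load(f)

# Print the stored type and, if available, its length (number of trajectories
# or rows, depending on what the pipeline pickled).
print(type(data))
if hasattr(data, "__len__"):
    print("items:", len(data))
```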