fix hyperparameter for benchmark training
PatReis committed Feb 5, 2024
1 parent 1e50998 commit ca8063a
Showing 7 changed files with 239 additions and 18 deletions.
6 changes: 3 additions & 3 deletions training/hyper/hyper_qm9_energies.py
@@ -370,9 +370,9 @@
{"shape": (None, 3), "name": "node_coordinates", "dtype": "float32", "ragged": True},
{"shape": (None, 1), "name": "edge_weights", "dtype": "float32", "ragged": True},
{"shape": (None, 2), "name": "edge_indices", "dtype": "int64", "ragged": True},
{"shape": (None, 2), "name": "range_indices", "dtype": "int64", "ragged": True},
{"shape": [None, 2], "name": "angle_indices_1", "dtype": "int64", "ragged": True},
{"shape": [None, 2], "name": "angle_indices_2", "dtype": "int64", "ragged": True},
{"shape": (None, 2), "name": "range_indices", "dtype": "int64", "ragged": True}
{"shape": [None, 2], "name": "angle_indices_2", "dtype": "int64", "ragged": True}
],
"input_tensor_type": "ragged",
"input_embedding": None,
@@ -468,7 +468,7 @@
"input_edge_embedding": {"input_dim": 95, "output_dim": 128},
"depth": 7,
"node_mlp_initialize": {"units": 128, "activation": "linear"},
"euclidean_norm_kwargs": {"keepdims": True, "axis": 2, "square_norm": True},
"euclidean_norm_kwargs": {"keepdims": True, "axis": 1, "square_norm": True},
"use_edge_attributes": False,
"edge_mlp_kwargs": {"units": [128, 128], "activation": ["swish", "swish"]},
"edge_attention_kwargs": {"units": 1, "activation": "sigmoid"},
6 changes: 3 additions & 3 deletions training/hyper/hyper_qm9_orbitals.py
@@ -364,9 +364,9 @@
{"shape": (None, 3), "name": "node_coordinates", "dtype": "float32", "ragged": True},
{"shape": (None, 1), "name": "edge_weights", "dtype": "float32", "ragged": True},
{"shape": (None, 2), "name": "edge_indices", "dtype": "int64", "ragged": True},
{"shape": (None, 2), "name": "range_indices", "dtype": "int64", "ragged": True},
{"shape": [None, 2], "name": "angle_indices_1", "dtype": "int64", "ragged": True},
{"shape": [None, 2], "name": "angle_indices_2", "dtype": "int64", "ragged": True},
{"shape": (None, 2), "name": "range_indices", "dtype": "int64", "ragged": True}
{"shape": [None, 2], "name": "angle_indices_2", "dtype": "int64", "ragged": True}
],
"input_tensor_type": "ragged",
"input_embedding": None,
@@ -460,7 +460,7 @@
"input_edge_embedding": {"input_dim": 95, "output_dim": 128},
"depth": 7,
"node_mlp_initialize": {"units": 128, "activation": "linear"},
"euclidean_norm_kwargs": {"keepdims": True, "axis": 2, "square_norm": True},
"euclidean_norm_kwargs": {"keepdims": True, "axis": 1, "square_norm": True},
"use_edge_attributes": False,
"edge_mlp_kwargs": {"units": [128, 128], "activation": ["swish", "swish"]},
"edge_attention_kwargs": {"units": 1, "activation": "sigmoid"},
@@ -0,0 +1,58 @@
OS: posix_linux
backend: tensorflow
cuda_available: 'True'
data_unit: ''
date_time: '2024-02-03 06:14:04'
device_id: '[LogicalDevice(name=''/device:CPU:0'', device_type=''CPU''), LogicalDevice(name=''/device:GPU:0'',
device_type=''GPU'')]'
device_memory: '[]'
device_name: '[{}, {''compute_capability'': (7, 0), ''device_name'': ''Tesla V100-SXM2-32GB''}]'
energy_scaled_mean_absolute_error:
- 0.006360400002449751
epochs:
- 1000
execute_folds: null
force_scaled_mean_absolute_error:
- 0.02935820259153843
kgcnn_version: 4.0.0
loss:
- 0.01170190330594778
max_energy_scaled_mean_absolute_error:
- 10.103582382202148
max_force_scaled_mean_absolute_error:
- 12.756863594055176
max_loss:
- 5.149803638458252
max_val_energy_scaled_mean_absolute_error:
- 23.537092208862305
max_val_force_scaled_mean_absolute_error:
- 6.165292739868164
max_val_loss:
- 2.6731419563293457
min_energy_scaled_mean_absolute_error:
- 0.006335231009870768
min_force_scaled_mean_absolute_error:
- 0.02935820259153843
min_loss:
- 0.01170190330594778
min_val_energy_scaled_mean_absolute_error:
- 0.006716604344546795
min_val_force_scaled_mean_absolute_error:
- 0.04271000623703003
min_val_loss:
- 0.01723838970065117
model_class: EnergyForceModel
model_name: PAiNN
model_version: ''
multi_target_indices: null
number_histories: 1
seed: 42
time_list:
- '0:17:39.003485'
trajectory_name: benzene_ccsd_t
val_energy_scaled_mean_absolute_error:
- 0.006726829335093498
val_force_scaled_mean_absolute_error:
- 0.04271000623703003
val_loss:
- 0.01723838970065117
@@ -0,0 +1 @@
{"model": {"class_name": "EnergyForceModel", "module_name": "kgcnn.models.force", "config": {"name": "PAiNN", "nested_model_config": true, "output_to_tensor": false, "output_squeeze_states": true, "coordinate_input": 1, "inputs": [{"shape": [null], "name": "atomic_number", "dtype": "int32"}, {"shape": [null, 3], "name": "node_coordinates", "dtype": "float32"}, {"shape": [null, 2], "name": "range_indices", "dtype": "int64"}, {"shape": [], "name": "total_nodes", "dtype": "int64"}, {"shape": [], "name": "total_ranges", "dtype": "int64"}], "model_energy": {"class_name": "make_model", "module_name": "kgcnn.literature.PAiNN", "config": {"name": "PAiNNEnergy", "inputs": [{"shape": [null], "name": "atomic_number", "dtype": "int32"}, {"shape": [null, 3], "name": "node_coordinates", "dtype": "float32"}, {"shape": [null, 2], "name": "range_indices", "dtype": "int64"}, {"shape": [], "name": "total_nodes", "dtype": "int64"}, {"shape": [], "name": "total_ranges", "dtype": "int64"}], "input_embedding": null, "input_node_embedding": {"input_dim": 95, "output_dim": 128}, "equiv_initialize_kwargs": {"dim": 3, "method": "eps"}, "bessel_basis": {"num_radial": 20, "cutoff": 5.0, "envelope_exponent": 5}, "pooling_args": {"pooling_method": "scatter_sum"}, "conv_args": {"units": 128, "cutoff": null}, "update_args": {"units": 128}, "depth": 3, "verbose": 10, "output_embedding": "graph", "output_mlp": {"use_bias": [true, true], "units": [128, 1], "activation": ["swish", "linear"]}}}, "outputs": {"energy": {"name": "energy", "shape": [1]}, "force": {"name": "force", "shape": [null, 3]}}}}, "training": {"fit": {"batch_size": 32, "epochs": 1000, "validation_freq": 1, "verbose": 2, "callbacks": []}, "compile": {"optimizer": {"class_name": "Adam", "config": {"learning_rate": {"class_name": "kgcnn>LinearWarmupExponentialDecay", "config": {"learning_rate": 0.001, "warmup_steps": 150.0, "decay_steps": 20000.0, "decay_rate": 0.01}}, "amsgrad": true, "use_ema": true}}, "loss_weights": {"energy": 
0.02, "force": 0.98}}, "scaler": {"class_name": "EnergyForceExtensiveLabelScaler", "config": {"standardize_scale": true}}}, "data": {}, "dataset": {"class_name": "MD17Dataset", "module_name": "kgcnn.data.datasets.MD17Dataset", "config": {"trajectory_name": "benzene_ccsd_t"}, "methods": [{"rename_property_on_graphs": {"old_property_name": "E", "new_property_name": "energy"}}, {"rename_property_on_graphs": {"old_property_name": "F", "new_property_name": "force"}}, {"rename_property_on_graphs": {"old_property_name": "z", "new_property_name": "atomic_number"}}, {"rename_property_on_graphs": {"old_property_name": "R", "new_property_name": "node_coordinates"}}, {"map_list": {"method": "set_range", "max_distance": 5, "max_neighbours": 10000, "node_coordinates": "node_coordinates"}}, {"map_list": {"method": "count_nodes_and_edges", "total_edges": "total_ranges", "count_edges": "range_indices", "count_nodes": "atomic_number", "total_nodes": "total_nodes"}}]}, "info": {"postfix": "", "postfix_file": "_benzene_ccsd_t", "kgcnn_version": "4.0.0"}}
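The optimizer config above names a `kgcnn>LinearWarmupExponentialDecay` schedule with `learning_rate`, `warmup_steps`, `decay_steps` and `decay_rate`. As a rough illustration of what such a schedule typically computes, here is a minimal sketch. The formula (a linear warmup factor multiplied by an exponential decay) is an assumption about the general technique, not taken from the kgcnn source, and the function name is hypothetical.

```python
def linear_warmup_exponential_decay(step, learning_rate=0.001, warmup_steps=150.0,
                                    decay_steps=20000.0, decay_rate=0.01):
    """Generic sketch: linear warmup factor times exponential decay.

    Assumption: this is one common form of such a schedule; the exact
    formula used by kgcnn may differ in detail.
    """
    # Warmup factor rises linearly from 0 to 1 over the first warmup_steps steps.
    warmup = min(1.0, step / warmup_steps)
    # Exponential decay: the rate is multiplied by decay_rate every decay_steps steps.
    decay = learning_rate * decay_rate ** (step / decay_steps)
    return warmup * decay


# Default arguments mirror the values in the config above.
for step in (0, 150, 20000):
    print(step, linear_warmup_exponential_decay(step))
```

Under this assumed form, the rate starts at zero, reaches roughly the base rate after the warmup, and has dropped by the factor `decay_rate` after `decay_steps` steps.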
@@ -0,0 +1,157 @@
OS: posix_linux
backend: tensorflow
cuda_available: 'True'
data_unit: eV
date_time: '2024-02-02 03:29:06'
device_id: '[LogicalDevice(name=''/device:CPU:0'', device_type=''CPU''), LogicalDevice(name=''/device:GPU:0'',
device_type=''GPU'')]'
device_memory: '[]'
device_name: '[{}, {''compute_capability'': (8, 0), ''device_name'': ''NVIDIA A100
80GB PCIe''}]'
epochs:
- 1000
- 1000
- 1000
- 1000
- 1000
execute_folds:
- 4
kgcnn_version: 4.0.0
learning_rate:
- 1.1979999726463575e-05
- 1.1979999726463575e-05
- 1.1979999726463575e-05
- 1.1979999726463575e-05
- 1.1979999726463575e-05
loss:
- 0.012162698432803154
- 0.011314144358038902
- 0.011874545365571976
- 0.012134171091020107
- 0.013336176052689552
max_learning_rate:
- 0.0010000000474974513
- 0.0010000000474974513
- 0.0010000000474974513
- 0.0010000000474974513
- 0.0010000000474974513
max_loss:
- 0.43559181690216064
- 0.43510064482688904
- 0.4345819354057312
- 0.4330730736255646
- 0.4359922707080841
max_scaled_mean_absolute_error:
- 0.6970012784004211
- 0.6949480772018433
- 0.694709062576294
- 0.6930708885192871
- 0.6976876854896545
max_scaled_root_mean_squared_error:
- 1.0893703699111938
- 1.0879148244857788
- 1.085277795791626
- 1.0855214595794678
- 1.0865780115127563
max_val_loss:
- 0.2306491881608963
- 0.23060937225818634
- 0.23704500496387482
- 0.2425326555967331
- 0.22873705625534058
max_val_scaled_mean_absolute_error:
- 0.36888542771339417
- 0.3682745099067688
- 0.37900540232658386
- 0.38841739296913147
- 0.36602550745010376
max_val_scaled_root_mean_squared_error:
- 0.716799259185791
- 0.7069448828697205
- 0.7239429950714111
- 0.758888840675354
- 0.7186479568481445
min_learning_rate:
- 1.1979999726463575e-05
- 1.1979999726463575e-05
- 1.1979999726463575e-05
- 1.1979999726463575e-05
- 1.1979999726463575e-05
min_loss:
- 0.009732229635119438
- 0.009808916598558426
- 0.00922525953501463
- 0.009409546852111816
- 0.009921926073729992
min_scaled_mean_absolute_error:
- 0.015554945915937424
- 0.015672003850340843
- 0.014760036021471024
- 0.015076831914484501
- 0.015888215973973274
min_scaled_root_mean_squared_error:
- 0.07876173406839371
- 0.08663474768400192
- 0.07469789683818817
- 0.08317063748836517
- 0.08534816652536392
min_val_loss:
- 0.12934552133083344
- 0.12370166927576065
- 0.131747305393219
- 0.12867090106010437
- 0.12415623664855957
min_val_scaled_mean_absolute_error:
- 0.20687255263328552
- 0.19757099449634552
- 0.21063703298568726
- 0.206019327044487
- 0.19860394299030304
min_val_scaled_root_mean_squared_error:
- 0.48725444078445435
- 0.4497235119342804
- 0.5058600902557373
- 0.4878058135509491
- 0.46084362268447876
model_class: make_crystal_model
model_name: CGCNN
model_version: '2023-11-28'
multi_target_indices: null
number_histories: 5
scaled_mean_absolute_error:
- 0.019451702013611794
- 0.018081212416291237
- 0.01900147646665573
- 0.01944182626903057
- 0.02135508507490158
scaled_root_mean_squared_error:
- 0.08296354115009308
- 0.0887678787112236
- 0.08019550889730453
- 0.08724640309810638
- 0.09060211479663849
seed: 42
time_list:
- '14:31:39.401335'
- '14:29:29.974382'
- '14:28:03.272073'
- '14:35:27.258171'
- '14:39:25.346675'
val_loss:
- 0.12934552133083344
- 0.12370166927576065
- 0.131747305393219
- 0.12867090106010437
- 0.12415623664855957
val_scaled_mean_absolute_error:
- 0.20687255263328552
- 0.19757099449634552
- 0.21063703298568726
- 0.206019327044487
- 0.19860394299030304
val_scaled_root_mean_squared_error:
- 0.5045267343521118
- 0.4644222855567932
- 0.5119762420654297
- 0.49901968240737915
- 0.46084362268447876
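The `learning_rate` entries logged above (maximum 0.001, final value about 1.198e-05) are consistent with a schedule that holds the start rate and then decreases linearly, using the `kgcnn>LinearLearningRateScheduler` settings from the accompanying config (`learning_rate_start` 0.001, `learning_rate_stop` 1e-05, `epo_min` 500, `epo` 1000). The sketch below is an assumption about that behavior inferred from these numbers, not a copy of the kgcnn implementation; the function name is hypothetical.

```python
def linear_learning_rate(epoch, learning_rate_start=0.001, learning_rate_stop=1e-05,
                         epo_min=500, epo=1000):
    """Hold the start rate until epo_min, then decrease linearly toward the stop rate.

    Assumption: epochs are zero-based, so the last of 1000 epochs is epoch 999.
    """
    if epoch < epo_min:
        return learning_rate_start
    # Constant decrease per epoch between epo_min and epo.
    slope = (learning_rate_start - learning_rate_stop) / (epo - epo_min)
    return learning_rate_start - slope * (epoch - epo_min)


print(linear_learning_rate(0))    # start rate, unchanged before epo_min
print(linear_learning_rate(999))  # close to the logged final value of about 1.198e-05
```

With zero-based epochs, the last epoch lands one step short of `learning_rate_stop`, which would explain why the log shows about 1.198e-05 rather than exactly 1e-05.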
@@ -0,0 +1 @@
{"model": {"class_name": "make_crystal_model", "module_name": "kgcnn.literature.CGCNN", "config": {"name": "CGCNN", "inputs": [{"shape": [null], "name": "node_number", "dtype": "int64", "ragged": true}, {"shape": [null, 3], "name": "node_frac_coordinates", "dtype": "float64", "ragged": true}, {"shape": [null, 2], "name": "range_indices", "dtype": "int64", "ragged": true}, {"shape": [null, 3], "name": "range_image", "dtype": "float32", "ragged": true}, {"shape": [3, 3], "name": "graph_lattice", "dtype": "float64", "ragged": false}], "input_tensor_type": "ragged", "input_node_embedding": {"input_dim": 95, "output_dim": 64}, "representation": "unit", "expand_distance": true, "make_distances": true, "gauss_args": {"bins": 60, "distance": 6, "offset": 0.0, "sigma": 0.4}, "conv_layer_args": {"units": 128, "activation_s": "kgcnn>shifted_softplus", "activation_out": "kgcnn>shifted_softplus", "batch_normalization": true}, "node_pooling_args": {"pooling_method": "mean"}, "depth": 4, "output_mlp": {"use_bias": [true, true, false], "units": [128, 64, 1], "activation": ["kgcnn>shifted_softplus", "kgcnn>shifted_softplus", "linear"]}}}, "training": {"cross_validation": {"class_name": "KFold", "config": {"n_splits": 5, "random_state": 42, "shuffle": true}}, "fit": {"batch_size": 128, "epochs": 1000, "validation_freq": 10, "verbose": 2, "callbacks": [{"class_name": "kgcnn>LinearLearningRateScheduler", "config": {"learning_rate_start": 0.001, "learning_rate_stop": 1e-05, "epo_min": 500, "epo": 1000, "verbose": 0}}]}, "compile": {"optimizer": {"class_name": "Adam", "config": {"learning_rate": 0.001}}, "loss": "mean_absolute_error"}, "scaler": {"class_name": "StandardLabelScaler", "module_name": "kgcnn.data.transform.scaler.standard", "config": {"with_std": true, "with_mean": true, "copy": true}}, "multi_target_indices": null}, "data": {"data_unit": "eV"}, "info": {"postfix": "", "postfix_file": "", "kgcnn_version": "4.0.0"}, "dataset": {"class_name": "MatProjectGapDataset", 
"module_name": "kgcnn.data.datasets.MatProjectGapDataset", "config": {}, "methods": [{"map_list": {"method": "set_range_periodic", "max_distance": 6.0}}]}}
28 changes: 16 additions & 12 deletions training/results/README.md
@@ -67,6 +67,7 @@ ESOL consists of 1128 compounds as smiles and their corresponding water solubili
| GIN | 4.0.0 | 300 | 0.5369 ± 0.0334 | 0.7954 ± 0.0861 |
| GNNFilm | 4.0.0 | 800 | 0.4854 ± 0.0368 | 0.6724 ± 0.0436 |
| GraphSAGE | 4.0.0 | 500 | 0.4874 ± 0.0228 | 0.6982 ± 0.0608 |
+ | HamNet | 4.0.0 | 400 | 0.5479 ± 0.0143 | 0.7417 ± 0.0298 |
| HDNNP2nd | 4.0.0 | 500 | 0.7857 ± 0.0986 | 1.0467 ± 0.1367 |
| INorp | 4.0.0 | 500 | 0.5055 ± 0.0436 | 0.7297 ± 0.0786 |
| MAT | 4.0.0 | 400 | 0.5064 ± 0.0299 | 0.7194 ± 0.0630 |
@@ -135,10 +136,10 @@ Lipophilicity (MoleculeNet) consists of 4200 compounds as smiles. Graph labels f

Energies and forces for molecular dynamics trajectories of eight organic molecules. All geometries in A, energy labels in kcal/mol and force labels in kcal/mol/A. We use preset train-test split. Training on 1000 geometries, test on 500/1000 geometries. Errors are MAE for forces. Results are for the CCSD and CCSD(T) data in MD17.

- | model | kgcnn | epochs | Aspirin | Toluene | Malonaldehyde | Benzene | Ethanol |
- |:------------------------|:--------|---------:|:-----------------|:--------------------|:-----------------|:-----------------|:--------------------|
- | PAiNN.EnergyForceModel | 4.0.0 | 1000 | **nan ± nan** | **0.2815 ± nan** | **nan ± nan** | **nan ± nan** | 0.5805 ± nan |
- | Schnet.EnergyForceModel | 4.0.0 | 1000 | 1.2173 ± nan | 0.7395 ± nan | 0.8444 ± nan | 0.3353 ± nan | **0.4832 ± nan** |
+ | model | kgcnn | epochs | Aspirin | Toluene | Malonaldehyde | Benzene | Ethanol |
+ |:------------------------|:--------|---------:|:--------------------|:--------------------|:--------------------|:--------------------|:--------------------|
+ | PAiNN.EnergyForceModel | 4.0.0 | 1000 | **0.8551 ± nan** | **0.2815 ± nan** | **0.7749 ± nan** | **0.0427 ± nan** | 0.5805 ± nan |
+ | Schnet.EnergyForceModel | 4.0.0 | 1000 | 1.2173 ± nan | 0.7395 ± nan | 0.8444 ± nan | 0.3353 ± nan | **0.4832 ± nan** |

#### MD17RevisedDataset

@@ -173,9 +174,10 @@ Materials Project dataset from Matbench with 132752 crystal structures and their

Materials Project dataset from Matbench with 106113 crystal structures and their band gap as calculated by PBE DFT from the Materials Project, in eV. We use a random 5-fold cross-validation.

- | model | kgcnn | epochs | MAE [eV] | RMSE [eV] |
- |:--------------------------|:--------|---------:|:-----------------------|:--------------------------|
- | Schnet.make_crystal_model | 4.0.0 | 800 | **1.2226 ± 1.0573** | **58.3713 ± 114.2957** |
+ | model | kgcnn | epochs | MAE [eV] | RMSE [eV] |
+ |:--------------------------|:--------|---------:|:-----------------------|:-----------------------|
+ | CGCNN.make_crystal_model | 4.0.0 | 1000 | **0.2039 ± 0.0050** | **0.4882 ± 0.0213** |
+ | Schnet.make_crystal_model | 4.0.0 | 800 | 1.2226 ± 1.0573 | 58.3713 ± 114.2957 |
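The new CGCNN row appears to follow directly from the per-fold `val_scaled_mean_absolute_error` and `val_scaled_root_mean_squared_error` lists in the CGCNN results file added by this commit. The sketch below reproduces the `mean ± std` strings from those values. It assumes a population standard deviation (`ddof=0`), which matches the numbers in the table; the helper name `summarize` and the choice to report `nan` for a single run (as in the MD17 rows) are illustrative assumptions, not the kgcnn summary code.

```python
import math
from statistics import fmean, pstdev


def summarize(values, digits=4):
    """Format per-fold results as 'mean ± std' using the population std (ddof=0)."""
    mean = fmean(values)
    # Assumption: a single run has no spread, so report nan instead of 0.
    std = pstdev(values) if len(values) > 1 else math.nan
    return f"{mean:.{digits}f} ± {std:.{digits}f}"


# Per-fold validation values copied from the CGCNN results file in this commit.
cgcnn_val_mae = [0.20687255263328552, 0.19757099449634552, 0.21063703298568726,
                 0.206019327044487, 0.19860394299030304]
cgcnn_val_rmse = [0.5045267343521118, 0.4644222855567932, 0.5119762420654297,
                  0.49901968240737915, 0.46084362268447876]
# Single-run force MAE from the PAiNN benzene results file in this commit.
painn_benzene_force_mae = [0.04271000623703003]

print(summarize(cgcnn_val_mae))            # 0.2039 ± 0.0050
print(summarize(cgcnn_val_rmse))           # 0.4882 ± 0.0213
print(summarize(painn_benzene_force_mae))  # 0.0427 ± nan
```

A sample standard deviation (`ddof=1`) would give a larger spread for the five folds, so the table values point to the population form.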

#### MatProjectIsMetalDataset

@@ -184,7 +186,8 @@ Materials Project dataset from Matbench with 106113 crystal structures and their
| model | kgcnn | epochs | Accuracy | AUC |
|:--------------------------|:--------|---------:|:-----------------------|:-----------------------|
| CGCNN.make_crystal_model | 4.0.0 | 100 | 0.8910 ± 0.0027 | 0.9406 ± 0.0024 |
- | Schnet.make_crystal_model | 4.0.0 | 80 | **0.8953 ± 0.0058** | **0.9506 ± 0.0053** |
+ | Megnet.make_crystal_model | 4.0.0 | 100 | **0.8966 ± 0.0033** | **0.9506 ± 0.0026** |
+ | Schnet.make_crystal_model | 4.0.0 | 80 | 0.8953 ± 0.0058 | 0.9506 ± 0.0053 |

#### MatProjectJdft2dDataset

@@ -308,10 +311,11 @@ QM7 dataset is a subset of GDB-13. Molecules of up to 23 atoms (including 7 heav

QM9 dataset of 134k stable small organic molecules made up of C,H,O,N,F. Labels include geometric, energetic, electronic, and thermodynamic properties. We use a random 5-fold cross-validation, but not all splits are evaluated for cheaper evaluation. Test errors are MAE and for energies are given in [eV].

- | model | kgcnn | epochs | HOMO [eV] | LUMO [eV] | U0 [eV] | H [eV] | G [eV] |
- |:--------|:--------|---------:|:-----------------------|:-----------------------|:-----------------------|:-----------------------|:-----------------------|
- | PAiNN | 4.0.0 | 872 | 0.0483 ± 0.0275 | **0.0268 ± 0.0002** | **0.0099 ± 0.0003** | **0.0101 ± 0.0003** | **0.0110 ± 0.0002** |
- | Schnet | 4.0.0 | 800 | **0.0402 ± 0.0007** | 0.0340 ± 0.0001 | 0.0142 ± 0.0002 | 0.0146 ± 0.0002 | 0.0143 ± 0.0002 |
+ | model | kgcnn | epochs | HOMO [eV] | LUMO [eV] | U0 [eV] | H [eV] | G [eV] |
+ |:--------|:--------|---------:|:-------------------|:-----------------------|:-------------------|:-------------------|:-----------------------|
+ | Megnet | 4.0.0 | 800 | **nan ± nan** | 0.0407 ± 0.0009 | **nan ± nan** | **nan ± nan** | 0.0169 ± 0.0006 |
+ | PAiNN | 4.0.0 | 872 | 0.0483 ± 0.0275 | **0.0268 ± 0.0002** | 0.0099 ± 0.0003 | 0.0101 ± 0.0003 | **0.0110 ± 0.0002** |
+ | Schnet | 4.0.0 | 800 | 0.0402 ± 0.0007 | 0.0340 ± 0.0001 | 0.0142 ± 0.0002 | 0.0146 ± 0.0002 | 0.0143 ± 0.0002 |

#### SIDERDataset
