QuantumLab-ZY/HamGNN

🚀 HamGNN v2.1 Now Available!

Getting Started with HamGNN: Online Documentation

Table of Contents

  1. Introduction to HamGNN
  2. Environment Configuration Requirements
  3. HamGNN Installation Steps
  4. Construction Method for graph_data.npz Files
  5. Model Training Process
  6. Model Prediction Operations
  7. Band Structure Calculation
  8. Introduction to and Usage of Uni-HamGNN Universal Model
  9. HamGNN Parameter Details

1. Introduction to HamGNN

HamGNN (Hamiltonian Graph Neural Network) is an E(3) equivariant graph neural network framework designed specifically for quantum materials simulation. Its core functionality is to train and predict ab initio tight-binding Hamiltonians, which are fundamental quantum mechanical operators describing the electronic structure of materials.

Key features of HamGNN include:

  • Support for a wide range of physical scenarios: applicable to molecules, solids, and systems with Spin-Orbit Coupling (SOC) effects, handling structures from zero to three dimensions.
  • Compatibility with multiple DFT codes: integrates with mainstream density functional theory software based on Numerical Atomic Orbitals (NAO), including OpenMX, SIESTA/HONPAS, and ABACUS.
  • Theoretical foundation: employs a graph neural network architecture with E(3) rotational and translational equivariance, ensuring that the predicted Hamiltonian matrices satisfy the fundamental physical symmetries.
  • High fidelity: approximates density functional theory (DFT) results with high precision while generalizing well across material structures.
  • Computational efficiency: significantly faster than conventional DFT for large-scale systems (such as those containing thousands of atoms).

HamGNN's application domains primarily focus on high-throughput material design and discovery, large-scale electronic structure calculations, and quantum material property predictions, providing an efficient research tool for materials science, condensed matter physics, and computational chemistry.

According to recent research advances, HamGNN has been extended to a universal model called Uni-HamGNN, which can predict spin-orbit coupling effects across the periodic table without the need for retraining for new material systems, significantly accelerating the discovery and design process of quantum materials.

2. Environment Configuration Requirements

Python Environment

The HamGNN framework recommends Python 3.9 and depends on the following key Python libraries:

  • numpy == 1.21.2
  • PyTorch == 1.11.0
  • PyTorch Geometric == 2.0.4
  • pytorch_lightning == 1.5.10
  • e3nn == 0.5.0
  • pymatgen == 2022.3.7
  • tensorboard == 2.8.0
  • tqdm
  • scipy == 1.7.3
  • yaml
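
To confirm that the active environment matches these pins, you can run a short version check (a minimal sketch; trim the module list to what you actually installed):

import numpy, scipy, torch, torch_geometric, pytorch_lightning, e3nn, pymatgen

# Print each dependency's version for comparison against the pins above
for module in (numpy, scipy, torch, torch_geometric, pytorch_lightning, e3nn, pymatgen):
    print(f"{module.__name__:>20} {module.__version__}")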

Third-party DFT Tool Support

OpenMX

  • OpenMX: HamGNN requires tight-binding Hamiltonians generated by OpenMX. Users should be familiar with basic OpenMX parameter settings and usage; OpenMX can be downloaded from its official website.
  • openmx_postprocess: a modified version of OpenMX used to compute the overlap matrices and the non-self-consistent Hamiltonian matrices. It stores this data in a binary file named overlap.scfout.
  • read_openmx: a binary executable used to export the matrices in the overlap.scfout file to HS.json.

SIESTA/HONPAS

  • honpas_1.2_H0: This is a modified version of HONPAS used to parse calculated overlap matrices and non-self-consistent Hamiltonian matrices H0, similar to the openmx_postprocess tool. The output is a binary file containing Hamiltonian data (overlap.HSX).
  • hsxdump: This is a binary executable that generates intermediate Hamiltonian files, essential for converting HONPAS output to HamGNN-readable format.

ABACUS

  • abacus_postprocess: A tool used to export Hamiltonian matrices H0.

Compiling openmx_postprocess and read_openmx

To install openmx_postprocess:

  1. First, install the GSL library.
  2. Modify the makefile in the openmx_postprocess directory:
    • Set GSL_lib to the path of the GSL library.
    • Set GSL_include to the include path of GSL.
    • Set MKLROOT to the Intel MKL path.
    • Set CMPLR_ROOT to the Intel compiler path.
  3. Execute make to generate the two executables: openmx_postprocess and read_openmx.

3. HamGNN Installation Steps

Step One: Install Conda Environment

To avoid library version conflicts, it is recommended to use one of the following two methods:

Method 1: Use Pre-built Environment (Recommended)

  1. Download the pre-built HamGNN Conda environment (ML.tar.gz) from Zenodo
  2. Extract it to the envs folder in your Conda installation directory:
    tar -xzvf ML.tar.gz -C $HOME/miniconda3/envs/
  3. Activate the environment:
    conda activate ML

Method 2: Create Environment Using Configuration File

  1. Create an environment using the YAML configuration file provided by HamGNN:
    conda env create -f ./HamGNN.yml
  2. Install additional PyTorch Geometric dependencies (versions must match your PyTorch version):
    pip install torch-scatter==2.1.2+pt25cu121 torch-sparse==0.6.18+pt25cu121 -f https://data.pyg.org/whl/torch-2.5.1+cu121.html
    Note: The version numbers in the link must match your actual PyTorch and CUDA versions
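
The tags to use in the wheel URL can be read from the installed PyTorch build (a minimal sketch):

import torch

print("PyTorch:", torch.__version__)  # e.g. 2.5.1 -> torch-2.5.1 in the wheel index URL
print("CUDA:", torch.version.cuda)    # e.g. 12.1 -> cu121; None means a CPU-only build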

Step Two: Install HamGNN from Source

  1. Clone the HamGNN repository:
    git clone https://github.com/QuantumLab-ZY/HamGNN.git
  2. Enter the HamGNN directory and execute the installation:
    cd HamGNN
    python setup.py install
  3. Verify that the installation was successful:
    python -c "import HamGNN_v_2_1; print('HamGNN installed successfully')"
  4. To upgrade HamGNN version, first uninstall the old version:
    pip uninstall HamGNN
    Ensure that related files in the site-packages directory (such as HamGNN-x.x.x-py3.9.egg/HamGNN) have been completely removed, then reinstall the new version.

4. Construction Method for graph_data.npz Files

graph_data.npz is the core input format for HamGNN, containing material structure information and Hamiltonian matrix data. Below is an introduction to its construction method.

General Process

Regardless of which DFT software is used, the basic process for constructing graph_data.npz is as follows:

  1. Structure File Conversion: Convert material structure files (such as POSCAR, CIF) to the input format of the corresponding DFT software
  2. Non-self-consistent Hamiltonian Calculation: Use post-processing tools to generate files containing non-self-consistent Hamiltonian H0 (matrices that do not depend on self-consistent charge density, such as kinetic energy matrices)
  3. DFT Calculation (optional): Run a complete DFT calculation to obtain the true Hamiltonian (needed for training set construction, can be skipped for pure prediction)
  4. Data Packaging: Call the adapted graph_data_gen script to package the structure and Hamiltonian matrix data into graph_data.npz

OpenMX Process

  1. Structure Conversion: Edit the poscar2openmx.yaml configuration file, set appropriate paths and DFT parameters (for non-SOC data, set scf.SpinPolarization: Off and scf.SpinOrbit.Coupling: off; for SOC data, set scf.SpinPolarization: nc and scf.SpinOrbit.Coupling: on):
    system_name: 'Si'
    poscar_path: "/path/to/poscar/*.vasp" # Path to POSCAR or CIF files
    filepath: '/path/to/save/dat/file' # Directory to save OpenMX files
    basic_command: |+  # OpenMX calculation parameters
      #
      #      File Name      
      #
      System.CurrrentDirectory         ./    # default=./
      System.Name                     Si
      DATA.PATH           /path/to/DFT_DATA   # default=../DFT_DATA19
      # ...other OpenMX parameters...
  2. Non-self-consistent Hamiltonian Calculation: Run openmx_postprocess on the generated .dat files:
    mpirun -np ncpus ./openmx_postprocess openmx.dat
    This will generate an overlap.scfout file containing the overlap matrix and the non-self-consistent Hamiltonian.
  3. DFT Calculation (optional, required for training sets):
    mpirun -np ncpus openmx openmx.dat > openmx.std
    This will generate a .scfout file containing the self-consistent Hamiltonian.
  4. Data Packaging: Configure graph_data_gen.yaml:
    nao_max: 26  # Maximum number of atomic orbitals
    graph_data_save_path: '/path/to/save/graph_data'
    read_openmx_path: '/path/to/HamGNN/utils_openmx/read_openmx'
    max_SCF_skip: 200
    scfout_paths: '/path/to/scfout/files'  # Directory of .scfout files
    dat_file_name: 'openmx.dat'
    std_file_name: 'openmx.std'  # Set to null if no DFT calculation
    scfout_file_name: 'openmx.scfout'  # Use 'overlap.scfout' for prediction
    soc_switch: False  # Set to True for SOC data
    Then run:
    graph_data_gen --config graph_data_gen.yaml
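
After packaging, the archive can be sanity-checked with NumPy. The snippet below only lists the stored keys and entry shapes, since the exact internal layout of graph_data.npz may differ between HamGNN versions (a minimal sketch):

import numpy as np

# graph_data.npz stores pickled Python objects, hence allow_pickle=True
data = np.load("graph_data.npz", allow_pickle=True)
print("stored keys:", data.files)
for key in data.files:
    entry = data[key]
    print(key, type(entry), getattr(entry, "shape", None))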

SIESTA/HONPAS Process

  1. Structure Conversion: Edit the poscar2siesta.py script with poscar_path and filepath parameters, then run:
    python poscar2siesta.py
  2. DFT Calculation: Use HONPAS to perform DFT calculations and obtain the .HSX Hamiltonian matrix files.
  3. Non-self-consistent Hamiltonian Calculation: Run the following command to generate non-self-consistent Hamiltonian and overlap matrices:
    mpirun -np Ncores honpas_1.2_H0 < input.fdf
    This will generate an overlap.HSX file.
  4. Data Packaging: Modify parameters in the graph_data_gen_siesta.py script, then run:
    python graph_data_gen_siesta.py

ABACUS Process

  1. Structure Conversion: Use poscar2abacus.py to generate ABACUS input files.
  2. DFT Calculation and Post-processing: Run ABACUS calculations and use abacus_postprocess to extract H0 matrices.
  3. Data Packaging: Use the graph_data_gen_abacus.py script to integrate data and generate graph_data.npz.

5. Model Training Process

HamGNN training typically consists of two phases: primary training (Hamiltonian optimization) and optional secondary training (including band energy optimization).

Training Mode Classification

  1. Primary Training:
    • Uses only Hamiltonian loss function to train the model
    • Trains until Hamiltonian error reaches about 10^-5 Hartree
    • If accuracy already meets requirements, secondary training may not be necessary
  2. Secondary Training (optional):
    • Based on primary training, adds band energy loss function
    • Uses a smaller learning rate to fine-tune the model
    • Improves model performance in band structure prediction
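
As a concrete example of the weighting, the sample config below uses a Hamiltonian loss weight of 27.211 with a commented band-energy weight of 0.27211, i.e. a ratio of 0.01, at the upper end of the 0.001~0.01 range recommended later in this document.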

Training Commands

HamGNN --config config.yaml > log.out 2>&1

For cluster environments, job scheduling systems (such as SLURM) can be used to submit training tasks. Example script:

#!/bin/bash
#SBATCH --job-name=HamGNN_train
#SBATCH --partition=gpu
#SBATCH --time=999:00:00
#SBATCH --mem=100G
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=8
#SBATCH --gres=gpu:1
export OMP_NUM_THREADS=8
source /path/to/your/miniconda3/bin/activate your_env_name
HamGNN --config ./config.yaml > log1.out 2>&1

Key Configuration Item Descriptions

Below are descriptions of key configuration items in config.yaml:

  1. dataset_params:
    dataset_params:
      batch_size: 1  # Number of samples processed per batch
      test_ratio: 0.1  # Test set ratio
      train_ratio: 0.8  # Training set ratio
      val_ratio: 0.1  # Validation set ratio
      graph_data_path: './Examples/pentadiamond/'  # Directory containing the graph_data.npz file
  2. losses_metrics:
    losses_metrics:
      losses:  # Loss function definition
      - loss_weight: 27.211  # Hamiltonian loss weight
        metric: mae  # Mean absolute error
        prediction: Hamiltonian
        target: hamiltonian
      # Uncomment for secondary training
      #- loss_weight: 0.27211  # Band energy loss weight (typically 0.001~0.01 of Hamiltonian weight)
      #  metric: mae
      #  prediction: band_energy
      #  target: band_energy
      metrics:  # Evaluation metric definition
      - metric: mae
        prediction: hamiltonian
        target: hamiltonian
      # Uncomment for secondary training
      #- metric: mae
      #  prediction: band_energy
      #  target: band_energy
  3. optim_params:
    optim_params:
      lr: 0.01  # Learning rate (recommended 0.01 for primary training, 0.0001 for secondary training)
      lr_decay: 0.5  # Learning rate decay rate
      lr_patience: 4  # Number of epochs to wait before adjusting learning rate
      gradient_clip_val: 0.0  # Gradient clipping value
      max_epochs: 3000  # Maximum number of training epochs
      min_epochs: 30  # Minimum number of training epochs
      stop_patience: 10  # Number of epochs to wait for early stopping
  4. setup:
    setup:
      GNN_Net: HamGNN_pre  # Type of network to use
      accelerator: null  # Accelerator type
      ignore_warnings: true  # Whether to ignore warnings
      checkpoint_path: /path/to/ckpt  # Checkpoint path
      load_from_checkpoint: false  # Whether to load model parameters from checkpoint
      resume: false  # Whether to continue training from interruption
      num_gpus: [0]  # GPU device numbers to use, null indicates CPU
      precision: 32  # Computation precision (32 or 64 bit)
      property: Hamiltonian  # Type of physical quantity output
      stage: fit  # Stage: fit (training) or test (testing)
  5. output_nets:
    output_nets:
      output_module: HamGNN_out
      HamGNN_out:
        ham_type: openmx  # Type of Hamiltonian to fit: openmx or abacus
        nao_max: 19  # Maximum number of atomic orbitals (14/19/26 for openmx)
        add_H0: true  # Whether to add non-self-consistent Hamiltonian
        symmetrize: true  # Whether to apply Hermitian constraints to Hamiltonian
        calculate_band_energy: false  # Whether to calculate bands (set to true for secondary training)
        #soc_switch: false  # Whether to fit SOC Hamiltonian
        # Parameters used in secondary training
        #num_k: 4  # Number of k-points used for band calculation
        #band_num_control: 8  # Number of orbitals considered in band calculation
        #k_path: null # Generate random k-points
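
Before launching a run, it can be worth verifying that config.yaml parses and contains the sections described above. A minimal sketch (section names as used in this README; adjust to your config version):

import yaml

with open("config.yaml") as f:
    cfg = yaml.safe_load(f)

# Top-level sections this README describes; missing ones are flagged
for section in ("setup", "dataset_params", "losses_metrics",
                "optim_params", "output_nets", "representation_nets"):
    keys = sorted((cfg.get(section) or {}).keys())
    print(f"{section}: {keys if keys else 'MISSING'}")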

Training Monitoring

Use TensorBoard to monitor the training process:

tensorboard --logdir train_dir --port=6006

When training on a remote server, you can access TensorBoard through an Xshell tunnel:

  1. In Xshell, click "Server->Properties->Tunneling", add a new tunnel
  2. Set source host to localhost, port to 16006
  3. Set target host to localhost, target port to 6006
  4. Access http://localhost:16006/ in your browser to view training progress
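
If you connect with plain ssh instead of Xshell, the equivalent tunnel is opened with ssh -L 16006:localhost:6006 user@server before visiting the same address.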

6. Model Prediction Operations

After completing model training, you can use the trained model to predict Hamiltonians for new structures.

Preparation Before Prediction

  1. Construct graph_data.npz for structures to be predicted:
    • Convert structures to the corresponding DFT software format
    • Use post-processing tools (such as openmx_postprocess) to generate overlap.scfout
    • Configure graph_data_gen.yaml, set scfout_file_name to 'overlap.scfout'
    • Run graph_data_gen to generate graph_data.npz
  2. Edit configuration file: Modify the following parameters in config.yaml:
    setup:
      checkpoint_path: /path/to/trained/model.ckpt  # Path to trained model
      num_gpus: null  # Set to null or 0 to use CPU for prediction
      stage: test  # Set to test for prediction mode
  3. Set environment variables: If running on CPU, you can accelerate with multithreading:
    export OMP_NUM_THREADS=64  # Set number of threads

Execute Prediction

HamGNN --config config.yaml > predict.log 2>&1

For large systems or batch predictions, it is recommended to use a job scheduling system to submit tasks:

#!/bin/bash
#SBATCH --partition=compute
#SBATCH --job-name=HamGNN_predict
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=64
#SBATCH --exclusive
export OMP_NUM_THREADS=64
source /path/to/your/miniconda3/bin/activate your_env_name
HamGNN --config ./config.yaml > predict.log 2>&1

Output Results

After prediction is completed, the following files will be generated in the directory specified by train_dir:

  • prediction_hamiltonian.npy: The predicted Hamiltonian matrix
  • target_hamiltonian.npy: The reference (DFT) Hamiltonian taken from the input; written only when graph_data.npz contains the true Hamiltonian
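
When target_hamiltonian.npy is present, the prediction error can be computed directly from these two files (a minimal sketch; the file names follow the list above, and this README quotes Hamiltonian errors in Hartree):

import numpy as np

pred = np.load("prediction_hamiltonian.npy")
target = np.load("target_hamiltonian.npy")

# Mean absolute error over all matrix elements
mae = np.mean(np.abs(pred - target))
print(f"Hamiltonian MAE: {mae:.3e}")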

7. Band Structure Calculation

Band structure calculation is used to verify the accuracy of predicted Hamiltonians, checking whether they can correctly reproduce the electronic band structure of materials.

Operation Process

  1. Edit configuration file: Create a band_cal.yaml file, specifying the following parameters:
    nao_max: 26
    graph_data_path: '/path/to/graph_data.npz' # Path to graph_data.npz
    hamiltonian_path: '/path/to/prediction_hamiltonian.npy'  # Path to Hamiltonian matrix
    nk: 120          # Number of k-points along the path
    save_dir: '/path/to/save/band/calculation/result' # Directory in which to save the results
    strcture_name: 'Si'  # Each structure is saved as strcture_name_idx.cif after the band calculation
    soc_switch: False
    spin_colinear: False
    auto_mode: True # With auto_mode, k_path and label can be omitted; the program generates them from the crystal symmetry
    k_path: [[0.,0.,-0.5],[0.,0.,0.0],[0.,0.,0.5]] # High-symmetry k-point path (used when auto_mode is False)
    label: ['$Mbar$','$G$','$M$'] # Labels for the k-points in k_path
  2. Run calculation:
    band_cal --config band_cal.yaml
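
For context, band energies in a non-orthogonal NAO basis satisfy the generalized eigenvalue problem H(k)C = E S(k)C, which is what a band-structure tool has to solve at each k-point. The toy sketch below illustrates the relation with random Hermitian matrices (not HamGNN's actual implementation):

import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(0)
n = 6  # toy basis size

A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
H = (A + A.conj().T) / 2            # Hermitian stand-in for H(k)

B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
S = B @ B.conj().T + n * np.eye(n)  # positive-definite stand-in for the overlap S(k)

# Solve H C = E S C; the ascending eigenvalues play the role of band energies
energies = eigh(H, S, eigvals_only=True)
print(energies)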

Band Calculation for Large Systems

For large systems, a parallel version of the band calculation tool band_cal_parallel can be used:

  1. Install parallel tools:
    pip install mpitool-0.0.1-cp39-cp39-manylinux1_x86_64.whl
    pip install band_cal_parallel-0.1.12-py3-none-any.whl
  2. Run parallel calculation:
    mpirun -np N_CORES band_cal_parallel --config band_cal_parallel.yaml

Note: Some MKL environments may encounter the Intel MKL FATAL ERROR: Cannot load symbol MKLMPI_Get_wrappers error. Refer to GitHub Issues #18 and #12 for solutions.

8. Introduction to and Usage of Uni-HamGNN Universal Model

Model Introduction

Uni-HamGNN is a universal spin-orbit coupling Hamiltonian model designed to accelerate quantum material discovery. The model addresses the major challenge of modeling spin-orbit coupling (SOC) effects in various complex systems, which traditionally requires computationally expensive density functional theory (DFT) calculations.

Uni-HamGNN eliminates the need for system-specific retraining and costly SOC-DFT calculations, making high-throughput screening of quantum materials possible across systems of different dimensions. This makes it a powerful tool for quantum material design and property research, significantly accelerating the pace of discovery in condensed matter physics and materials science.

Input Requirements

The universal SOC Hamiltonian model requires two graph_data.npz files as input data:

  • One file in non-SOC mode
  • One file in SOC mode

The preparation method for these files is the same as described in the previous sections, but note the parameter differences between SOC and non-SOC modes.

Usage Process

  1. Prepare input data:
    • Convert structures to be predicted into two graph_data.npz files (non-SOC and SOC modes)
    • Place these files in the specified directories
  2. Configure parameters: Edit the Input.yaml configuration file:
    # HamGNN prediction configuration
    model_pkl_path: '/path/to/universal_model.pkl'
    non_soc_data_dir: '/path/to/non_soc_graph_data'
    soc_data_dir: '/path/to/soc_graph_data'  # Optional, only needed for SOC calculations
    output_dir: './results'
    device: 'cuda'  # Use GPU, fall back to CPU if not available
    calculate_mae: true  # Calculate mean absolute error
  3. Run prediction:
    python Uni-HamiltonianPredictor.py --config Input.yaml

Output Results

After execution, the script will generate the following files in the specified output_dir:

  • hamiltonian.npy: Predicted Hamiltonian matrix in NumPy array format
  • If calculate_mae is enabled, MAE statistics will be printed to the console

9. HamGNN Parameter Details

This section provides detailed explanations of the parameter modules and parameters in the config.yaml configuration file.

setup (Basic Settings)

| Parameter | Type | Description | Default/Recommended Value |
| --- | --- | --- | --- |
| GNN_Net | string | Type of GNN network to use | HamGNN_pre for normal Hamiltonian fitting, HamGNN_pre_charge for charged-defect Hamiltonian fitting |
| accelerator | null or string | Accelerator type | null |
| ignore_warnings | boolean | Whether to ignore warnings | true |
| checkpoint_path | string | Checkpoint path for resuming training or for testing | No default value, must be set manually |
| load_from_checkpoint | boolean | Whether to load model parameters from the checkpoint | false (new training), true (when loading a pre-trained model) |
| resume | boolean | Whether to continue training from the last interruption | false (new training), true (to continue training) |
| num_gpus | null, integer, or list | Number or IDs of GPUs to use | null (CPU), [0] (use first GPU) |
| precision | integer | Computation precision | 32 (32-bit), optionally 64 (64-bit) |
| property | string | Type of physical quantity output by the network | Hamiltonian |
| stage | string | Execution stage | fit (training), test (testing/prediction) |

dataset_params (Dataset Parameters)

| Parameter | Type | Description | Default/Recommended Value |
| --- | --- | --- | --- |
| batch_size | integer | Number of samples processed per batch | 1 (typically kept at 1) |
| test_ratio | float | Proportion of the dataset used as the test set | 0.1 (10%) |
| train_ratio | float | Proportion of the dataset used as the training set | 0.8 (80%) |
| val_ratio | float | Proportion of the dataset used as the validation set | 0.1 (10%) |
| graph_data_path | string | Directory containing the processed graph data file | No default value, must be set manually |

losses_metrics (Loss Functions and Evaluation Metrics)

| Parameter | Type | Description | Default/Recommended Value |
| --- | --- | --- | --- |
| losses | list | List of loss function definitions | Must include at least the Hamiltonian loss |
| losses[].loss_weight | float | Loss function weight | 27.211 (Hamiltonian), 0.27211 (band energy) |
| losses[].metric | string | Loss calculation method | mae (mean absolute error), optionally mse (mean squared error) or rmse (root mean squared error) |
| losses[].prediction | string | Prediction output | Hamiltonian or band_energy |
| losses[].target | string | Target data | hamiltonian or band_energy |
| metrics | list | List of evaluation metric definitions | Usually the same as losses |

optim_params (Optimizer Parameters)

| Parameter | Type | Description | Default/Recommended Value |
| --- | --- | --- | --- |
| lr | float | Learning rate | 0.01 (primary training), 0.0001 (secondary training) |
| lr_decay | float | Learning rate decay factor | 0.5 |
| lr_patience | integer | Epochs to wait before triggering learning rate decay | 4 |
| gradient_clip_val | float | Gradient clipping value | 0.0 (no clipping) |
| max_epochs | integer | Maximum number of training epochs | 3000 |
| min_epochs | integer | Minimum number of training epochs | 30 |
| stop_patience | integer | Epochs to wait before early stopping | 10 |

output_nets.HamGNN_out (Output Network Parameters)

| Parameter | Type | Description | Default/Recommended Value |
| --- | --- | --- | --- |
| ham_type | string | Type of Hamiltonian to fit | openmx (OpenMX Hamiltonian), abacus (ABACUS Hamiltonian) |
| nao_max | integer | Maximum number of atomic orbitals | 14 (short-period elements), 19 (common elements), 26 (all elements supported by OpenMX); 27 or 40 for ABACUS |
| add_H0 | boolean | Whether to add the non-self-consistent H0 to the predicted Hamiltonian | true |
| symmetrize | boolean | Whether to apply Hermitian constraints to the Hamiltonian | true |
| calculate_band_energy | boolean | Whether to calculate bands for band training | false (primary training), true (secondary training) |
| num_k | integer | Number of k-points used for band calculation | 4 |
| band_num_control | integer, dict, or null | Number of orbitals considered in band calculation | 8 (VBM±8 bands), dict (basis number per atom type), null (all bands) |
| soc_switch | boolean | Whether to fit the SOC Hamiltonian | false |
| nonlinearity_type | string | Type of non-linear activation function | gate |
| zero_point_shift | boolean | Whether to apply a zero-point potential correction to the Hamiltonian matrix | true |
| spin_constrained | boolean | Whether to constrain spin | false |
| collinear_spin | boolean | Whether the spin is collinear | false |
| minMagneticMoment | float | Minimum magnetic moment | 0.5 |

representation_nets.HamGNN_pre (Representation Network Parameters)

| Parameter | Type | Description | Default/Recommended Value |
| --- | --- | --- | --- |
| cutoff | float | Cutoff radius for interatomic distances | 26.0 |
| cutoff_func | string | Type of distance cutoff function | cos (cosine function), optionally pol (polynomial function) |
| edge_sh_normalization | string | Normalization method for edge spherical harmonics | component |
| edge_sh_normalize | boolean | Whether to normalize edge spherical harmonics | true |
| irreps_edge_sh | string | Spherical harmonic representation of edges | 0e + 1o + 2e + 3o + 4e + 5o |
| irreps_node_features | string | O(3) irreducible representation of initial atomic features | 64x0e+64x0o+32x1o+16x1e+12x2o+25x2e+18x3o+9x3e+4x4o+9x4e+4x5o+4x5e+2x6e |
| num_layers | integer | Number of interaction/orbital convolution layers | 3 |
| num_radial | integer | Number of Bessel bases | 64 |
| num_types | integer | Maximum number of atom types | 96 |
| rbf_func | string | Type of radial basis function | bessel |
| set_features | boolean | Whether to set features | true |
| radial_MLP | list | Hidden layer sizes of the radial multilayer perceptron | [64, 64] |
| use_corr_prod | boolean | Whether to use the correlation product | false |
| correlation | integer | Correlation order | 2 |
| num_hidden_features | integer | Number of hidden features | 16 |
| use_kan | boolean | Whether to use the KAN activation function | false |
| radius_scale | float | Radius scaling factor | 1.01 |
| build_internal_graph | boolean | Whether to build an internal graph | false |

Parameter Adjustment Recommendations

  1. Initial Configuration:
    • When using for the first time, it is recommended to first use default parameters for primary training, then adjust based on needs after observing the effect
  2. Learning Rate Adjustment:
    • For primary training, set initial learning rate to 0.01
    • For secondary training, set initial learning rate to 0.0001
    • If training is unstable, lower the learning rate; if convergence is too slow, appropriately increase the learning rate
  3. Network Depth Adjustment:
    • For simple systems, num_layers=3 is usually sufficient
    • For complex systems, try increasing num_layers to 4-5
  4. Orbital Number Adjustment:
    • nao_max must be chosen according to the maximum number of atomic orbitals among the elements in the system
    • Use 14 for short-period elements (such as C, Si, O, etc.)
    • Use 19 for most common elements
    • Use 26 for all elements supported by OpenMX
  5. Cutoff Radius Adjustment:
    • The default cutoff of 26.0 is suitable for most systems; too small a cutoff can degrade accuracy
    • Note that cutoff only controls the decay factor of interatomic interactions; it does not affect the graph structure, i.e., it does not change the number of edges
  6. Loss Function Weight Adjustment:
    • In secondary training, the loss function weight for band_energy is typically set to 0.001~0.01 times that of hamiltonian
    • If band fitting effect is poor, the weight of band_energy can be appropriately increased, but should not be too large to avoid affecting the prediction accuracy of the Hamiltonian
  7. Minimum Irreps for Node and Edge Features in config.yaml
    • Different atomic orbital basis sets used to expand the Hamiltonian matrix may require different combinations of $l$ in the equivariant features, and the number of channels may also vary. If the basis set contains f orbitals, the maximum value of $l$ in the equivariant features is as high as 6. The minimum equivariant feature settings can be determined with the following code. For example, for an $s^3p^2d^2$ basis set, the minimum irreducible representations are 17x0e+20x1o+8x1e+8x2o+20x2e+8x3o+4x3e+4x4e:
      from e3nn import o3
      
      # Basis s^3 p^2 d^2: three l=0 (even), two l=1 (odd), two l=2 (even) orbitals
      row = col = o3.Irreps("1x0e+1x0e+1x0e+1x1o+1x1o+1x2e+1x2e")
      ham_irreps = o3.Irreps()
      
      # Each Hamiltonian block couples an orbital pair (li, lj); its tensor product
      # decomposes into irreps with L = |li-lj| ... li+lj and parity (-1)^(li+lj)
      for _, li in row:
          for _, lj in col:
              for L in range(abs(li.l - lj.l), li.l + lj.l + 1):
                  ham_irreps += o3.Irrep(L, (-1) ** (li.l + lj.l))
      
      print(ham_irreps.sort()[0].simplify())
      # Output: 17x0e+20x1o+8x1e+8x2o+20x2e+8x3o+4x3e+4x4e
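
      If the basis set also contains an f orbital, adding 1x3o to the Irreps string raises the maximum L in the output to 6 (from the f×f product), consistent with the note above.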
      

References

Papers related to HamGNN:

  1. Transferable equivariant graph neural networks for the Hamiltonians of molecules and solids
  2. Universal Machine Learning Kohn-Sham Hamiltonian for Materials
  3. A Universal Spin-Orbit-Coupled Hamiltonian Model for Accelerated Quantum Material Discovery
  4. Accelerating the electronic-structure calculation of magnetic systems by equivariant neural networks
  5. Topological interfacial states in ferroelectric domain walls of two-dimensional bismuth
  6. Transferable Machine Learning Approach for Predicting Electronic Structures of Charged Defects
  7. Advancing nonadiabatic molecular dynamics simulations in solids with E(3) equivariant deep neural hamiltonians
  8. Revealing higher-order topological bulk-boundary correspondence in bismuth crystal with spin-helical hinge state loop and proximity superconductivity
  9. Silicon Nanowire Gate‐All‐Around Cold Source MOSFET With Ultralow Power Dissipation: A Machine‐Learning‐Hamiltonian Accelerated Design

Code contributors:

  • Yang Zhong (Fudan University)
  • Changwei Zhang (Fudan University)
  • Zhenxing Dai (Fudan University)
  • Shixu Liu (Fudan University)
  • Hongyu Yu (Fudan University)
  • Yuxing Ma (Fudan University)

Project leaders:

  • Hongjun Xiang (Fudan University)
  • Xingao Gong (Fudan University)
