The Livermore Big Artificial Neural Network toolkit (LBANN) is an open-source, HPC-centric, deep learning training framework that is optimized to compose multiple levels of parallelism.
LBANN provides model-parallel acceleration through domain decomposition to optimize for strong scaling of network training. It also allows model parallelism to be composed with both data parallelism and ensemble training methods for training large neural networks with massive amounts of data. LBANN is able to take advantage of tightly-coupled accelerators, low-latency high-bandwidth networking, and high-bandwidth parallel file systems.
DGraph is a deep learning library, built on top of PyTorch, for training graph neural networks at scale.
To install DGraph, clone the repository and install with pip:
pip install -e .[ogb]
To run the tests, use the following command:
python -m pytest tests/
DGraph requires the following packages:
- PyTorch >= 2.1.0
- NumPy
- pytest
- mpi4py
For the full list of requirements, see requirements.txt.
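Before installing, it can help to confirm that the Python packages listed above are present in the active environment. The following is a minimal sketch, not part of DGraph: the helper names (`parse_version`, `meets_minimum`, `check_requirements`) are hypothetical, and it assumes the packages are registered under their usual PyPI distribution names (`torch` for PyTorch). The version comparison is deliberately simple: it looks only at the leading numeric components, so a pre-release such as `2.1.0rc1` is treated as `2.1.0`.

```python
import re
from importlib import metadata

# Minimum versions taken from the list above; None means any version is accepted.
# Assumption: PyTorch is installed under the distribution name "torch".
REQUIREMENTS = {
    "torch": "2.1.0",
    "numpy": None,
    "pytest": None,
    "mpi4py": None,
}


def parse_version(text):
    """Return the leading numeric release components, e.g. '2.1.0+cu121' -> (2, 1, 0)."""
    release = text.split("+", 1)[0]  # drop a local build tag such as '+cu121'
    parts = []
    for piece in release.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component (e.g. 'dev2023')
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """True if `installed` is at least `minimum` (or if no minimum is given)."""
    if minimum is None:
        return True
    have, need = parse_version(installed), parse_version(minimum)
    width = max(len(have), len(need))
    # Pad with zeros so that '2.1' compares equal to '2.1.0'.
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need


def check_requirements(requirements=REQUIREMENTS):
    """Map each distribution name to 'ok', 'missing', or a 'too old' message."""
    report = {}
    for name, minimum in requirements.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        if meets_minimum(installed, minimum):
            report[name] = "ok"
        else:
            report[name] = f"too old ({installed} < {minimum})"
    return report


if __name__ == "__main__":
    for name, status in check_requirements().items():
        print(f"{name}: {status}")
```

Running the script prints one line per package. Anything reported as missing or too old should be installed or upgraded before running the tests.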
DGraph also requires the following libraries:
- NCCL
- NVSHMEM
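Whether the installed PyTorch build can use NCCL can be checked from Python. The sketch below is an illustration rather than a DGraph utility: `nccl_status` is a hypothetical helper name, and it relies only on `torch.distributed.is_available()` and `torch.distributed.is_nccl_available()`. It does not check NVSHMEM, which has to be verified separately on the system.

```python
def nccl_status():
    """Report whether the installed PyTorch build supports the NCCL backend.

    Returns None if PyTorch cannot be imported, otherwise True or False.
    """
    try:
        import torch.distributed as dist
    except ImportError:
        return None
    # The distributed package may be compiled out of some PyTorch builds.
    if not dist.is_available():
        return False
    return bool(dist.is_nccl_available())


if __name__ == "__main__":
    status = nccl_status()
    if status is None:
        print("PyTorch is not installed")
    elif status:
        print("NCCL backend is available")
    else:
        print("NCCL backend is not available in this PyTorch build")
```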
A list of publications, presentations, and posters is available here.
Issues, questions, and bugs can be raised on the GitHub issue tracker.