A PyTorch implementation of Bayesian Flow Networks
Currently use of a non causal version of LLAMA2. Currently training TinyStories Models matching https://github.com/karpathy/llama2.c/tree/master
I am going to be using this repository to explore training dynamics of this new class of models. I will maintain a minimal implimentation in the Minimal.ipynd for a simple BFN implimentation. Everything else is an early work in progress.
- Discrete model with continuous-time loss, training and sampling (completed)
- SOTA performance on XOR dataset
- Tiny Stories 15m LLAMA2 Initial Code
- Tiny Stories weights (training)
- Wiki Text8 Dataset
- Bayesian Flow GPT-2 Scale
- Fancy Visuals