cjg91/trans-fat

trans-fat

An FPGA Accelerator for Transformer Inference

We accelerate a BERT layer across two FPGAs, partitioned into four pipeline stages. We apply three levels of optimization using Vitis HLS and report the runtime of each. The accelerator implements a transformer layer of standard BERT size with a sequence length of 128, which is configurable.

Instructions

This repository is designed to run on a host node with at least two Xilinx U200s. The instructions provided are specific to the Pitt CRC fpga-n0 node, but they may be adapted as needed for other nodes.

Dependencies

The required dependencies can be loaded using the following commands.

module load xilinx/vitis/2020.2
module load libfaketime
source /opt/xilinx/xrt/setup.sh

Building

All building is performed in the fpga/ directory. Navigate there and enter the following command.

faketime 'last year' make all TARGET=<hw, hw_emu, sw_emu> VERSION=<0, 1, 2, 3> PART=<fpga1, fpga2, all> JOBS=<# of jobs requested>

If building for hardware, the output artifacts will automatically be copied into /builds/v#/fpga#/.

Running

To run all parts, enter make test VERSION=<0, 1, 2, 3> PART=all in the fpga/ directory.

Individual fpga builds can be run directly using the host and executable in the desired builds/ directory.

Optimization Versions

v0

  • None

v1

  • Linear layer tiling
  • Buffering of input and output data
  • Unrolling of multiplication inner loops
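The v1 ideas can be sketched in plain C++ as follows. This is only an illustration under stated assumptions, not the repository's kernel: the function names, the tile sizes TILE_I and TILE_J, and the row-major layout are all hypothetical. The small local arrays stand in for on-chip buffers, and the fixed-trip-count inner loops are the kind of loops an HLS tool could unroll.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical tile sizes, chosen only for illustration.
constexpr std::size_t TILE_I = 4;
constexpr std::size_t TILE_J = 8;

// Reference: out[i][j] = bias[j] + sum_k in[i][k] * w[k][j] (row-major).
void linear_naive(const std::vector<float>& in, const std::vector<float>& w,
                  const std::vector<float>& bias, std::vector<float>& out,
                  std::size_t rows, std::size_t inner, std::size_t cols) {
    for (std::size_t i = 0; i < rows; ++i) {
        for (std::size_t j = 0; j < cols; ++j) {
            float acc = bias[j];
            for (std::size_t k = 0; k < inner; ++k)
                acc += in[i * inner + k] * w[k * cols + j];
            out[i * cols + j] = acc;
        }
    }
}

// Tiled variant: each TILE_I x TILE_J output tile is accumulated in a
// local buffer and written back once, after the full k reduction.
void linear_tiled(const std::vector<float>& in, const std::vector<float>& w,
                  const std::vector<float>& bias, std::vector<float>& out,
                  std::size_t rows, std::size_t inner, std::size_t cols) {
    for (std::size_t i0 = 0; i0 < rows; i0 += TILE_I) {
        for (std::size_t j0 = 0; j0 < cols; j0 += TILE_J) {
            float acc[TILE_I][TILE_J];  // output tile buffer
            for (std::size_t ii = 0; ii < TILE_I; ++ii)
                for (std::size_t jj = 0; jj < TILE_J; ++jj)
                    acc[ii][jj] = (j0 + jj < cols) ? bias[j0 + jj] : 0.0f;

            for (std::size_t k = 0; k < inner; ++k) {
                float in_buf[TILE_I];  // buffered input values for this k
                float w_buf[TILE_J];   // buffered weight values for this k
                for (std::size_t ii = 0; ii < TILE_I; ++ii)
                    in_buf[ii] = (i0 + ii < rows) ? in[(i0 + ii) * inner + k] : 0.0f;
                for (std::size_t jj = 0; jj < TILE_J; ++jj)
                    w_buf[jj] = (j0 + jj < cols) ? w[k * cols + j0 + jj] : 0.0f;

                // Fixed trip counts: candidates for full unrolling in HLS.
                for (std::size_t ii = 0; ii < TILE_I; ++ii)
                    for (std::size_t jj = 0; jj < TILE_J; ++jj)
                        acc[ii][jj] += in_buf[ii] * w_buf[jj];
            }

            // Write the finished tile back, skipping padded elements.
            for (std::size_t ii = 0; ii < TILE_I; ++ii)
                for (std::size_t jj = 0; jj < TILE_J; ++jj)
                    if (i0 + ii < rows && j0 + jj < cols)
                        out[(i0 + ii) * cols + j0 + jj] = acc[ii][jj];
        }
    }
}
```

Both functions take the same arguments, so linear_naive can serve as a correctness check for linear_tiled on shapes that are not multiples of the tile size.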

v2

  • Transpose the A input of the matmul
  • Cache a line of A.T
  • Increase tile size in j dimension
  • Unrolling of computation in attention heads
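One plausible reading of the transpose and line-caching items above is sketched below in plain C++. It is an assumption about the technique, not the repository's code: the names transpose and matmul_at, and the row-major layouts, are hypothetical. With A stored transposed, the values needed for one reduction index k sit contiguously, so a single line of A.T can be loaded once and reused across every output column j.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Returns the transpose of a row-major rows x cols matrix.
std::vector<float> transpose(const std::vector<float>& a,
                             std::size_t rows, std::size_t cols) {
    std::vector<float> t(a.size());
    for (std::size_t i = 0; i < rows; ++i)
        for (std::size_t j = 0; j < cols; ++j)
            t[j * rows + i] = a[i * cols + j];
    return t;
}

// Computes C = A * B given A.T (inner x rows), B (inner x cols), both row-major.
// For each k, one contiguous line of A.T is cached and reused for all j.
void matmul_at(const std::vector<float>& a_t, const std::vector<float>& b,
               std::vector<float>& c,
               std::size_t rows, std::size_t inner, std::size_t cols) {
    std::fill(c.begin(), c.end(), 0.0f);
    std::vector<float> a_line(rows);  // cached line of A.T
    for (std::size_t k = 0; k < inner; ++k) {
        for (std::size_t i = 0; i < rows; ++i)
            a_line[i] = a_t[k * rows + i];  // contiguous read
        for (std::size_t j = 0; j < cols; ++j) {
            const float b_kj = b[k * cols + j];
            for (std::size_t i = 0; i < rows; ++i)
                c[i * cols + j] += a_line[i] * b_kj;
        }
    }
}
```

In this formulation the innermost loop over i has no dependence between iterations, which is the property that makes it a reasonable candidate for unrolling or for a wider tile.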

v3

  • Stream DDR inputs/outputs in linear layers
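The streaming structure can be modeled in software as a read stage, a compute stage, and a write stage connected by FIFOs. The sketch below is a sequential stand-in only: std::queue takes the place of a hardware stream, the per-element scale-and-bias is a placeholder for the real linear-layer work, and every name in it is hypothetical. In hardware the three stages would run concurrently rather than one after another.

```cpp
#include <cstddef>
#include <queue>
#include <vector>

// Stage 1: read every element from "DDR" (a vector here) into a FIFO.
void read_stage(const std::vector<float>& ddr, std::queue<float>& out) {
    for (float v : ddr) out.push(v);
}

// Stage 2: consume one element at a time and emit one result.
// The scale-and-bias operation is a placeholder for the real computation.
void compute_stage(std::queue<float>& in, std::queue<float>& out,
                   float scale, float bias) {
    while (!in.empty()) {
        out.push(in.front() * scale + bias);
        in.pop();
    }
}

// Stage 3: drain the result FIFO back into "DDR".
void write_stage(std::queue<float>& in, std::vector<float>& ddr) {
    while (!in.empty()) {
        ddr.push_back(in.front());
        in.pop();
    }
}

// Chains the three stages through two FIFOs.
std::vector<float> run_pipeline(const std::vector<float>& input,
                                float scale, float bias) {
    std::queue<float> s_in, s_out;
    std::vector<float> result;
    read_stage(input, s_in);
    compute_stage(s_in, s_out, scale, bias);
    write_stage(s_out, result);
    return result;
}
```

Separating memory access from computation this way means the compute stage never indexes DDR directly, which is the property a streaming design relies on.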

Results

Latency (ms)

Version      fpga1      fpga2        all
v0         4723.71   10950.90   15676.30
v1          274.98     120.91     397.45
v2           48.36      95.60     145.27
v3           35.03      71.76     110.99
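To compare versions, the latencies above can be turned into speedups. The short C++ sketch below assumes v0 is the baseline, which is a choice made here for illustration; the struct and function names are hypothetical, and the numbers are copied from the "all" column.

```cpp
#include <array>

// End-to-end ("all") latency per version, in milliseconds, from the table.
struct VersionLatency {
    const char* name;
    double all_ms;
};

constexpr std::array<VersionLatency, 4> kLatencies = {{
    {"v0", 15676.30},
    {"v1", 397.45},
    {"v2", 145.27},
    {"v3", 110.99},
}};

// Speedup of an optimized run relative to a baseline run.
double speedup(double baseline_ms, double optimized_ms) {
    return baseline_ms / optimized_ms;
}

// Speedup of version index v (0..3) relative to v0.
double speedup_vs_v0(std::size_t v) {
    return speedup(kLatencies[0].all_ms, kLatencies[v].all_ms);
}
```

By this measure v3 is roughly 141 times faster than v0 end to end.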
