Wupper is designed by Nikhef (Amsterdam, The Netherlands) for the CERN ATLAS / FELIX project. Its main purpose is to provide a simple Direct Memory Access (DMA) interface to the Xilinx Virtex-7 PCIe Gen3 hard block. Wupper is specifically designed for the 256-bit-wide AXI4-Stream interface of the Xilinx Virtex-7 FPGA Gen3 Integrated Block for PCI Express (PCIe). Wupper has also been successfully ported to Xilinx Kintex UltraScale FPGAs.
The main purpose of Wupper is therefore to provide an interface to standard FIFOs. This is done by the DMA_read_write block in the diagram above. The read/write FIFOs have the same width as the Xilinx AXI4-Stream interface (256 bits for PCIe Gen3 and 512 bits for PCIe Gen4) and run at 250 MHz. The application side of the FPGA design can simply read or write the FIFOs. Wupper handles the transfer to host PC memory, according to the addresses specified in the DMA descriptors.
Another function of Wupper is thus to manage a set of DMA descriptors. Each descriptor consists of an address, a read/write flag, the transfer size (number of 32-bit words) and an enable line. Descriptors are handled by the DMA_control block and are mapped as normal PCIe memory or IO registers. Besides the descriptors and the enable line (one per descriptor), the register map provides a status register for every descriptor.
Besides DMA-specific functions, the DMA_control block can also handle generic control and monitor registers for the user application.
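As an illustration of the descriptor fields listed above, the following is a minimal sketch in Python. It is not the Wupper register layout: the class name `DmaDescriptor`, the field names, the helper `words_for` and the polarity of the read/write flag are all assumptions made for the example. The only fact taken from the text is that the transfer size is counted in 32-bit words.

```python
from dataclasses import dataclass

WORD_BYTES = 4  # the transfer size is expressed in 32-bit words


@dataclass
class DmaDescriptor:
    """Hypothetical software model of one descriptor (not the real register map)."""
    address: int           # host PC memory address used for the transfer
    is_write: bool         # read/write flag; which direction "write" means is an assumption
    size_words: int        # transfer size in 32-bit words
    enabled: bool = False  # enable line, one per descriptor

    def size_bytes(self) -> int:
        """Transfer size converted from 32-bit words to bytes."""
        return self.size_words * WORD_BYTES


def words_for(num_bytes: int) -> int:
    """Number of 32-bit words needed to cover num_bytes, rounded up."""
    return (num_bytes + WORD_BYTES - 1) // WORD_BYTES


# Describe a 4 KiB buffer at an arbitrary example address.
desc = DmaDescriptor(address=0x1000_0000, is_write=True, size_words=words_for(4096))
print(desc.size_words, desc.size_bytes())
```

The real field widths, register offsets and flag encoding are given in the register map documentation distributed with Wupper.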
Wupper is provided with a generic MSI-X compatible interrupt controller.
- For synthesis and implementation of the cores, it is recommended to use Xilinx Vivado 2020.1.
- Other IP cores (clock wizard and PCIe) are provided in the Xilinx .xci format, and the constraints file (.xdc) is in the Vivado 2020.1 format. FIFO cores use XPM macros.
- For Versal devices, Vivado 2020.1 is used as well, but there may be better support in later Vivado versions. We will upgrade later.
- The Virtex UltraScale+ VU9P works with PCIe Gen4; however, this was officially dropped by Xilinx. To use these devices with Gen4, Vivado 2018.1 has to be used.
For portability reasons, no Xilinx project files are supplied with Wupper. Instead, a bundle of TCL scripts is supplied to create a project and import all necessary files, as well as to run synthesis and implementation. These scripts are described in detail in the /documentation/wupper.pdf distributed with Wupper.
Wupper has been tested on a VCU128 FPGA in a Gigabyte X570 AORUS PRO motherboard. Four conditions must be met to test the device:
- Bifurcation enabled in BIOS with x8x8 configuration.
- IOMMU disabled in BIOS.
- VCU128 in PCIEX16 slot.
- No card on PCIEX8 slot.
Two devices appear on the PCIe bus for the VCU128. The wupper driver creates two files, /dev/wupper0 and /dev/wupper1. Each of them corresponds to a PCIe device, connected in the FPGA to a pair of 512-bit-wide FIFOs (one for RX and one for TX). Together they form a 1024-bit communication path at 250 MHz, for a theoretical maximum burst of 256 Gbps, greater than the 252 Gbps theoretical maximum of PCIe Gen4 x16.
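The two bandwidth figures quoted above can be reproduced with a short calculation. The sketch below is only a worked check of the arithmetic. The function names are made up for the example, and the PCIe figure assumes the standard Gen4 values of 16 GT/s per lane with 128b/130b encoding, ignoring all protocol overhead.

```python
def fifo_path_gbps(width_bits: int, clock_mhz: float, num_paths: int = 1) -> float:
    """Raw throughput of num_paths parallel FIFO paths, in Gbit/s."""
    return width_bits * num_paths * clock_mhz * 1e6 / 1e9


def pcie_link_gbps(gt_per_s: float, lanes: int,
                   payload_bits: int = 128, coded_bits: int = 130) -> float:
    """Link rate after line encoding, before protocol overhead, in Gbit/s."""
    return gt_per_s * lanes * payload_bits / coded_bits


# Two PCIe devices, each with a 512-bit FIFO per direction, clocked at 250 MHz.
fifo = fifo_path_gbps(512, 250, num_paths=2)

# PCIe Gen4 x16: 16 GT/s per lane, 16 lanes, 128b/130b encoding.
link = pcie_link_gbps(16, 16)

print(f"FIFO side: {fifo:.0f} Gbps, PCIe Gen4 x16: {link:.1f} Gbps")
```

Because the FIFO side is slightly faster than the link, the FIFO interface is not the limiting factor in this estimate. Achievable throughput will be lower than both numbers once PCIe packet overhead is taken into account.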
>> Give comments and feedback using the official core thread on the OpenCores forum: forum_thread