This is the official repository for FPGA-QHAR, an efficient real-time HW/SW co-design of a quantized two-stream CNN for Human Action Recognition (HAR) on PYNQ SoC-FPGAs. The work has been accepted as a conference paper in the IEEE Xplore Digital Library under the title "FPGA-QHAR: Throughput-Optimized for Quantized Human Action Recognition on The Edge", and will be presented in December 2023 at the IEEE 20th International Conference on Smart Communities: Improving Quality of Life Using AI, Robotics and IoT.
The paper proposes an end-to-end, efficient, customized, and quantized two-stream HAR CNN architecture based on SimpleNet-PyTorch. The network is trained on the UCF101 and UCF24 datasets and implemented as a HW/SW co-design on AMD PYNQ SoC-FPGAs using a partially streaming dataflow architecture. It achieves real-time performance of 24 FPS with 81% prediction accuracy on a connected camera.
- Developed a scalable inference accelerator for QHAR on top of SimpleNet-PyTorch CNN & NetDBFPGA.
- The accelerator fuses each convolution, batch-norm, and ReLU sequence into a single homogeneous layer and uses the Lucas-Kanade optical flow method, so that the whole pipeline runs as an optimized on-chip compute engine on the FPGA. The GPU, CPU, and Jetson platforms do not offer this capability.
- Provided a complete open-source framework (training to implementation stack) for QHAR on SoC-FPGA and different hardware platforms.
- Demonstrated that using UCF24, the smaller version of UCF101, has a positive effect on accuracy, resource utilization, and throughput.
- The community can build upon our code to explore efficient implementations of multimodal fusion for comprehensive ADAS with HAR action understanding on low-power FPGAs, which are considered a solution for a wide range of autonomous applications.
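
The convolution, batch-norm, and ReLU fusion mentioned in the list above relies on the fact that, at inference time, batch normalization is a fixed per-channel affine transform that can be folded into the preceding convolution's weights and bias. The following is a minimal NumPy sketch of that general technique, not the repository's HLS implementation: the function names `fold_batchnorm` and `conv2d`, the tensor shapes, and the `eps` value are all assumptions chosen for illustration, and quantization is left out.

```python
import numpy as np

def fold_batchnorm(weight, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold inference-mode BatchNorm into the preceding convolution.

    weight: (out_ch, in_ch, kh, kw), bias: (out_ch,)
    gamma, beta, mean, var: per-output-channel BatchNorm parameters.
    (Hypothetical helper for illustration only.)
    """
    scale = gamma / np.sqrt(var + eps)  # per-channel multiplier
    fused_weight = weight * scale[:, None, None, None]
    fused_bias = (bias - mean) * scale + beta
    return fused_weight, fused_bias

def conv2d(x, weight, bias):
    """Naive stride-1 'valid' convolution, used only to check the folding."""
    out_ch, in_ch, kh, kw = weight.shape
    _, h, w = x.shape
    out = np.zeros((out_ch, h - kh + 1, w - kw + 1))
    for i in range(out.shape[1]):
        for j in range(out.shape[2]):
            patch = x[:, i:i + kh, j:j + kw]
            out[:, i, j] = np.tensordot(weight, patch, axes=3) + bias
    return out

# Random example data (assumed shapes: 3 input channels, 4 output channels).
rng = np.random.default_rng(0)
x = rng.normal(size=(3, 8, 8))
weight = rng.normal(size=(4, 3, 3, 3))
bias = rng.normal(size=4)
gamma, beta = rng.normal(size=4), rng.normal(size=4)
mean, var = rng.normal(size=4), rng.uniform(0.5, 2.0, size=4)

# Reference path: separate convolution -> BatchNorm -> ReLU.
y = conv2d(x, weight, bias)
y = (y - mean[:, None, None]) / np.sqrt(var[:, None, None] + 1e-5)
y = gamma[:, None, None] * y + beta[:, None, None]
reference = np.maximum(y, 0.0)

# Fused path: one convolution with folded parameters, then ReLU.
fused_weight, fused_bias = fold_batchnorm(weight, bias, gamma, beta, mean, var)
fused = np.maximum(conv2d(x, fused_weight, fused_bias), 0.0)

max_abs_diff = np.abs(reference - fused).max()
print("max abs difference:", max_abs_diff)
```

The two paths agree up to floating-point rounding, which is why a hardware layer only needs to store the folded weights and bias and apply the ReLU clamp on its output.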
- Local Nvidia GPU or Google Cloud GPU with Colab
- Linux Ubuntu 18.04
- Python 3.7+
- PyTorch v1.12.0+
- Vivado 2018.3
- PYNQ framework 2.6
- PYNQ-supported AMD SoC-FPGA (e.g., Kria KV260 or ZCU104)
- `PyTorch`: folder for training.
- `HLS`: folder for the synthesis of the accelerator.
- `PYNQ_Hardware`: folder for deployment on Xilinx SoC-FPGAs running PYNQ Linux.
All source code is made available under a BSD 3-clause license. You can freely use and modify the code, without warranty, so long as you provide attribution to the authors. See `LICENSE.md` for the full license text.
The manuscript has been accepted and will be published soon as a conference paper in the IEEE Xplore Digital Library.
TBD