简体中文 | English
- 🔥🔥🔥 The 2021 CCF BDCI Figure Skating Action Recognition Competition, with a 100-thousand bonus, is in progress! 🎉 PaddleVideo provides the baseline model ST-GCN, along with related tutorials, AI Studio projects, and a video course.
PaddleVideo is a toolset for video recognition, action localization, and spatio-temporal action detection, built for both industry and academia. This repository provides examples and best-practice guidelines for exploring deep learning algorithms in the video domain. We are devoted to supporting experiments and utilities that significantly reduce the "time to deploy". The project also serves as a proficiency verification and implementation of the newest PaddlePaddle 2.0 in the video field.
- Various datasets and models: PaddleVideo supports a wide range of datasets and models, including the Kinetics400, UCF101, YouTube-8M, and NTU-RGB+D datasets; video recognition models such as TSN, TSM, SlowFast, TimeSformer, AttentionLSTM, and ST-GCN; and action localization models such as BMN.
- Higher performance: PaddleVideo has built-in solutions to improve the accuracy of recognition models. PP-TSM, which is based on the standard TSM, achieves the best performance among 2D recognition networks: it has the same number of parameters as TSM but raises Top-1 accuracy to 76.16%.
- Faster training strategy: PaddleVideo supports faster training strategies, such as AMP training, distributed training, the Multigrid method for SlowFast, OP fusion, a faster reader, and so on.
- Deployable: PaddleVideo is powered by Paddle Inference. There is no need to convert the model to ONNX format when deploying it; everything you need can be found in this repository.
- Applications: PaddleVideo provides some interesting and practical projects implemented with video recognition and detection techniques, such as FootballAction and VideoTag.
Field | Model | Dataset | Metric | Score (%) |
---|---|---|---|---|
action recognition | PP-TSM | Kinetics-400 | Top-1 | 76.16 |
action recognition | PP-TSN | Kinetics-400 | Top-1 | 75.06 |
action recognition | AGCN | FSD | Top-1 | 62.29 |
action recognition | ST-GCN | FSD | Top-1 | 59.07 |
action recognition | TimeSformer | Kinetics-400 | Top-1 | 77.29 |
action recognition | SlowFast | Kinetics-400 | Top-1 | 75.84 |
action recognition | TSM | Kinetics-400 | Top-1 | 71.06 |
action recognition | TSN | Kinetics-400 | Top-1 | 69.81 |
action recognition | AttentionLSTM | YouTube-8M | Hit@1 | 89.05 |
action detection | BMN | ActivityNet | AUC | 67.23 |
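
Most rows in the table above report Top-1 accuracy: the percentage of clips for which the highest-scoring predicted class matches the ground-truth label. The following is a minimal sketch of that calculation in plain Python. It is not PaddleVideo's own metric code; the function name `top_k_accuracy` and the toy scores are assumptions made purely for illustration.

```python
from typing import Sequence


def top_k_accuracy(
    scores: Sequence[Sequence[float]], labels: Sequence[int], k: int = 1
) -> float:
    """Return the percentage of samples whose true label is among
    the k highest-scoring classes (hypothetical helper, for illustration)."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")
    hits = 0
    for row, label in zip(scores, labels):
        # Indices of the k classes with the highest scores for this clip.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in top_k
    return 100.0 * hits / len(labels)


# Toy example: 4 clips, 3 classes (made-up scores, not real model output).
scores = [
    [0.70, 0.20, 0.10],
    [0.10, 0.60, 0.30],
    [0.35, 0.25, 0.40],
    [0.50, 0.40, 0.10],
]
labels = [0, 1, 0, 1]

top1 = top_k_accuracy(scores, labels, k=1)
top2 = top_k_accuracy(scores, labels, k=2)
print(f"Top-1: {top1:.2f}%  Top-2: {top2:.2f}%")
```

In this toy data, two of the four clips have their true label ranked first, and all four have it within the top two. The AUC figure reported for BMN is a different kind of metric and is not covered by this sketch.
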
release/2.1 was released on May 20, 2021. Please refer to the release notes for details.
- Scan the QR code below with WeChat and reply "video" to join the official technical exchange group. We look forward to your participation.
- VideoTag: 3k Large-Scale video classification model
- FootballAction: Football action detection model
- Quick Start
- Model zoo
- Recognition
- Localization
- Skeleton-based action recognition
- Spatio-temporal action detection
- Coming Soon!
- ActBERT: Learning Global-Local Video-Text Representations
- Coming Soon!
- Tutorials and Slides
- Practice
- Others
- Contribute
PaddleVideo is released under the Apache 2.0 license.
This project welcomes contributions and suggestions. Please see our contribution guidelines.
- Many thanks to mohui37, zephyr-fun, and voipchina for contributing code.