A curated list of different papers and datasets in various areas of audio-visual processing
Updated Jan 30, 2024
ACM MM 2021: 'Is Someone Speaking? Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection'
This repo contains the official PyTorch implementation of: Diverse and Aligned Audio-to-Video Generation via Text-to-Video Model Adaptation
Implementation of "EPIC-Fusion: Audio-Visual Temporal Binding for Egocentric Action Recognition, ICCV, 2019" in PyTorch
An audio visualizer for React. Provides separate components to visualize both live audio and audio blobs.
Human emotion understanding using a multimodal dataset.
🎙 Waveform path generator for SVG 🎶
Libvisual Audio Visualization
Source code for "Sparse in Space and Time: Audio-visual Synchronisation with Trainable Selectors" (Spotlight at BMVC 2022)
Programmatic minimalistic audio visualizations.
Efficient synchronization from sparse cues
[CVPR 2023] Collecting Cross-Modal Presence-Absence Evidence for Weakly-Supervised Audio-Visual Event Perception
Audio-Visual Corruption Modeling from our CVPR 2023 paper "Watch or Listen: Robust Audio-Visual Speech Recognition with Visual Corruption Modeling and Reliability Scoring"
Official code for the WACV 2024 paper "Annotation-free Audio-Visual Segmentation"
[ECCV 2022] Joint-Modal Label Denoising for Weakly-Supervised Audio-Visual Video Parsing
Audio Visual Scene-Aware Dialog (AVSD) Challenge at the 10th Dialog System Technology Challenge (DSTC)
Transformer-based online speech recognition system with TensorFlow 2
Code for the CVPR 2021 paper "Exploring Heterogeneous Clues for Weakly-Supervised Audio-Visual Video Parsing"
Official source code of the INTERSPEECH 2023 paper: "Audio-Visual Speech Separation in Noisy Environments with a Lightweight Iterative Model" (AVLIT)