KawhiZhao/Visually-Assisted-Self-supervised-Audio-Speaker-Localization-and-Tracking

Visually Assisted Self-supervised Audio Speaker Localization and Tracking

Prerequisites

Python 3.6, PyTorch 1.7

Dataset

AV16.3 dataset: https://zenodo.org/record/4449274#.YrQ6v-yZPJ8

Feature Extraction

DSFD: https://github.com/Tencent/FaceDetection-DSFD

pytorch-segmentation: https://github.com/yassouali/pytorch-segmentation

GCC-PHAT computation: https://github.com/smartcameras/AV3T/tree/master/gcf

Data Preparation

Make sure to obtain the segmentation results for every image.
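
One way to verify that every image has a segmentation result is a small check like the sketch below. The directory layout, the file extensions, and the function name `missing_segmentations` are assumptions for illustration and are not taken from this repository; adapt them to wherever your frames and pytorch-segmentation outputs are stored.

```python
from pathlib import Path


def missing_segmentations(image_dir, seg_dir, image_ext=".jpg", seg_ext=".png"):
    """Return the sorted file stems of images that have no segmentation file.

    Assumes (hypothetically) that a segmentation result shares the file stem
    of its source image, e.g. frame_0001.jpg -> frame_0001.png.
    """
    image_stems = {p.stem for p in Path(image_dir).glob("*" + image_ext)}
    seg_stems = {p.stem for p in Path(seg_dir).glob("*" + seg_ext)}
    return sorted(image_stems - seg_stems)
```

An empty list means every image found in `image_dir` has a matching file in `seg_dir`.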

python gccphat.py
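
For readers unfamiliar with GCC-PHAT (generalized cross-correlation with phase transform), the following is a minimal NumPy sketch of the standard technique for a single microphone pair. It is not the contents of `gccphat.py` in this repository, and the function name, parameters, and the small stabilizing constant are assumptions chosen for illustration.

```python
import numpy as np


def gcc_phat(sig, ref, fs=1, max_tau=None):
    """Estimate the delay of `sig` relative to `ref` with GCC-PHAT.

    Returns (tau, cc): the estimated delay in seconds and the
    cross-correlation values for lags -max_shift..+max_shift.
    """
    # Zero-pad to the combined length so the correlation is linear, not circular.
    n = sig.shape[0] + ref.shape[0]
    SIG = np.fft.rfft(sig, n=n)
    REF = np.fft.rfft(ref, n=n)

    # Cross-power spectrum, whitened so only phase information remains (PHAT).
    R = SIG * np.conj(REF)
    R /= np.abs(R) + 1e-12  # small constant avoids division by zero

    cc = np.fft.irfft(R, n=n)

    # Optionally restrict the search to physically plausible delays.
    max_shift = n // 2
    if max_tau is not None:
        max_shift = max(1, min(int(fs * max_tau), max_shift))

    # Reorder so that index `max_shift` corresponds to zero lag.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = np.argmax(np.abs(cc)) - max_shift
    return shift / float(fs), cc


if __name__ == "__main__":
    # Synthetic check: white noise delayed by 5 samples.
    rng = np.random.RandomState(0)
    ref = rng.randn(1024)
    sig = np.concatenate((np.zeros(5), ref[:-5]))
    tau, _ = gcc_phat(sig, ref, fs=16000)
    print("estimated delay in samples:", tau * 16000)
```

On the synthetic signal in the usage example the peak should land at a lag of 5 samples. For a real microphone array, `max_tau` would typically be set to the microphone spacing divided by the speed of sound.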

Training and Evaluation

python train.py
