Run fast-asd locally on GPU or CPU (it automatically checks whether your GPU supports CUDA), and use it as a Python library.
Example usage:

```python
file = "path/to/your/file"
videotracker = VideoTalkingTracker()
data = videotracker.process(file)
```
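The GPU check mentioned above typically reduces to a single PyTorch query; here is a minimal sketch (whether `VideoTalkingTracker` does exactly this internally is an assumption):

```python
import torch

# Prefer CUDA when a compatible GPU is present, otherwise fall back to CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Running on {device}")
```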
This repository is an optimized, production-ready implementation of active speaker detection. Read more about the research area here.
It consists of two parts:
- The open-source implementation of the active speaker detection application that runs on the Sieve platform.
- The standalone, optimized implementation of TalkNet, a leading model for active speaker detection.
The TalkNet implementation significantly improves on the original, primarily in performance: the pre-processing and post-processing steps are faster, and it supports variable frame-rate videos (not just 25 FPS like the original). The active speaker detection implementation is a further productionized version that parallelizes processing across TalkNet and a separate standalone face detection model to deliver faster, higher-quality speaker tracking and detection results.
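For instance, probing a video's native frame rate (rather than assuming 25 FPS) is straightforward with OpenCV; this is a minimal sketch, not the repository's actual pre-processing code:

```python
import cv2

# Read the native frame rate and frame count instead of hard-coding 25 FPS.
cap = cv2.VideoCapture("path/to/your/video.mp4")
fps = cap.get(cv2.CAP_PROP_FPS)
frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
cap.release()

print(f"{fps:.2f} FPS, {frames} frames")
# Downstream audio-visual alignment can use `fps` to map frame indices
# to timestamps on any frame-rate grid.
```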
If you plan to just use the standalone implementation of TalkNet, follow the steps below:
- Go to the `talknet` directory.
- Run `pip install -r requirements.txt`.
- Run `python main.py`.

You can change the input video file being used by modifying the `main` function in `main.py`, as sketched below.
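As a rough illustration of that change (the entry point below is a hypothetical sketch; the real `main` in `main.py` may take different arguments):

```python
# Hypothetical shape of talknet/main.py's entry point -- check the
# actual file for the real signature and defaults.
def main(video_path: str = "demo.mp4") -> None:
    print(f"Running TalkNet on {video_path}")  # placeholder for the real pipeline

if __name__ == "__main__":
    main(video_path="path/to/your/video.mp4")  # swap in your own file here
```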
The easiest way to run active speaker detection is to use the version already deployed on the Sieve platform available here.
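If you use the hosted version, calling it from Python with Sieve's client looks roughly like the sketch below (the function slug is a placeholder; take the exact name from the deployment page linked above):

```python
import sieve

# Placeholder slug -- copy the exact function name from the Sieve dashboard.
asd = sieve.function.get("sieve/active_speaker_detection")

# Run the hosted app on a local video file and print the result.
output = asd.run(sieve.File(path="path/to/your/video.mp4"))
print(output)
```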
While the core application can be run locally, it still calls public functions available on Sieve, such as the YOLO object detection model, so you will need to sign up for a free account and get an API key. You can do so here.
After you've signed up and run `sieve login`, you can run `main.py` from the root directory of this repository to start the active speaker detection application.
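Putting the whole flow together (the `sievedata` package name is an assumption for Sieve's Python client; adjust if the docs say otherwise):

```sh
pip install sievedata   # Sieve Python client + CLI (assumed package name)
sieve login             # paste your API key when prompted
python main.py          # start the active speaker detection app
```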