Face2Voice

This project was created as part of the course 11785 Introduction to Deep Learning at Carnegie Mellon University.

Abstract

Previous studies have demonstrated a significant statistical correlation between a person's facial structure and their voice. This correlation is attributed to both direct and indirect factors. Directly, the skeletal structure of the face determines the acoustic properties of the vocal tract responsible for voice production. Indirectly, environmental factors that affect facial development may also affect the voice. Furthermore, demographic factors such as age, gender, and ethnicity are also shown to have an impact on both facial structure and voice.

Building on this research, our project proposes algorithms to generate a person's voice from their facial imagery. This requires learning the relationship between faces and voices. To do so, we aim to extract the facial features that affect the voice and map them to their corresponding effects on voice quality (pitch, timbre). We then generate a candidate voice. Further, we propose to apply style transfer to the generated voice so that it imitates a given speaker's style of speaking (pauses, accent) while keeping the voice of the person whose voice we want to recreate.
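As a rough illustration of the "map facial features to voice quality" step, the sketch below uses a simple nearest-neighbour lookup in place of a learned model. It is not the project's method: the function `predict_voice_attributes`, the two-dimensional face features, the "timbre score", and all numeric values are hypothetical and exist only to make the idea concrete.

```python
import math

# Hypothetical toy data: (face feature vector, (mean pitch in Hz, timbre score)).
# The feature meanings and every value here are invented for illustration only.
TRAINING_PAIRS = [
    ((0.2, 0.9), (210.0, 0.7)),
    ((0.3, 0.8), (200.0, 0.6)),
    ((0.9, 0.1), (110.0, 0.3)),
    ((0.8, 0.2), (120.0, 0.4)),
]


def predict_voice_attributes(face_features, pairs=TRAINING_PAIRS, k=2):
    """Average the voice attributes of the k faces closest in Euclidean distance."""
    nearest = sorted(pairs, key=lambda p: math.dist(face_features, p[0]))[:k]
    n_attrs = len(nearest[0][1])
    return tuple(
        sum(voice[i] for _, voice in nearest) / len(nearest)
        for i in range(n_attrs)
    )


# Predict pitch and timbre for an unseen (hypothetical) face feature vector.
pitch, timbre = predict_voice_attributes((0.25, 0.85))
print(f"predicted pitch: {pitch:.1f} Hz, timbre score: {timbre:.2f}")
```

In practice the lookup would be replaced by a trained network operating on learned face embeddings, and the predicted attributes would condition a speech synthesizer rather than being printed.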

HariniS2506/Face2Voice

Folders and files

Latest commit

History

Repository files navigation

Face2Voice

Abstract

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 4

Languages

Packages