This was created as a part of the course - 11785 Introduction to Deep Learning at Carnegie Mellon University

HariniS2506/Face2Voice

Face2Voice

Abstract

Previous studies have demonstrated a significant statistical correlation between a person's facial structure and their voice. This correlation is attributed to both direct and indirect factors. Directly, the skeletal structure of the face determines the acoustic properties of the vocal tract, which is responsible for voice production. Indirectly, environmental factors that affect facial development may also affect the voice. Furthermore, demographic factors such as age, gender, and ethnicity have been shown to affect both facial structure and voice.

Building on this research, our project proposes algorithms that generate a person's voice from images of their face. The relationship between faces and voices is not given in advance, however, and must be learned. To do this, we aim to extract the facial features that influence the voice and map them to their corresponding effects on voice quality (pitch, timbre). We then generate a candidate voice. Further, we propose applying style transfer to the generated voice so that it imitates the style of speaking (pauses, accent) of the person in question while remaining in the voice of the person whose voice we want to recreate.
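The first three steps above (extract facial features, map them to voice qualities, generate a voice) can be sketched as a toy pipeline. This is a minimal illustration under stated assumptions, not the project's model: the names `VoiceParams`, `map_face_to_voice`, and `synthesize` are hypothetical, a fixed linear map stands in for whatever learned mapping the project uses, timbre is reduced to a set of harmonic weights, and the "voice" is just a sum of sine waves. Style transfer is not covered.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class VoiceParams:
    """Hypothetical voice description: a pitch and a crude timbre proxy."""
    pitch_hz: float
    harmonic_weights: List[float]  # relative strength of each harmonic, sums to 1


def map_face_to_voice(
    features: Sequence[float],
    pitch_weights: Sequence[float],
    pitch_bias: float,
    timbre_weights: Sequence[Sequence[float]],
) -> VoiceParams:
    """Map a facial feature vector to voice parameters.

    Assumption: a plain linear map is used here as a placeholder for a
    learned model; the weights would normally come from training.
    """
    if len(features) != len(pitch_weights):
        raise ValueError("features and pitch_weights must have the same length")
    pitch = pitch_bias + sum(w * x for w, x in zip(pitch_weights, features))

    # One raw score per harmonic, then a softmax so the weights are
    # positive and sum to 1.
    raw = [sum(w * x for w, x in zip(row, features)) for row in timbre_weights]
    peak = max(raw)
    exps = [math.exp(r - peak) for r in raw]
    total = sum(exps)
    return VoiceParams(pitch_hz=pitch, harmonic_weights=[e / total for e in exps])


def synthesize(
    params: VoiceParams, duration_s: float = 0.01, sample_rate: int = 16000
) -> List[float]:
    """Render a short tone as a weighted sum of harmonics of the pitch."""
    n_samples = round(duration_s * sample_rate)
    samples = []
    for i in range(n_samples):
        t = i / sample_rate
        samples.append(
            sum(
                w * math.sin(2 * math.pi * params.pitch_hz * (k + 1) * t)
                for k, w in enumerate(params.harmonic_weights)
            )
        )
    return samples


# Usage with made-up numbers: two facial features, two harmonics.
example_params = map_face_to_voice(
    features=[1.0, 2.0],
    pitch_weights=[10.0, 5.0],
    pitch_bias=100.0,
    timbre_weights=[[0.0, 0.0], [0.0, 0.0]],
)
example_wave = synthesize(example_params)
```

Keeping the mapping step separate from the synthesis step mirrors the staged description in the abstract: the mapping could be replaced by a trained network without touching the code that turns voice parameters into audio.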
