Italian Lipsync #54

Open
wants to merge 4 commits into
base: main

Conversation

lupettohf

Italian Lipsync preprocessor :D

Italian Lipsync preprocessor
@met4citizen
Owner

Great. Thank you for sharing!

I tried the class in my test environment, but I couldn't get it to work. It always ended up in an infinite loop. I also tried calling the methods directly, for example, wordsToVisemes("Cappello"), and it resulted in an infinite loop as well.

I think the reason for this is that for each letter, the last rule should be the letter itself and the most common viseme. For example, for "A", the last rule should be "[A]=aa" and so on. Without this default rule, the process can start to repeat itself, resulting in an infinite loop. Since the rules are applied in order, exceptions to the most common visemes should be defined first.
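
To make the ordering concrete, here is a minimal sketch of a rule matcher. It is not the module's actual code: `parseRule` and `toVisemes` are hypothetical names, the rule layout (an object mapping each letter to an ordered list of `left[match]right=visemes` strings) is an assumption, and only literal context letters are supported. Unlike a loop that stalls, this sketch throws as soon as a letter has no matching rule, which makes a missing default rule easy to spot.

```javascript
// Parse "left[match]right=visemes" into its parts.
// Assumption: contexts are literal letters only (no special symbols).
function parseRule(rule) {
  const m = /^([^\[]*)\[([^\]]+)\]([^=]*)=(.*)$/.exec(rule);
  if (!m) throw new Error("Malformed rule: " + rule);
  const rhs = m[4].trim();
  return {
    left: m[1],
    match: m[2],
    right: m[3],
    visemes: rhs === "" ? [] : rhs.split(/\s+/)
  };
}

// Walk the word left to right; the first matching rule for the current
// letter wins, so exceptions must be listed before the default rule.
function toVisemes(word, rulesByLetter) {
  const w = word.toUpperCase();
  const out = [];
  let i = 0;
  while (i < w.length) {
    const rules = (rulesByLetter[w[i]] || []).map(parseRule);
    const hit = rules.find(r =>
      w.startsWith(r.match, i) &&
      w.slice(0, i).endsWith(r.left) &&
      w.startsWith(r.right, i + r.match.length));
    if (!hit) {
      // Without a default rule such as "[A]=aa" the pointer cannot advance.
      throw new Error("No rule matches '" + w[i] + "' at index " + i);
    }
    out.push(...hit.visemes);
    i += hit.match.length; // always advances by at least one letter
  }
  return out;
}

// Illustrative rules only: exceptions first, default rule last.
const exampleRules = {
  C: ["[C]E=CH", "[C]I=CH", "[C]=kk"],
  A: ["[A]=aa"]
};

console.log(toVisemes("ca", exampleRules));
```

With these illustrative rules, a word containing a letter that has no rules at all (for example "ci", because `I` is not defined) raises an error instead of hanging.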

As this problem occurs almost every time, I encourage you to ensure that in your setup, you are actually using this lip-sync module and not one of the existing modules.

Additionally, in the rules, there are now a lot of viseme names such as bb, v, j etc. that are not valid Oculus OVR viseme codes. Please check the valid codes in README's Appendix C.

I don't speak Italian, and I know very little about Italian phonology, but if you have time to work on these issues, I can help you test the module. Italian spelling is largely phonemic, and in that regard, it resembles Finnish.

Now properly terminates the loop.
@lupettohf
Author

Thanks for the clarification. Indeed, I was using the default en processor. Here is a test with the new one:

IT preprocessor:
[screenshot: cappello-it]

@met4citizen
Owner

That looks promising, and I also got the class working in my test environment. Here is a short (unlisted) video clip I recorded: https://youtu.be/fw17X7cmvx8

The lip-sync accuracy is not too bad, but the rules still need some adjustment. This is often the trickiest part. The rules don't have to be perfect, of course, just enough to maintain the illusion.

First, you should check that all those right hand side visemes in your rules are indeed valid Oculus visemes, which are: 'aa', 'E', 'I', 'O', 'U', 'PP', 'SS', 'TH', 'CH', 'FF', 'kk', 'nn', 'RR', 'DD', 'sil'.
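
A quick way to run this check is sketched below. It assumes the rules are stored as an object mapping each letter to a list of `...=visemes` strings; that layout and the name `findInvalidVisemes` are assumptions for illustration, not the module's API. The comparison is case-sensitive, since the valid codes mix upper and lower case.

```javascript
// The valid Oculus viseme codes listed above.
const VALID_VISEMES = new Set([
  "aa", "E", "I", "O", "U", "PP", "SS", "TH",
  "CH", "FF", "kk", "nn", "RR", "DD", "sil"
]);

// Return every right-hand-side viseme that is not a valid code.
// Assumption: each rule is a string whose visemes follow the first "=".
function findInvalidVisemes(rulesByLetter) {
  const problems = [];
  for (const [letter, rules] of Object.entries(rulesByLetter)) {
    for (const rule of rules) {
      const eq = rule.indexOf("=");
      if (eq === -1) {
        problems.push({ letter, rule, viseme: null }); // no "=" at all
        continue;
      }
      const rhs = rule.slice(eq + 1).trim();
      const visemes = rhs === "" ? [] : rhs.split(/\s+/);
      for (const v of visemes) {
        if (!VALID_VISEMES.has(v)) problems.push({ letter, rule, viseme: v });
      }
    }
  }
  return problems;
}

// Illustrative input using the invalid names mentioned earlier (bb, v).
const sampleRules = { A: ["[A]=aa"], B: ["[B]=bb"], V: ["[V]=v"] };
console.log(findInvalidVisemes(sampleRules));
```

An empty result means every rule only emits valid codes.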

One issue that I noticed in your screenshot is the duration your rule set gives to the viseme aa. It seems too long. The reason is most likely the rule "[C]A=kk aa". This is what probably happens: At first, the pointer is at the first letter C (Cappello, the capital letter indicating where the pointer is). As this matches the rule's left-hand side (CA), everything inside the square brackets (C) is skipped and replaced with the right-hand side visemes (kk aa), and the pointer is moved to the remaining part (cAppello). During the next iteration, the A is converted to viseme aa. This means that the CA actually becomes kk aa aa. Double visemes get combined in the code, and the result is one long viseme aa, which is, I believe, not right. This same issue applies to some other rules, too. If you review those cases, you can actually remove a lot of the rules and at the same time improve the lip-sync accuracy.
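
The effect can be reproduced with a small sketch. Everything here is simplified and hypothetical: rules are plain objects with a bracketed `match`, a literal `right` context, and output `visemes`; each viseme is given a unit duration; and `applyRules` and `mergeRepeats` are illustrative names rather than the module's functions.

```javascript
// Apply the first matching rule at each position. Only the bracketed
// part (match) is consumed; the right context stays in the input.
function applyRules(word, rules) {
  const w = word.toUpperCase();
  const out = [];
  let i = 0;
  while (i < w.length) {
    const hit = rules.find(r =>
      w.startsWith(r.match, i) &&
      w.startsWith(r.right, i + r.match.length));
    if (!hit) throw new Error("No rule for '" + w[i] + "'");
    out.push(...hit.visemes);
    i += hit.match.length;
  }
  return out;
}

// Combine consecutive identical visemes into one longer viseme.
// Assumption: every viseme starts with a duration of 1 unit.
function mergeRepeats(visemes) {
  const merged = [];
  for (const v of visemes) {
    const last = merged[merged.length - 1];
    if (last && last.viseme === v) last.duration += 1;
    else merged.push({ viseme: v, duration: 1 });
  }
  return merged;
}

const defaultA = { match: "A", right: "", visemes: ["aa"] };

// "[C]A=kk aa": emits aa even though A is not consumed.
const doubledRules = [{ match: "C", right: "A", visemes: ["kk", "aa"] }, defaultA];

// "[C]=kk": emits only the viseme for the consumed letter.
const singleRules = [{ match: "C", right: "", visemes: ["kk"] }, defaultA];

console.log(mergeRepeats(applyRules("ca", doubledRules)));
console.log(mergeRepeats(applyRules("ca", singleRules)));
```

In the first case the raw sequence is kk aa aa, which merges into an aa of double length; in the second, A is converted exactly once.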

The rule logic might seem a bit confusing at first, but you can always refer to the original 1976 US Navy paper.

A good way to verify your rule set is to download some open-source Italian phoneme dictionary, convert its phonemes to visemes, and then run each word through your class and compare the results. This would give you a percentage indicating how accurate your rule set is.
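
A rough sketch of that comparison follows. The dictionary entry shape (`{ word, phonemes }`), the phoneme symbols, the phoneme-to-viseme table, and the function name `ruleSetAccuracy` are all assumptions for illustration; a real dictionary will need its own parsing and mapping, and `wordToVisemes` stands in for whatever function wraps your class.

```javascript
// Compare rule-based visemes against dictionary-derived visemes.
// entries: [{ word: string, phonemes: string[] }] (assumed shape)
// phonemeToViseme: object mapping each phoneme symbol to a viseme code
// wordToVisemes: function(word) -> string[] produced by the rule set
function ruleSetAccuracy(entries, phonemeToViseme, wordToVisemes) {
  let matches = 0;
  const mismatches = [];
  for (const { word, phonemes } of entries) {
    const expected = phonemes.map(p => {
      if (!(p in phonemeToViseme)) throw new Error("Unmapped phoneme: " + p);
      return phonemeToViseme[p];
    }).join(" ");
    const actual = wordToVisemes(word).join(" ");
    if (expected === actual) matches += 1;
    else mismatches.push({ word, expected, actual });
  }
  const percent = entries.length ? (100 * matches) / entries.length : 0;
  return { percent, mismatches };
}

// Toy data only; the stub rule set is deliberately wrong for "ac".
const toyEntries = [
  { word: "ca", phonemes: ["k", "a"] },
  { word: "ac", phonemes: ["a", "k"] }
];
const toyMap = { k: "kk", a: "aa" };
const stubRuleSet = word => (word === "ca" ? ["kk", "aa"] : ["aa", "aa"]);

const report = ruleSetAccuracy(toyEntries, toyMap, stubRuleSet);
console.log(report.percent, report.mismatches);
```

The list of mismatches is usually more useful than the percentage itself, because it points directly at the rules that need adjusting.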

@lupettohf
Author

Yesterday I did some live tests and they were acceptable, but yes, some words work better than others. I took a look at the demo you made, and the lip-sync in that case was almost spot on. I'll keep working on it.


2 participants