Heuristics

Santiago Barreda edited this page May 13, 2021 · 4 revisions

Heuristics have been added to represent general tendencies that good formant tracks follow. The heuristics can greatly increase the accuracy of the tracker. However, they can also make the analysis incorrect if used in a situation where they shouldn't be!

Heuristics are implemented in a very primitive manner: if a rule is broken, a penalty of 10000 is added to the error for the track. This takes the analysis out of contention unless all analyses are penalized, and makes it easy to see which analyses were penalized for violating a heuristic.
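
As a rough illustration of the penalty idea, here is a minimal Python sketch (not the actual Fast Track code). The function names and the example errors are hypothetical, and it assumes the penalty is added once per broken rule and that unpenalized errors are far below 10000.

```python
PENALTY = 10000.0  # penalty value described above


def penalized_error(error, n_violations):
    """Error for a track after adding the penalty.

    Assumption: the penalty is added once for each broken rule.
    """
    return error + PENALTY * n_violations


def pick_winner(candidates):
    """Return the name of the analysis with the lowest penalized error.

    candidates maps a (hypothetical) analysis name to a tuple
    (error, number_of_broken_rules).
    """
    return min(candidates, key=lambda name: penalized_error(*candidates[name]))


def was_penalized(total_error):
    """True if a total error is large enough that a penalty must be in it.

    Assumption: unpenalized errors are always well below PENALTY.
    """
    return total_error >= PENALTY


# Hypothetical example: "a" has the lowest raw error but breaks one rule.
candidates = {"a": (0.5, 1), "b": (0.8, 0), "c": (0.6, 0)}
winner = pick_winner(candidates)
```

In this example "a" drops out of contention and "c" wins. If every candidate breaks a rule, the penalties cancel out in the comparison and the lowest raw error wins again.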

The heuristics are:

  • maximum F1 frequency: Median F1 frequency should not be higher than 1200 Hz. In my experience it is unusual for F1 frequencies to be higher than this, unless the speaker is very small and/or producing an exceptionally 'open' vowel. This limit can be changed as necessary.

  • maximum F1 bandwidth: median F1 bandwidth should not be higher than 500 Hz. In good recording conditions most F1 values will have pretty narrow bandwidths. This heuristic may cause problems with noisy data and should probably be set relative to the bandwidths you observe in your own data from measurements that you trust.

  • maximum F3 bandwidth: median F3 bandwidth should not be higher than 600 Hz. Same caveats as the F1 bandwidth heuristic.

  • minimum F4 frequency: It is extremely unusual to have an average F4 below 2900 Hz. Even Scottie Pippen's F4 is higher than this (listen to his voice).

  • rhotic heuristic: if F3 < 2000 Hz, F1 and F2 should be at least 500 Hz apart. A lot of tracking errors feature a very low F3 and a very low F2. However, in most real rhotics F2 tends to be nearer to F3 than it is to F1. So, this heuristic only accepts low F3 vowels when F1 and F2 are separated by a reasonable amount.

  • minimum F3-F4 difference: if (F4 - F3) < 500 Hz, F1 and F2 should be at least 1500 Hz apart. When F3 and F4 are really close together, this will usually be a high front vowel. This requires that F1 and F2 be reasonably separated.
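
To make the rules above concrete, here is a minimal Python sketch of how they could be checked for one candidate track. It is an illustration only, not the Fast Track implementation. The function name and arguments are hypothetical, the inputs are assumed to be one summary value per formant (for example the median), and bandwidth limits are passed in by the caller rather than hard-coded.

```python
def broken_heuristics(f1, f2, f3, f4, bandwidths=None, bandwidth_limits=None,
                      max_f1=1200.0, min_f4=2900.0):
    """Return the names of the heuristics that a track violates.

    f1..f4: summary frequencies in Hz for the track (assumed: medians).
    bandwidths: optional dict, formant number -> summary bandwidth in Hz.
    bandwidth_limits: optional dict, formant number -> maximum allowed bandwidth.
    """
    broken = []

    if f1 > max_f1:
        broken.append("maximum F1 frequency")

    if f4 < min_f4:
        broken.append("minimum F4 frequency")

    # A low F3 is only accepted when F1 and F2 are at least 500 Hz apart.
    if f3 < 2000.0 and (f2 - f1) < 500.0:
        broken.append("rhotic heuristic")

    # A close F3 and F4 is only accepted when F1 and F2 are at least 1500 Hz apart.
    if (f4 - f3) < 500.0 and (f2 - f1) < 1500.0:
        broken.append("minimum F3-F4 difference")

    # Bandwidth rules: only formants that have both a value and a limit are checked.
    for formant, limit in (bandwidth_limits or {}).items():
        value = (bandwidths or {}).get(formant)
        if value is not None and value > limit:
            broken.append(f"maximum F{formant} bandwidth")

    return broken


# Hypothetical values: a high front vowel with F3 and F4 close together (passes),
# and a typical tracking error with a very low F2 and F3 (fails the rhotic rule).
high_front = broken_heuristics(300.0, 2300.0, 3000.0, 3400.0)
tracking_error = broken_heuristics(400.0, 700.0, 1500.0, 3500.0)
```

The number of names returned could then be used to decide how much penalty a track receives. Keeping the thresholds as arguments reflects the advice above that the limits should be adjusted to suit your own data.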