# TITLE OF THE SCRIPT: ACOUSTIC VOICE QUALITY INDEX (AVQI) v.02.03
# Form for introduction and/or parameterization
form Acoustic Voice Quality Index v.02.03
comment >>> It is advocated to estimate someone's dysphonia severity in both
comment continuous speech (i.e., 'cs') and sustained vowel (i.e., 'sv') (Maryn et al.,
comment 2010). This script therefore runs on these two types of recordings, and it
comment is important to name these recordings 'cs' and 'sv', respectively.
comment >>> This script automatically (a) searches, extracts and then concatenates
comment the voiced segments of the continuous speech recording to a new sound, (b)
comment concatenates the sustained vowel recording to the new sound, (c) determines
comment the Smoothed Cepstral Peak Prominence, the Shimmer Local, the Shimmer
comment Local dB, the LTAS-slope, the LTAS-tilt and the Harmonics-to-Noise Ratio of
comment the concatenated sound signal, (d) calculates the AVQI-score mostly based
comment on the method of Maryn et al. (2010), and (e) draws the oscillogram, the
comment narrow-band spectrogram with LTAS and the power-cepstrogram with
comment power-cepstrum of the concatenated sound signal to allow further interpretation.
comment >>> For the AVQI to be reliable, it is imperative that the sound recordings
comment are made under optimal data acquisition conditions.
comment >>> There are two versions in this script: (1) a simple version (only AVQI with
comment data of acoustic measures), and (2) an illustrated version (AVQI with data of
comment acoustic measures and above-mentioned graphs).
choice version: 2
button simple
button illustrated
comment >>> Additional information (optional):
sentence name_patient
sentence left_dates_(birth_-_assessment)
sentence right_dates_(birth_-_assessment)
comment
comment Script credits: Youri Maryn (PhD) and Paul Corthals (PhD)
endform
Erase all
Select inner viewport... 0.5 7.5 0.5 4.5
Axes... 0 1 0 1
Black
Text special... 0.5 centre 0.6 half Helvetica 12 0 Please wait an instant. Depending on the duration and/or the sample rate of the recorded
Text special... 0.5 centre 0.4 half Helvetica 12 0 sound files, this script takes more or less time to process the sound and search for the AVQI.
# --------------------------------------------------------------------------------------------
# PART 0:
# HIGH-PASS FILTERING OF THE SOUND FILES.
# --------------------------------------------------------------------------------------------
select Sound cs
Filter (stop Hann band)... 0 34 0.1
Rename... cs
select Sound sv
Filter (stop Hann band)... 0 34 0.1
Rename... sv
# --------------------------------------------------------------------------------------------
# PART 1:
# DETECTION, EXTRACTION AND CONCATENATION OF
# THE VOICED SEGMENTS IN THE RECORDING
# OF CONTINUOUS SPEECH.
# --------------------------------------------------------------------------------------------
select Sound cs
Copy... original
samplingRate = Get sampling frequency
intermediateSamples = Get sampling period
Create Sound... onlyVoice 0 0.001 'samplingRate' 0
select Sound original
To TextGrid (silences)... 50 0.003 -25 0.1 0.1 silence sounding
select Sound original
plus TextGrid original
Extract intervals where... 1 no "does not contain" silence
Concatenate
select Sound chain
Rename... onlyLoud
globalPower = Get power in air
select TextGrid original
Remove
select Sound onlyLoud
signalEnd = Get end time
windowBorderLeft = Get start time
# Analysis window of 30 ms, moved through the signal without overlap
windowWidth = 0.03
windowBorderRight = windowBorderLeft + windowWidth
globalPower = Get power in air
# A window counts as loud enough when its power exceeds 30% of the overall power
voicelessThreshold = globalPower*(30/100)
select Sound onlyLoud
extremeRight = signalEnd - windowWidth
while windowBorderRight < extremeRight
    Extract part... 'windowBorderLeft' 'windowBorderRight' Rectangular 1.0 no
    select Sound onlyLoud_part
    partialPower = Get power in air
    if partialPower > voicelessThreshold
        call checkZeros 0
        # Keep the window only when its zero-crossing rate is below 3000 per second
        if (zeroCrossingRate <> undefined) and (zeroCrossingRate < 3000)
            select Sound onlyVoice
            plus Sound onlyLoud_part
            Concatenate
            Rename... onlyVoiceNew
            select Sound onlyVoice
            Remove
            select Sound onlyVoiceNew
            Rename... onlyVoice
        endif
    endif
    select Sound onlyLoud_part
    Remove
    # Advance to the next window
    windowBorderLeft = windowBorderLeft + windowWidth
    windowBorderRight = windowBorderLeft + windowWidth
    select Sound onlyLoud
endwhile
select Sound onlyVoice
procedure checkZeros zeroCrossingRate
    # Estimates the zero-crossing rate (crossings per second) of the selected
    # 30 ms window, scanning from 0.0025 s up to 0.0275 s
    start = 0.0025
    startZero = Get nearest zero crossing... 'start'
    findStart = startZero
    findStartZeroPlusOne = startZero + intermediateSamples
    startZeroPlusOne = Get nearest zero crossing... 'findStartZeroPlusOne'
    zeroCrossings = 0
    strips = 0
    while (findStart < 0.0275) and (findStart <> undefined)
        # Step forward one sample at a time until the next zero crossing is found
        while startZeroPlusOne = findStart
            findStartZeroPlusOne = findStartZeroPlusOne + intermediateSamples
            startZeroPlusOne = Get nearest zero crossing... 'findStartZeroPlusOne'
        endwhile
        # afstand (Dutch for "distance"): time from the first to the latest zero crossing
        afstand = startZeroPlusOne - startZero
        strips = strips +1
        zeroCrossings = zeroCrossings +1
        findStart = startZeroPlusOne
    endwhile
    zeroCrossingRate = zeroCrossings/afstand
endproc
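# Illustration of the zero-crossing criterion (a rough sketch with idealised
# signals, offered as an example under stated assumptions and not as part of
# the original method): a pure 150 Hz sine wave crosses zero twice per period,
# i.e. about 300 times per second, so a window containing it would pass the
# "< 3000" test applied in PART 1. Noise-like segments such as voiceless
# fricatives typically cross zero far more often and would be rejected.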
# --------------------------------------------------------------------------------------------
# PART 2:
# DETERMINATION OF THE SIX ACOUSTIC MEASURES
# AND CALCULATION OF THE ACOUSTIC VOICE QUALITY INDEX.
# --------------------------------------------------------------------------------------------
select Sound sv
durationVowel = Get total duration
# Use only the final 3 seconds of the sustained vowel (or all of it if shorter)
durationStart=durationVowel-3
if durationVowel>3
    Extract part... durationStart durationVowel rectangular 1 no
    Rename... sv2
elsif durationVowel<=3
    Copy... sv2
endif
select Sound onlyVoice
durationOnlyVoice = Get total duration
plus Sound sv2
Concatenate
Rename... avqi
durationAll = Get total duration
minimumSPL = Get minimum... 0 0 None
maximumSPL = Get maximum... 0 0 None
# Narrow-band spectrogram and LTAS
To Spectrogram... 0.03 4000 0.002 20 Gaussian
select Sound avqi
To Ltas... 1
minimumSpectrum = Get minimum... 0 4000 None
maximumSpectrum = Get maximum... 0 4000 None
# Power-cepstrogram, Cepstral peak prominence and Smoothed cepstral peak prominence
select Sound avqi
To PowerCepstrogram... 60 0.002 5000 50
cpps = Get CPPS... no 0.01 0.001 60 330 0.05 Parabolic 0.001 0 Straight Robust
To PowerCepstrum (slice)... 0.1
maximumCepstrum = Get peak... 60 330 None
# Slope of the long-term average spectrum
select Sound avqi
To Ltas... 1
slope = Get slope... 0 1000 1000 10000 energy
# Tilt of trendline through the long-term average spectrum
select Ltas avqi
Compute trend line... 1 10000
tilt = Get slope... 0 1000 1000 10000 energy
# Amplitude perturbation measures
select Sound avqi
To PointProcess (periodic, cc)... 50 400
Rename... avqi1
select Sound avqi
plus PointProcess avqi1
# Get shimmer (local) returns a fraction; it is multiplied by 100 to obtain a percentage
percentShimmer = Get shimmer (local)... 0 0 0.0001 0.02 1.3 1.6
shim = percentShimmer*100
shdb = Get shimmer (local_dB)... 0 0 0.0001 0.02 1.3 1.6
# Harmonic-to-noise ratio
select Sound avqi
To Pitch (cc)... 0 75 15 no 0.03 0.45 0.01 0.35 0.14 600
select Sound avqi
plus Pitch avqi
To PointProcess (cc)
Rename... avqi2
select Sound avqi
plus Pitch avqi
plus PointProcess avqi2
voiceReport$ = Voice report... 0 0 75 600 1.3 1.6 0.03 0.45
hnr = extractNumber (voiceReport$, "Mean harmonics-to-noise ratio: ")
# Calculation of the AVQI
avqi = ((3.295-(0.111*cpps)-(0.073*hnr)-(0.213*shim)+(2.789*shdb)-(0.032*slope)+(0.077*tilt))*2.208)+1.797
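# Worked example of the formula above, using made-up values that are
# assumptions for illustration only (not measurements from any recording):
#   cpps = 14.0, hnr = 20.0, shim = 3.0, shdb = 0.30, slope = -25.0, tilt = -11.0
#   3.295 - 1.554 - 1.460 - 0.639 + 0.837 + 0.800 - 0.847 = 0.432
#   avqi = (0.432 * 2.208) + 1.797 = 2.75 (approximately)
# With these values the arrow would fall in the green zone (below 2.91) of the
# scale drawn in PART 3.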
# --------------------------------------------------------------------------------------------
# PART 3:
# DRAWING ALL THE INFORMATION AND THE GRAPHS.
# --------------------------------------------------------------------------------------------
# Title and patient information
Erase all
Solid line
Line width... 1
Black
Helvetica
Select inner viewport... 0 8 0 0.5
Font size... 1
Select inner viewport... 0.5 7.5 0.1 0.15
Axes... 0 1 0 1
Text... 0 Left 0.5 Half Script: Youri Maryn (PhD) and Paul Corthals (PhD)
Font size... 12
Select inner viewport... 0.5 7.5 0 0.5
Axes... 0 1 0 1
Text... 0 Left 0.5 Half ##ACOUSTIC VOICE QUALITY INDEX (AVQI) v.02.03#
Font size... 8
Select inner viewport... 0.5 7.5 0 0.5
Axes... 0 1 0 3
Text... 1 Right 2.3 Half %%'name_patient$'%
Text... 1 Right 1.5 Half %%°'left_dates$'%
Text... 1 Right 0.7 Half %%'right_dates$'%
# Simple version
if version = 1
# Data
Font size... 10
Select inner viewport... 0.5 7.5 0.5 2
Axes... 0 7 6 0
Text... 0.05 Left 0.5 Half Smoothed cepstral peak prominence (CPPS): ##'cpps:2'#
Text... 0.05 Left 1.5 Half Harmonics-to-noise ratio: ##'hnr:2' dB#
Text... 0.05 Left 2.5 Half Shimmer local: ##'shim:2' \% #
Text... 0.05 Left 3.5 Half Shimmer local dB: ##'shdb:2' dB#
Text... 0.05 Left 4.5 Half Slope of LTAS: ##'slope:2' dB#
Text... 0.05 Left 5.5 Half Tilt of trendline through LTAS: ##'tilt:2' dB#
Select inner viewport... 0.5 3.8 0.5 2
Draw inner box
Font size... 7
Arrow size... 1
Select inner viewport... 4 7.5 1.25 2
Axes... 0 10 1 0
Paint rectangle... green 0 2.91 0 1
Paint rectangle... red 2.91 10 0 1
Draw arrow... avqi 1 avqi 0
Draw inner box
Marks top every... 1 1 yes yes no
Font size... 16
Select inner viewport... 4 7.5 0.5 1.15
Axes... 0 1 0 1
Text... 0.5 Centre 0.5 Half AVQI: ##'avqi:2'#
# Copy Praat picture
Select inner viewport... 0.5 7.5 0 2
Copy to clipboard
# Illustrated version
elsif version = 2
# Oscillogram
Font size... 7
Select inner viewport... 0.5 5 0.5 2.0
select Sound avqi
Draw... 0 0 0 0 no Curve
Draw inner box
One mark left... minimumSPL no yes no 'minimumSPL:2'
One mark left... maximumSPL no yes no 'maximumSPL:2'
Text left... no Sound pressure (Pa)
One mark bottom... 0 no yes no 0.00
One mark bottom... durationOnlyVoice no no yes
One mark bottom... durationAll no yes no 'durationAll:2'
Text bottom... no Time (s)
# Narrow-band spectrogram
Select inner viewport... 0.5 5 2.3 3.8
select Spectrogram avqi
Paint... 0 0 0 4000 100 yes 50 6 0 no
Draw inner box
One mark left... 0 no yes no 0
One mark left... 4000 no yes no 4000
Text left... no Frequency (Hz)
One mark bottom... 0 no yes no 0.00
One mark bottom... durationOnlyVoice no no yes
One mark bottom... durationAll no yes no 'durationAll:2'
Text bottom... no Time (s)
# LTAS
Select inner viewport... 5.4 7.5 2.3 3.8
select Ltas avqi
Draw... 0 4000 minimumSpectrum maximumSpectrum no Curve
Draw inner box
One mark left... minimumSpectrum no yes no 'minimumSpectrum:2'
One mark left... maximumSpectrum no yes no 'maximumSpectrum:2'
Text left... no Sound pressure level (dB/Hz)
One mark bottom... 0 no yes no 0
One mark bottom... 4000 no yes no 4000
Text bottom... no Frequency (Hz)
# Power-cepstrogram
Select inner viewport... 0.5 5 4.1 5.6
select PowerCepstrogram avqi
Paint: 0, 0, 0, 0, 80, "no", 30, 0, "yes"
Draw inner box
One mark left... 0.00303 no yes no 0.003
One mark left... 0.01667 no yes no 0.017
Text left... no Quefrency (s)
One mark bottom... 0 no yes no 0.00
One mark bottom... durationOnlyVoice no no yes
One mark bottom... durationAll no yes no 'durationAll:2'
Text bottom... no Time (s)
# Power-cepstrum
Select inner viewport... 5.4 7.5 4.1 5.6
select PowerCepstrum avqi_0_100
Draw... 0.00303 0.01667 0 0 no
Draw tilt line... 0.00303 0.01667 0 0 0.00303 0.01667 Straight Robust
Draw inner box
One mark left... maximumCepstrum no yes no 'maximumCepstrum:2'
Text left... no Amplitude (dB)
One mark bottom... 0.00303 no yes no 0.003
One mark bottom... 0.01667 no yes no 0.017
Text bottom... no Quefrency (s)
# Data
Font size... 10
Select inner viewport... 0.5 7.5 5.9 7.4
Axes... 0 7 6 0
Text... 0.05 Left 0.5 Half Smoothed cepstral peak prominence (CPPS): ##'cpps:2'#
Text... 0.05 Left 1.5 Half Harmonics-to-noise ratio: ##'hnr:2' dB#
Text... 0.05 Left 2.5 Half Shimmer local: ##'shim:2' \% #
Text... 0.05 Left 3.5 Half Shimmer local dB: ##'shdb:2' dB#
Text... 0.05 Left 4.5 Half Slope of LTAS: ##'slope:2' dB#
Text... 0.05 Left 5.5 Half Tilt of trendline through LTAS: ##'tilt:2' dB#
Select inner viewport... 0.5 3.8 5.9 7.4
Draw inner box
Font size... 7
Arrow size... 1
Select inner viewport... 4 7.5 6.75 7.4
Axes... 0 10 1 0
Paint rectangle... green 0 2.91 0 1
Paint rectangle... red 2.91 10 0 1
Draw arrow... avqi 1 avqi 0
Draw inner box
Marks top every... 1 1 yes yes no
Font size... 16
Select inner viewport... 4 7.5 5.9 6.65
Axes... 0 1 0 1
Text... 0.5 Centre 0.5 Half AVQI: ##'avqi:2'#
# Copy Praat picture
Select inner viewport... 0.5 7.5 0 7.4
Copy to clipboard
endif
# Remove intermediate objects
select all
minus Sound cs
minus Sound sv
Remove