Commit

Add historical credit in manuscript
suaefar committed Feb 23, 2021
1 parent e86dd18 commit 2b3fdcb
Showing 1 changed file with 26 additions and 0 deletions.
26 changes: 26 additions & 0 deletions manuscript/ms.tex
@@ -490,6 +490,22 @@ \subsection*{PLATT dynamic range manipulation}
%
Also, fewer frequency channels reduce the need for sharp filters which would require long integration time constants and introduce additional latency.

\cite{bustamante1987} proposed compressing the first two principal components (PC1 and PC2) of the short-term speech spectrum, which were roughly representative of overall level and spectral tilt.
%
With this approach, the frequency bands were no longer processed independently, and the finer spectral structure was always preserved.
%
Their analysis indicated that the highest intelligibility was obtained when audibility was improved and the relative spectral shapes of different speech sounds were preserved \citep{bustamante1987}.
%
In their concluding section, they recommended investigating the enhancement of spectral differences while compressing level variations.
%
\cite{levitt1991} proposed an approach which decomposes and manipulates the short-term spectrum using a set of orthogonal polynomial functions with the aim of preserving important speech cues.
%
Referring to the study of \cite{bustamante1987}, \cite{levitt1991} wrote: \emph{\enquote{Both studies showed that compression of the lowest order component (factor 1 in the principal-components method and the constant term in the orthogonal polynomial method, respectively) had by far the largest effect, and that compression of higher order components had little effect, if any.}}
%
The common idea behind these two studies was to linearly map and manipulate the spectral dimension of a suitable spectro-temporal representation with the aim of separating the important from the less important parts of the speech signal dynamic.
%
However, both studies considered only clean speech signals, and hence did not address which portions of the speech signal are relevant for its recognition in noise.
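%
As a minimal sketch of the linear mapping shared by both approaches (the notation is ours and purely illustrative, not taken from the cited studies), let $\mathbf{s}(t)$ denote the vector of short-term band levels in dB at time frame $t$, $\bar{\mathbf{s}}$ its long-term mean, and $\mathbf{v}_k$ a set of basis vectors that we assume to be orthonormal (principal components or sampled orthogonal polynomials).
%
With component weights $c_k(t) = \mathbf{v}_k^{\top}\left(\mathbf{s}(t) - \bar{\mathbf{s}}\right)$, compressing only the $K$ lowest-order components with assumed compression ratios $r_k \geq 1$ gives
\[
\hat{\mathbf{s}}(t) = \bar{\mathbf{s}} + \sum_{k \leq K} \frac{c_k(t)}{r_k}\,\mathbf{v}_k + \sum_{k > K} c_k(t)\,\mathbf{v}_k ,
\]
so that the components with $k > K$, which carry the finer spectral structure, pass unchanged.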

That the signal dynamic can be described as the difference of frequency-dependent short-term effective amplitudes, e.g., across time (temporal dynamic), across frequency (spectral dynamic), or both (spectro-temporal dynamic) raises the question of which representation is most suitable to manipulate it.
%
ASR systems are \emph{the} technical solution to decode speech signals and hence provide a model for speech recognition.
@@ -1634,6 +1650,11 @@ \section*{Conclusions}
\newblock Standard audiograms for the IEC 60118-15 measurement procedure.
\newblock {\em Trends in amplification}, 14(2):113--120, \url{https://doi.org/10.1177%2F1084713810379609}

\bibitem[Bustamante and Braida, 1987]{bustamante1987}
Bustamante, D.~K. and Braida, L.~D. (1987)
\newblock Principal-component amplitude compression for the hearing impaired.
\newblock {\em The Journal of the Acoustical Society of America}, 82(4):1227--1242, \url{https://doi.org/10.1121/1.395259}

\bibitem[Dreschler, 1992]{dreschler1992}
Dreschler, W.~A. (1992)
\newblock Fitting multichannel-compression hearing aids.
@@ -1689,6 +1710,11 @@ \section*{Conclusions}
\newblock Sentence recognition prediction for hearing-impaired listeners in stationary and fluctuation noise with fade: Empowering the attenuation and distortion concept by Plomp with a quantitative processing model.
\newblock {\em Trends in Hearing}, 20, \url{https://doi.org/10.1177%2F2331216516655795}

\bibitem[Levitt and Neuman, 1991]{levitt1991}
Levitt, H. and Neuman, A.~C. (1991)
\newblock Evaluation of orthogonal polynomial compression.
\newblock {\em The Journal of the Acoustical Society of America}, 90(1):241--252, \url{https://doi.org/10.1121/1.401294}

\bibitem[Moore et~al., 1999]{moore1999}
Moore, B.~C.~J., Peters, R.~W., and Stone, M.~A. (1999)
\newblock Benefits of linear amplification and multichannel compression for speech comprehension in backgrounds with spectral and temporal dips.