Sivasurya Santhanam edited this page Oct 31, 2016 · 1 revision

asrEvaluationToolkit

The main aim of this project is to build a user-friendly automatic speech recognition evaluation system that can evaluate multiple speech recognition engines on a given dataset.

Main window

The main window lets the user choose between two models:

  • Recognize & Evaluate

  • Performance calculator

A description of each model is also provided.

[Figure: Main window]

Recognize & Evaluate

The Recognize & Evaluate component runs each speech recognition system on the audio files in the speech database and then evaluates the recognition output against the reference output. To perform a complete recognition and evaluation run, this component requires each speech recognition system’s SDK and its related models, along with a speech database consisting of audio files and their respective transcriptions. The configuration of the models depends entirely on the speech recognition system’s API. Once the recognition output is obtained, the recognized text is aligned with the reference text and the result is reported in terms of performance metrics.

[Figure: model1]
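The recognize-then-evaluate flow described above can be sketched roughly as follows. This is only an illustration, not the toolkit's actual code: the `Recognizer` and `Scorer` types, the `recognize_and_evaluate` function, and the idea of averaging a per-utterance score are all assumptions made for the example. A real engine would be wrapped around its own SDK call.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical types: an "engine" is any callable mapping an audio path to text,
# and a "scorer" maps (reference, hypothesis) to a number (lower is better).
Recognizer = Callable[[str], str]
Scorer = Callable[[str, str], float]


def recognize_and_evaluate(
    engines: Dict[str, Recognizer],
    dataset: List[Tuple[str, str]],
    score: Scorer,
) -> Dict[str, float]:
    """Run every engine over (audio_path, reference) pairs.

    Returns the mean per-utterance score for each engine name.
    """
    results: Dict[str, float] = {}
    for name, recognize in engines.items():
        scores = []
        for audio_path, reference in dataset:
            hypothesis = recognize(audio_path)  # engine-specific SDK call goes here
            scores.append(score(reference, hypothesis))
        results[name] = sum(scores) / len(scores) if scores else 0.0
    return results
```

Passing the engines in as plain callables keeps the evaluation loop independent of any particular SDK, which matches the point that model configuration depends on each system's API.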

Performance calculator

The Performance calculator is used when a speech recognition system’s output and the reference text for the recognized speech files are already available. This component compares the reference text with the hypothesis text through an alignment process and reports the result in terms of performance metrics. Its functionality is a subset of the Recognize & Evaluate component.

[Figure: model2]
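Before reference and hypothesis texts can be compared, they have to be matched utterance by utterance. The sketch below shows one way this pairing step might look. The transcript format (`<utterance_id> <text>` per line) and the function names `parse_transcripts` and `pair_transcripts` are assumptions for illustration; the toolkit's real input format is not described here.

```python
from typing import Dict, List, Tuple


def parse_transcripts(lines: List[str]) -> Dict[str, str]:
    """Parse lines of the assumed form '<utterance_id> <text>' into a dict."""
    transcripts: Dict[str, str] = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        utt_id, _, text = line.partition(" ")
        transcripts[utt_id] = text.strip()
    return transcripts


def pair_transcripts(
    reference: Dict[str, str], hypothesis: Dict[str, str]
) -> Tuple[List[Tuple[str, str, str]], List[str]]:
    """Pair reference and hypothesis text by utterance id.

    Returns (paired, missing): paired holds (id, reference, hypothesis)
    tuples; missing lists reference ids that have no hypothesis.
    """
    paired = [
        (utt_id, reference[utt_id], hypothesis[utt_id])
        for utt_id in sorted(reference)
        if utt_id in hypothesis
    ]
    missing = sorted(set(reference) - set(hypothesis))
    return paired, missing
```

Reporting the unmatched ids separately, rather than silently dropping them, makes it visible when an engine produced no output for part of the dataset.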

In both models, the output text is evaluated against the reference text using a Viterbi alignment algorithm, which penalizes substitutions, deletions and insertions of words relative to the reference text.
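As a rough illustration of such an alignment, the sketch below finds a minimum-cost word alignment by dynamic programming and counts the three error types along the best path. It is not the toolkit's implementation: the unit penalty for every error type, the function names `align_words` and `word_error_rate`, and the choice of word error rate as the reported metric are all assumptions.

```python
from typing import List, Tuple


def align_words(reference: List[str], hypothesis: List[str]) -> Tuple[int, int, int]:
    """Return (substitutions, deletions, insertions) of one minimum-cost alignment.

    Assumes a penalty of 1 for each error type.
    """
    n, m = len(reference), len(hypothesis)
    # cost[i][j] = minimum number of edits turning reference[:i] into hypothesis[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i  # delete every reference word
    for j in range(1, m + 1):
        cost[0][j] = j  # insert every hypothesis word
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diff = int(reference[i - 1] != hypothesis[j - 1])
            cost[i][j] = min(
                cost[i - 1][j - 1] + diff,  # match or substitution
                cost[i - 1][j] + 1,         # deletion
                cost[i][j - 1] + 1,         # insertion
            )

    # Trace the best path back from the end to count each error type.
    subs = dels = ins = 0
    i, j = n, m
    while i > 0 or j > 0:
        diff = int(i > 0 and j > 0 and reference[i - 1] != hypothesis[j - 1])
        if i > 0 and j > 0 and cost[i][j] == cost[i - 1][j - 1] + diff:
            subs += diff
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            dels += 1
            i -= 1
        else:
            ins += 1
            j -= 1
    return subs, dels, ins


def word_error_rate(reference: List[str], hypothesis: List[str]) -> float:
    """Word error rate: (S + D + I) / number of reference words."""
    if not reference:
        raise ValueError("reference must contain at least one word")
    subs, dels, ins = align_words(reference, hypothesis)
    return (subs + dels + ins) / len(reference)
```

When several alignments share the same minimum cost, the backtrace prefers the diagonal step, so the split between error types can vary with tie-breaking while the total stays the same.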
