
Releases: symblai/speech-recognition-evaluation

v1.1.0

22 Dec 01:33

This is a simple utility for quickly evaluating the results generated by any speech-to-text (STT) or automatic speech recognition (ASR) system.

The utility can calculate the following metrics:

  • Word Error Rate (WER), which is the most common metric for measuring the performance of a speech recognition or machine translation system.
  • Word Information Loss (WIL), which is a simple approximation to the proportion of word information lost. Refer to this paper for more info.
  • Levenshtein Distance calculated at the word level.
  • Number of word-level insertions, deletions, and mismatches between the original file and the generated file.
  • Number of phrase-level insertions, deletions, and mismatches between the original file and the generated file.
  • Color-highlighted text comparison to visualize the differences.
  • General statistics about the original and generated files (bytes, characters, words, new lines, etc.).
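
As a rough illustration of how the word-level metrics relate to each other, here is a minimal Python sketch. It is not the utility's own implementation; the function names `word_edit_counts` and `evaluate` are hypothetical, and "mismatches" are treated as substitutions. It assumes the usual definitions WER = (S + D + I) / N_ref and WIL = 1 - (H / N_ref) * (H / N_hyp), where H is the number of correctly matched words.

```python
def word_edit_counts(reference, hypothesis):
    """Return (distance, substitutions, deletions, insertions) at the word level."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] holds the cheapest (distance, subs, dels, ins) for turning
    # the first i reference words into the first j hypothesis words.
    dp = [[None] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = (i, 0, i, 0)  # delete every reference word
    for j in range(len(hyp) + 1):
        dp[0][j] = (j, 0, 0, j)  # insert every hypothesis word
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            if ref[i - 1] == hyp[j - 1]:
                dp[i][j] = dp[i - 1][j - 1]  # words match, no extra cost
                continue
            d, s, dl, ins = dp[i - 1][j - 1]
            substitution = (d + 1, s + 1, dl, ins)
            d, s, dl, ins = dp[i - 1][j]
            deletion = (d + 1, s, dl + 1, ins)
            d, s, dl, ins = dp[i][j - 1]
            insertion = (d + 1, s, dl, ins + 1)
            # Tuples compare by distance first, so the cheapest edit wins.
            dp[i][j] = min(substitution, deletion, insertion)
    return dp[len(ref)][len(hyp)]


def evaluate(reference, hypothesis):
    """Compute word-level Levenshtein distance, WER, and WIL (hypothetical helper)."""
    n_ref, n_hyp = len(reference.split()), len(hypothesis.split())
    if n_ref == 0 or n_hyp == 0:
        raise ValueError("both texts must contain at least one word")
    distance, subs, dels, ins = word_edit_counts(reference, hypothesis)
    hits = n_ref - subs - dels  # words recognized correctly
    return {
        "levenshtein": distance,
        "substitutions": subs,
        "deletions": dels,
        "insertions": ins,
        "wer": distance / n_ref,
        "wil": 1 - (hits / n_ref) * (hits / n_hyp),
    }


if __name__ == "__main__":
    print(evaluate("the cat sat on the mat", "the cat sit on mat"))
```

When several alignments share the same minimal distance, the individual insertion, deletion, and substitution counts depend on how ties are broken, so another tool may report a different split for the same WER.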

The utility also pre-processes (normalizes) the text in the provided files using the following operations:

  • Remove Speaker Name: Remove the Speaker's name at the beginning of the line.
  • Remove Annotations: Remove any custom annotations added during transcriptions.
  • Remove Whitespaces: Remove any extra whitespace.
  • Remove Quotes: Remove any double quotes.
  • Remove Dashes: Remove any dashes.
  • Remove Punctuations: Remove any punctuation (.,?!).
  • Convert contents to lower case.
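
The normalization steps above could be sketched in Python roughly as follows. This is an illustration only, not the utility's code: the function name `normalize` is hypothetical, and the sketch assumes that speaker names appear as a `Name:` prefix and that annotations are written in square brackets, neither of which is specified here.

```python
import re


def normalize(text, lowercase=True):
    """Apply the listed clean-up steps line by line (hypothetical helper)."""
    cleaned = []
    for line in text.splitlines():
        # Assumption: a speaker label looks like "Name:" at the start of a line.
        line = re.sub(r"^\s*[A-Za-z][\w .'-]{0,40}:\s*", "", line)
        # Assumption: annotations are enclosed in square brackets, e.g. [laughs].
        line = re.sub(r"\[[^\]]*\]", " ", line)
        line = line.replace('"', "")        # remove double quotes
        line = line.replace("-", " ")       # remove dashes
        line = re.sub(r"[.,?!]", "", line)  # remove punctuation (.,?!)
        line = re.sub(r"\s+", " ", line).strip()  # collapse extra whitespace
        if lowercase:
            line = line.lower()
        if line:
            cleaned.append(line)
    return "\n".join(cleaned)


if __name__ == "__main__":
    sample = 'Alice: "Hello,  world!" [laughs]\nBob: Well - yes.'
    print(normalize(sample))
```

Normalizing both the original and the generated file the same way before comparison keeps formatting differences such as casing or punctuation from being counted as recognition errors.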

Added support for passing false to boolean parameters whose default value is true.
Example: --lowercase false
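
For readers building a similar command-line tool, the following Python sketch shows one way to accept an explicit `true` or `false` value for a flag that defaults to true. It uses Python's `argparse` purely as an illustration and does not reflect how this utility parses its arguments; `str_to_bool` and `build_parser` are hypothetical names, and only the `--lowercase` flag is taken from the example above.

```python
import argparse


def str_to_bool(value):
    """Convert a command-line string such as 'false' into a bool."""
    normalized = value.strip().lower()
    if normalized in ("true", "1", "yes"):
        return True
    if normalized in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected true or false, got {value!r}")


def build_parser():
    """Build a parser with one boolean option that defaults to True."""
    parser = argparse.ArgumentParser(description="Illustrative option parser")
    parser.add_argument("--lowercase", type=str_to_bool, default=True,
                        help="convert contents to lower case (default: true)")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["--lowercase", "false"])
    print(args.lowercase)
```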

Full Changelog: v1.0.0...v1.1.0