Technology

ParaASR - Speech Recognition

Speech recognition, also known as automatic speech recognition (ASR), computer speech recognition, or speech-to-text, is a capability that enables a program to convert human speech into written text.

Our speech recognition technology is pioneering in the industry. Unlike companies that rely on third-party service providers, we develop our own speech processing engine, capable of accurately handling Cantonese, Mandarin and English. Our particular strength is Cantonese, and we optimize our model periodically.

Research Paper

A study on data augmentation of reverberant speech for robust speech recognition

Authors: Tom Ko, Vijayaditya Peddinti, Daniel Povey, Michael L. Seltzer, Sanjeev Khudanpur, March 2017
The environmental robustness of DNN-based acoustic models can be significantly improved by using multi-condition training data. However, as data collection is a costly proposition, simulation of the desired conditions is a frequently adopted strategy. In this paper we detail a data augmentation approach for far-field ASR. We examine the impact of using simulated room impulse responses (RIRs), as real RIRs can be difficult to acquire, and also the effect of adding point-source noises. We find that the performance gap between using simulated and real RIRs can be eliminated when point-source noises are added. Further we show that the trained acoustic models not only perform well in the distant-talking scenario but also provide better results in the close-talking scenario. We evaluate our approach on several LVCSR tasks which can adequately represent both scenarios.

Full paper
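To make the idea in the abstract concrete, here is a minimal sketch of reverberant data augmentation: clean speech is convolved with a room impulse response, a point-source noise is convolved with its own impulse response, and the two are mixed at a chosen signal-to-noise ratio. It is an illustration only, not the authors' implementation and not our production pipeline. The function names (toy_rir, simulate_far_field), the exponentially decaying random RIR, and the 10 dB SNR are all assumptions made for this example; a real setup would use measured RIRs or a proper room simulator.

```python
import numpy as np


def toy_rir(length, decay, rng):
    """Hypothetical stand-in for a simulated RIR: decaying random taps.

    This is NOT a room simulator; it only mimics the overall shape of an
    impulse response (direct path followed by a decaying tail).
    """
    taps = rng.standard_normal(length) * np.exp(-decay * np.arange(length))
    taps[0] = 1.0  # direct-path component
    return taps


def simulate_far_field(speech, rir, noise, noise_rir, snr_db):
    """Reverberate speech, add a reverberated point-source noise at snr_db."""
    n = len(speech)
    # Reverberant speech: clean signal convolved with the speech RIR.
    reverberant = np.convolve(speech, rir)[:n]
    # Repeat or trim the noise to the utterance length, then place it in
    # the room by convolving it with its own RIR (a point source).
    point_noise = np.convolve(np.resize(noise, n), noise_rir)[:n]
    speech_power = np.mean(reverberant ** 2)
    noise_power = np.mean(point_noise ** 2)
    if noise_power == 0:
        raise ValueError("noise signal is silent; cannot scale to target SNR")
    # Scale the noise so that speech_power / scaled_noise_power matches snr_db.
    gain = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return reverberant + gain * point_noise


# Example with synthetic signals (assumed 16 kHz sampling, 1 second of audio).
rng = np.random.default_rng(0)
speech = rng.standard_normal(16000)   # placeholder for a clean utterance
noise = rng.standard_normal(4000)     # placeholder for a noise recording
speech_rir = toy_rir(800, 0.01, rng)
noise_rir = toy_rir(800, 0.01, rng)
augmented = simulate_far_field(speech, speech_rir, noise, noise_rir, snr_db=10.0)
```

Applying this with many different impulse responses, noise recordings and SNR values yields multi-condition training data from a single clean corpus, which is the strategy the abstract describes.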

Techniques for Noise Robustness in Automatic Speech Recognition

Authors: Tuomas Virtanen, Rita Singh, Bhiksha Raj, October 2012
Automatic speech recognition (ASR) systems are finding increasing use in everyday life. Many of the commonplace environments where the systems are used are noisy, for example users calling up a voice search system from a busy cafeteria or a street. This can result in degraded speech recordings and adversely affect the performance of speech recognition systems. As the use of ASR systems increases, knowledge of the state-of-the-art in techniques to deal with such problems becomes critical to system and application engineers and researchers who work with or on ASR technologies. This book presents a comprehensive survey of the state-of-the-art in techniques used to improve the robustness of speech recognition systems to these degrading external influences.

Key features:
- Reviews all the main noise robust ASR approaches, including signal separation, voice activity detection, robust feature extraction, model compensation and adaptation, missing data techniques and recognition of reverberant speech.
- Acts as a timely exposition of the topic in light of more widespread use in the future of ASR technology in challenging environments.
- Addresses robustness issues and signal degradation which are both key requirements for practitioners of ASR.
- Includes contributions from top ASR researchers from leading research units in the field.

Wiley Online