Accuracy of automatic speech recognition system trained on noised speech

Authors

  • A. Prodeus National Technical University of Ukraine “Igor Sikorsky Kyiv Polytechnic Institute”
  • K. Kukharicheva National Technical University of Ukraine “Igor Sikorsky Kyiv Polytechnic Institute”

DOI:

https://doi.org/10.18372/1990-5548.49.11230

Keywords:

Automatic speech recognition, speech recognition accuracy, training technique, clean speech, noised speech

Abstract

In this paper, two techniques for training an automatic speech recognition system on noised speech are compared with training on clean speech. The comparison is made in terms of speech recognition accuracy, using fourteen kinds of noise. These were noises of household appliances and computers, street and transport, teaching rooms and lobbies. The degree of superiority of the noised-speech training techniques over the competing technique is assessed. It is shown that training on noised speech allows 95% recognition accuracy to be reached at a minimal signal-to-noise ratio of 10 dB, whereas training on clean speech reaches the same recognition accuracy only at a minimal signal-to-noise ratio of 20 dB.
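
As an illustration of how noised training material at a given signal-to-noise ratio might be prepared, the following is a minimal sketch and not the authors' procedure. It assumes the SNR is defined as the ratio of mean signal power to mean noise power over the whole utterance; the function names `mix_at_snr` and `measured_snr_db` are hypothetical and introduced here only for the example.

```python
import numpy as np


def mix_at_snr(clean, noise, snr_db):
    """Add noise to a clean signal so the mixture has the requested SNR (dB).

    Assumption: SNR is mean clean power over mean noise power, computed
    across the whole utterance. The noise must not be all zeros.
    """
    clean = np.asarray(clean, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)

    # Repeat the noise recording if it is shorter than the speech, then trim.
    reps = int(np.ceil(len(clean) / len(noise)))
    noise = np.tile(noise, reps)[: len(clean)]

    p_clean = np.mean(clean ** 2)
    p_noise = np.mean(noise ** 2)

    # Choose gain so that p_clean / (gain**2 * p_noise) == 10**(snr_db / 10).
    gain = np.sqrt(p_clean / (p_noise * 10.0 ** (snr_db / 10.0)))
    return clean + gain * noise


def measured_snr_db(clean, noisy):
    """Recompute the SNR (dB) of a mixture from the clean reference."""
    clean = np.asarray(clean, dtype=np.float64)
    residual = np.asarray(noisy, dtype=np.float64) - clean
    return 10.0 * np.log10(np.mean(clean ** 2) / np.mean(residual ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-ins: a tone for "speech", white noise for "noise".
    speech = np.sin(2 * np.pi * 200 * np.arange(16000) / 16000)
    noise = rng.standard_normal(5000)
    for target in (10, 20):
        noisy = mix_at_snr(speech, noise, target)
        print(target, round(measured_snr_db(speech, noisy), 3))
```

The two target values in the usage loop mirror the 10 dB and 20 dB operating points mentioned in the abstract. Real recordings would replace the synthetic tone and white noise.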

Author Biographies

A. Prodeus, National Technical University of Ukraine “Igor Sikorsky Kyiv Polytechnic Institute”

DrSc. Professor. Acoustics and Electroacoustics Department

K. Kukharicheva, National Technical University of Ukraine “Igor Sikorsky Kyiv Polytechnic Institute”

Postgraduate student. Acoustics and Electroacoustics Department


Section

THEORY AND METHODS OF SIGNAL PROCESSING