Voice Control System for Robotics in a Noisy Environment

DOI:

https://doi.org/10.18372/1990-5548.81.19016

Keywords:

speech signals, voice control, adaptive wavelet filtering, mel-frequency cepstral coefficients, mixtures of Gaussian distributions, support vector method, communication channel, nonlinear distortion coefficient

Abstract

This paper analyzes the effectiveness of a voice control system for robotics, based on MFCC and GMM-SVM, when interference is present in the communication channel. The system characterizes the individual features of speech signals, classifies them, and makes a reliable decision on the interpretation and execution of voice commands by robotic equipment. The proposed system is implemented in four stages: 1) detection of active speech segments by calculating the short-term energy and the number of zero crossings between adjacent frames of the speech signal; 2) adaptive wavelet filtering of the speech signal, in which threshold values are generated to reduce the impact of additive noise; 3) extraction of recognition features, for which mel-frequency cepstral coefficients (MFCC) are used; 4) classification of the recognition features using Gaussian mixture models (GMM) and a support vector machine (SVM) with the linear Campbell kernel, together with principal component analysis with projection onto latent structures, which reduces type I and type II errors.
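
The first stage (detecting active speech from short-term energy and zero crossings) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the frame length and hop, and the decision rule (energy above a fraction of the peak frame energy, zero-crossing rate below a fixed limit) are assumptions made for illustration, and the input is a synthetic tone embedded in low-level noise rather than real speech.

```python
import math
import random


def frame_signal(signal, frame_len, hop):
    """Split a list of samples into overlapping frames."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def short_term_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossings(frame):
    """Number of sign changes between consecutive samples."""
    return sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))


def detect_speech(signal, frame_len=200, hop=100,
                  energy_ratio=0.1, max_zc_rate=0.3):
    """Return one boolean per frame: True where the frame looks active.

    Assumed rule (hypothetical thresholds): a frame is active when its
    energy is at least `energy_ratio` of the peak frame energy and its
    zero-crossing rate does not exceed `max_zc_rate`.
    """
    frames = frame_signal(signal, frame_len, hop)
    energies = [short_term_energy(f) for f in frames]
    peak = max(energies) if energies else 0.0
    flags = []
    for frame, energy in zip(frames, energies):
        zc_rate = zero_crossings(frame) / (len(frame) - 1)
        flags.append(peak > 0
                     and energy >= energy_ratio * peak
                     and zc_rate <= max_zc_rate)
    return flags


def make_test_signal(sample_rate=8000):
    """Synthetic input: weak noise everywhere, a 200 Hz tone in the middle."""
    rng = random.Random(0)  # fixed seed keeps the example repeatable
    samples = []
    for n in range(2400):
        noise = rng.uniform(-0.01, 0.01)
        tone = 0.0
        if 800 <= n < 1600:
            tone = 0.5 * math.sin(2 * math.pi * 200 * n / sample_rate)
        samples.append(noise + tone)
    return samples


flags = detect_speech(make_test_signal())
active = [i for i, is_active in enumerate(flags) if is_active]
print("active frame indices:", active)
```

Low-energy noise frames are rejected by the energy test, while the zero-crossing test guards against loud but noise-like frames; in a real system both thresholds would need to be tuned to the channel and microphone.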

Author Biography

Oleksandr Lavrynenko, National Aviation University, Kyiv

PhD in Engineering

Associate Professor

Department of Telecommunication and Radio Electronic Systems

Faculty of Air Navigation Electronics and Telecommunications

Published

2024-09-30

Section

TELECOMMUNICATIONS AND RADIO ENGINEERING