ASSESSMENT OF CLIPPED SPEECH QUALITY

A. M. Prodeus, I. V. Kotvytskyi, A. A. Ditiashov

Abstract


Clipping of speech gives rise to higher-order harmonics and, as a result, reduces the accuracy of automatic speech recognition systems used as a form of artificial intelligence in aircraft control systems and flight control systems for unmanned aerial vehicles. This paper presents subjective and objective estimates of clipped speech quality. Subjective speech quality on the Degradation Mean Opinion Score scale is shown to be about 4.5, 3.5 and 2.5 for clipping degrees of 5 dB, 10 dB and 15 dB, respectively. This relationship makes it possible to find the maximum permissible degree of clipping for given speech quality requirements. The dependence on clipping degree is obtained for objective speech quality measures such as segmental signal-to-noise ratio, frequency-weighted segmental signal-to-noise ratio, log-spectral distortion, Bark spectral distortion and perceptual evaluation of speech quality. It is also shown that kurtosis can be used as a measure of clipped speech quality. Correlation coefficients and matching maps that relate the objective and subjective speech quality measures are calculated. The results lead to the conclusion that objective speech quality measures can be applied to evaluate both the quality of clipped speech and the degree of clipping of speech signals.
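As a rough illustration of two of the measures named above, the following Python sketch hard-clips a signal and reports its kurtosis and segmental signal-to-noise ratio. It is a minimal sketch under stated assumptions, not the authors' procedure: the abstract does not define the degree of clipping, so the threshold is assumed here to lie the given number of dB below the peak amplitude. The function names, the 256-sample frame, the [-10, 35] dB per-frame limits and the synthetic Laplacian test signal standing in for speech are likewise illustrative choices.

```python
import math
import random


def hard_clip(signal, degree_db):
    """Hard-clip at a threshold degree_db below the peak (assumed definition)."""
    peak = max(abs(s) for s in signal)
    threshold = peak * 10 ** (-degree_db / 20)
    return [max(-threshold, min(threshold, s)) for s in signal]


def kurtosis(signal):
    """Non-excess kurtosis: fourth central moment over squared variance."""
    n = len(signal)
    mean = sum(signal) / n
    m2 = sum((s - mean) ** 2 for s in signal) / n
    m4 = sum((s - mean) ** 4 for s in signal) / n
    return m4 / (m2 ** 2)


def segmental_snr(clean, distorted, frame_len=256, lo=-10.0, hi=35.0):
    """Mean per-frame SNR in dB, with each frame value limited to [lo, hi]."""
    values = []
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        c = clean[start:start + frame_len]
        d = distorted[start:start + frame_len]
        sig = sum(x * x for x in c)
        err = sum((x - y) ** 2 for x, y in zip(c, d))
        if sig == 0.0:
            continue  # skip silent frames
        snr = hi if err == 0.0 else 10 * math.log10(sig / err)
        values.append(max(lo, min(hi, snr)))
    return sum(values) / len(values)


# Synthetic heavy-tailed (Laplacian) samples as a crude stand-in for speech.
rng = random.Random(0)
clean = [rng.choice((-1, 1)) * rng.expovariate(1.0) for _ in range(16000)]

for degree in (5, 10, 15):
    clipped = hard_clip(clean, degree)
    print(degree, round(kurtosis(clipped), 2),
          round(segmental_snr(clean, clipped), 2))
```

On a heavy-tailed signal of this kind, both values fall as the clipping degree grows, which is the behaviour that would let kurtosis serve as an indicator of clipping. Real speech and the paper's own definition of clipping degree may give different numbers.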

Keywords


Clipped speech signal quality; objective measure; subjective measure; matching map; correlation coefficient.



Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License.