Linear prediction coefficients correction method for digital speech processing systems with data compression based on the autoregressive model of a voice signal

Abstract

This paper considers the distortion of the autoregressive (AR) model of a voice signal caused by additive background noise in digital speech processing systems that compress data by linear prediction. In the frequency domain, the distortion appears as attenuation of the main formants, which determine the intelligibility of the speaker’s speech. To compensate for this attenuation, the authors propose modifying the parameters of the AR model (the linear prediction coefficients) using the impulse response of a recursive shaping filter. The formants are amplified in amplitude while their frequencies remain unchanged, so that the speaker’s voice stays recognizable. The effectiveness of the method was studied experimentally using specially developed software. The experimental results show a significant increase in the relative level of the formants in the power spectrum of the corrected voice signal.
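The abstract names three quantities: the linear prediction coefficients, the impulse response of the recursive shaping filter, and the power spectrum envelope. The following Python sketch shows how they relate and how additive noise flattens the resonance in the fitted AR model. It does not implement the authors' correction rule, which the abstract does not state. The function names (`lpc_autocorr`, `impulse_response`, `ar_envelope_db`, `synthesize_ar`, `demo`), the two-pole test signal, and the choice of the autocorrelation method are all assumptions made for illustration.

```python
import numpy as np

def lpc_autocorr(x, order):
    """Estimate LPCs a[0..order] (a[0] = 1) by the autocorrelation method
    with the Levinson-Durbin recursion; also return the residual variance."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    r = np.array([np.dot(x[:n - k], x[k:]) for k in range(order + 1)]) / n
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err                       # reflection coefficient
        prev = a.copy()
        a[1:i] = prev[1:i] + k * prev[i - 1:0:-1]
        a[i] = k
        err *= 1.0 - k * k
    return a, err

def impulse_response(a, length):
    """Impulse response of the recursive (all-pole) filter 1/A(z),
    where A(z) = 1 + a[1] z^-1 + ... + a[p] z^-p."""
    h = np.zeros(length)
    for n in range(length):
        acc = 1.0 if n == 0 else 0.0
        for k in range(1, min(n, len(a) - 1) + 1):
            acc -= a[k] * h[n - k]
        h[n] = acc
    return h

def ar_envelope_db(a, gain, nfft=512):
    """AR power spectrum envelope gain / |A(e^{jw})|^2 in dB on [0, pi]."""
    A = np.fft.rfft(a, nfft)
    return 10.0 * np.log10(gain / np.abs(A) ** 2)

def synthesize_ar(a, n, rng):
    """Drive the all-pole filter 1/A(z) with white Gaussian noise."""
    p = len(a) - 1
    x = np.zeros(n + p)                      # p leading zeros: initial state
    e = rng.standard_normal(n)
    for t in range(n):
        x[t + p] = e[t] - np.dot(a[1:], x[t:t + p][::-1])
    return x[p:]

def demo(snr_db=0.0, n=8000, seed=0):
    """Compare the AR fit of a one-resonance signal with and without noise."""
    rng = np.random.default_rng(seed)
    radius, theta = 0.97, np.pi / 4          # assumed toy "formant"
    a_true = np.array([1.0, -2 * radius * np.cos(theta), radius ** 2])
    clean = synthesize_ar(a_true, n, rng)
    noise_power = np.var(clean) / 10 ** (snr_db / 10)
    noisy = clean + np.sqrt(noise_power) * rng.standard_normal(n)
    a_clean, g_clean = lpc_autocorr(clean, 2)
    a_noisy, g_noisy = lpc_autocorr(noisy, 2)
    env_clean = ar_envelope_db(a_clean, g_clean)
    env_noisy = ar_envelope_db(a_noisy, g_noisy)
    # Peak-to-floor range of the envelope: a rough measure of formant level.
    return (a_clean, a_noisy,
            env_clean.max() - env_clean.min(),
            env_noisy.max() - env_noisy.min())

if __name__ == "__main__":
    a_clean, a_noisy, range_clean, range_noisy = demo()
    print("clean LPC:", a_clean, "envelope range, dB:", range_clean)
    print("noisy LPC:", a_noisy, "envelope range, dB:", range_noisy)
    print("shaping-filter impulse response:", impulse_response(a_noisy, 8))
```

In this toy setting the envelope fitted to the noisy signal should show a much smaller peak-to-floor range than the clean one, which is the formant attenuation the abstract refers to. A correction of the kind the paper proposes would act on the coefficients or on the impulse response to restore that range without moving the peak frequency.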

About the authors

V. Savchenko

Editorial office of the journal “Radio Engineering and Electronics”

Author for correspondence.
Email: vvsavchenko@yandex.ru
Russian Federation, Mokhovaya St., 11, bldg. 7, Moscow, 125009

L. Savchenko

National Research University Higher School of Economics

Email: vvsavchenko@yandex.ru
Russian Federation, B. Pecherskaya St., 25, Nizhny Novgorod, 603155


Supplementary files

1. JATS XML
2. Fig. 1. Estimates of the power spectral density (PSD) envelope (3) of the vowel phoneme “a” signal at SNR q² of 0 (1), 10 (2), and 20 dB (3).
3. Fig. 2. Linear prediction coefficient (LPC) estimates for the phoneme “a” signal at SNR q² of 0 (1), 10 (2), and 20 dB (3), compared with the LPC vector in the absence of noise (dotted line).
4. Fig. 3. Impulse response (5) of the shaping filter (4) at SNR q² of 0 (1), 10 (2), and 20 dB (3).
5. Fig. 4. Corrected impulse response (6) at c = 0.01 (1), 0.03 (2), and 0.05 (3) for SNR q² = 0 dB, compared with the impulse response (5) without correction (dotted line).
6. Fig. 5. PSD envelope (3) of the synthesized voice signal at c = 0.01 (1), 0.03 (2), and 0.05 (3) for SNR q² = 0 dB and without correction (dotted line).
7. Fig. 6. Fragments of the synthesized signal of the vowel phoneme “a” at c = 0.01 (1), 0.03 (2), and 0.05 (3) for SNR q² = 0 dB and without correction (dotted line).
8. Fig. 7. Schuster periodogram (10) of the vowel phoneme “a” signal synthesized from the AR model (2) at c = -0.06 (solid curve) and c = 0 (dotted line).
9. Fig. 8. Schuster periodogram (10) of the signal of the fricative speech sound “w” synthesized from the AR model (2) at c = 0.06 (solid curve) and c = 0 (dotted line).

Copyright © Russian Academy of Sciences, 2024