Lombard speech detection in case of spatial separation between noise source and talkers of different genders
- Authors: Andreeva I.G. (1), Lunichkin A.M. (1), Ogorodnikova E.A. (1, 2)
- Affiliations:
- (1) Sechenov Institute of Evolutionary Physiology and Biochemistry of Russian Academy of Sciences
- (2) Pavlov Institute of Physiology, Russian Academy of Sciences
- Issue: Volume 110, No. 2 (2024)
- Pages: 185-195
- Section: EXPERIMENTAL ARTICLES
- URL: https://gynecology.orscience.ru/0869-8139/article/view/651671
- DOI: https://doi.org/10.31857/S0869813924020031
- EDN: https://elibrary.ru/DJSPNE
- ID: 651671
Abstract
The spatial selectivity of hearing for speech signals was studied under conditions in which the sources of the target signal and of the interfering noise were at different distances from the listener. The study tested the hypothesis that hearing selectivity improves through stronger activation of the high-frequency binaural mechanism, because a talker's voice spectrum shifts toward higher frequencies in a noisy environment. Detection thresholds for the target signal, a two-syllable word spoken by a male or a female talker, were measured in a two-interval, two-alternative forced-choice paradigm in four series. The series differed in the type of target signal (ordinary or Lombard speech) and in the locations of the target source and the noise source (multi-talker noise). The two sources were located in front of the subject at head level, one at a distance of 1 m and the other at 4 m. The detection threshold was defined as the ratio of signal and noise levels (S/N) at the listener's position. The threshold for detecting the target signal (male and female voices pooled) was -11 dB S/N for both ordinary and Lombard speech. It did not depend on which source, target or noise, was closer to the listener. For ordinary speech, the mean detection thresholds for male and female voices differed, but the difference was not significant. For Lombard speech, the thresholds differed significantly: at a detection level of 0.67, the threshold was -10 dB S/N for the male voice and -12.5 dB S/N for the female voice.
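The threshold definition in the abstract (S/N at the listener's position, read off at a detection level of 0.67) can be illustrated with a minimal sketch. Everything below is an assumption for illustration only: the function names, the linear interpolation between measured points, and the data values are hypothetical and are not taken from the article, which does not describe its fitting procedure in the abstract. In a two-alternative forced-choice task the chance level is 0.5, so a 0.67 criterion lies between chance and perfect detection.

```python
def snr_db(signal_level_db, noise_level_db):
    """S/N in dB: difference between signal and noise levels (both in dB)
    measured at the listener's position."""
    return signal_level_db - noise_level_db


def threshold_at(snrs_db, p_correct, criterion=0.67):
    """Estimate the S/N (dB) at which proportion correct reaches `criterion`.

    Assumption: simple linear interpolation between adjacent measured points;
    the article may have used a different procedure.
    """
    points = sorted(zip(snrs_db, p_correct))
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 <= criterion <= y1 and y1 > y0:
            return x0 + (criterion - y0) * (x1 - x0) / (y1 - y0)
    raise ValueError("criterion is not bracketed by the data")


# Hypothetical psychometric data (NOT from the article).
example_snrs = [-16, -14, -12, -10, -8]          # S/N in dB
example_p = [0.50, 0.56, 0.64, 0.76, 0.92]       # proportion correct

estimated_threshold = threshold_at(example_snrs, example_p)
print(f"Estimated threshold: {estimated_threshold:.1f} dB S/N")
```

With real data, a fitted psychometric function (for example a logistic curve) would usually be preferred over point-to-point interpolation, since it is less sensitive to noise in individual measurements.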
About the authors
I. Andreeva
Sechenov Institute of Evolutionary Physiology and Biochemistry of Russian Academy of Sciences
Corresponding author
Email: ig-andreeva@mail.ru
Russia, Saint Petersburg
A. Lunichkin
Sechenov Institute of Evolutionary Physiology and Biochemistry of Russian Academy of Sciences
Email: ig-andreeva@mail.ru
Russia, Saint Petersburg
E. Ogorodnikova
Sechenov Institute of Evolutionary Physiology and Biochemistry of Russian Academy of Sciences; Pavlov Institute of Physiology, Russian Academy of Sciences
Email: ig-andreeva@mail.ru
Russia, Saint Petersburg; Saint Petersburg