TY - JOUR
T1 - Effects of Artificial Synthetic Speech Control of SNR and Speech Rate on the Intelligibility of Train Station Announcements
AU - Maruoka, Mizuki
AU - Tsujimura, Sohei
AU - Asakura, Takumi
N1 - Publisher Copyright: © The Author(s) 2023.
PY - 2024/3
Y1 - 2024/3
AB - An experimental study on the effect of the speech characteristics of the signal-to-noise ratio (SNR) and speech rate on the intelligibility of announcements at railway stations was conducted using an artificial synthetic voice. Synthesized speech has recently been used in noisy environments both indoors and outdoors, but unlike its use in quiet environments, when the environment is noisy, the intelligibility of announcements may be reduced. For railway station announcements, while natural spoken voices are currently used for multilingual announcements and disaster response broadcasts, deep neural network synthesized voices, which use deep learning, have also been adopted. However, the effect of the acoustic characteristics such as the SNR and speech rate on the intelligibility of reproduced announcements in noisy public spaces such as railway stations has not yet been clarified from a practical viewpoint. In this paper, in order to determine the appropriate SNR and speech rate for synthetic voice announcements in railway stations, auditory impressions of announcements with varying SNR and speech rate were evaluated by participants using a five-point scale. Based on the evaluations, the appropriate conditions for the broadcast of synthetic voice announcements at the ticket gate and on the platform of a station are discussed.
KW - Intelligibility of announcement
KW - Signal-to-noise ratio (SNR)
KW - Sound environment of railway station
KW - Speech rate
KW - Subjective evaluation
KW - Synthetic voice
UR - http://www.scopus.com/inward/record.url?scp=85173039445&partnerID=8YFLogxK
U2 - 10.1007/s40857-023-00306-8
DO - 10.1007/s40857-023-00306-8
M3 - Article
AN - SCOPUS:85173039445
SN - 0814-6039
VL - 52
SP - 77
EP - 86
JO - Acoustics Australia
JF - Acoustics Australia
IS - 1
ER -