ENDPOINT DETECTION IN SPEECH SIGNAL USING ENTROPY AND ITS STATISTICAL PROPERTIES

In speech recognition preprocessing, an important step is to extract the speech segment from the audio signal by detecting its endpoints. Many methods are used for this purpose, such as the zero-crossing rate and short-time energy. However, noise reduces the effectiveness of these methods: random noise not only lowers the signal-to-noise ratio but can also change the zero-crossing rate. Unless speech recognition takes place in a laboratory or studio environment, random noise sources are always present in the surrounding space. In this article we therefore propose an improved endpoint detection method, based on the entropy of the speech signal and its statistical properties, for reliably isolating the speech segment in the presence of noise. A comparative analysis of the above methods for detecting the endpoints of a word in a speech signal is carried out. The experiment was performed on two Mongolian words, «As» and «Budeg», previously recorded on a computer as WAV files.
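
The three frame-level features named above can be illustrated with a minimal sketch. It is not the authors' implementation: the amplitude-histogram entropy, the threshold rule (mean plus k standard deviations of the entropy over leading noise-only frames, with a small absolute floor), and all function names and parameter values are assumptions chosen for illustration, as is the synthetic test signal that stands in for a recorded word.

```python
import math
import random


def split_frames(signal, size=256, hop=128):
    """Split a list of samples into overlapping frames."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, hop)]


def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def amplitude_entropy(frame, bins=32):
    """Shannon entropy (bits) of the amplitude histogram over [-1, 1]."""
    counts = [0] * bins
    for x in frame:
        clipped = min(max(x, -1.0), 1.0)
        index = min(int((clipped + 1.0) / 2.0 * bins), bins - 1)
        counts[index] += 1
    n = len(frame)
    return -sum(c / n * math.log2(c / n) for c in counts if c)


def detect_endpoints(signal, size=256, hop=128, noise_frames=10, k=3.0, floor=0.1):
    """Return (start, end) sample indices of the detected segment, or None.

    Assumption: the first `noise_frames` frames contain background noise only.
    The threshold is their mean entropy plus max(k * std, floor).
    """
    entropy = [amplitude_entropy(f) for f in split_frames(signal, size, hop)]
    ref = entropy[:noise_frames]
    mean = sum(ref) / len(ref)
    std = math.sqrt(sum((e - mean) ** 2 for e in ref) / len(ref))
    threshold = mean + max(k * std, floor)
    active = [i for i, e in enumerate(entropy) if e > threshold]
    if not active:
        return None
    return active[0] * hop, active[-1] * hop + size


def synthetic_signal(seed=0, rate=8000):
    """Hypothetical test signal: weak uniform noise with a 200 Hz tone
    occupying samples 4000..7999."""
    rng = random.Random(seed)
    signal = []
    for n in range(12000):
        x = rng.uniform(-0.01, 0.01)
        if 4000 <= n < 8000:
            x += 0.5 * math.sin(2 * math.pi * 200 * n / rate)
        signal.append(x)
    return signal
```

Calling `detect_endpoints(synthetic_signal())` should return sample indices close to 4000 and 8000, with a resolution limited by the frame and hop sizes. A real recording would first have to be read from the WAV file and scaled to the range [-1, 1].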

Authors: B. Zandan, O. Bukhtsooj, T. Galbaatar, A. G. Chensky

Direction: Informatics, Computer Technologies And Control

Keywords: Speech recognition, endpoint detection, zero crossing rate, short-time energy, entropy, central limit theorem

