Generic Audio Data Segmentation and Indexing
In the coarse-level segmentation and indexing stage, audio data are segmented and classified into basic audio types. The classification rests on morphological and statistical analysis of the temporal curves of the short-time energy function, the short-time average zero-crossing rate, and the short-time fundamental frequency, together with the spectral peak tracks of the audio signal. Threshold-based heuristic rules, derived empirically, guide the classification procedures. The approach is therefore generic and model-free, and can be applied to a wide range of audio material. An illustration of the scheme is shown in Figure 4.1.
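As a rough illustration of two of the features named above, the following minimal sketch computes short-time energy and short-time average zero-crossing rate per frame and applies a simple threshold rule. It is not the chapter's actual procedure: the function names, the frame length and hop, the threshold values (`energy_floor`, `zcr_ceiling`), and the neutral labels are all assumptions chosen for the example, and the fundamental-frequency and spectral-peak-track features are omitted.

```python
import math


def frames(signal, frame_len, hop):
    """Yield successive fixed-length frames from a sequence of samples."""
    for start in range(0, len(signal) - frame_len + 1, hop):
        yield signal[start:start + frame_len]


def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def coarse_label(energy, zcr, energy_floor=1e-4, zcr_ceiling=0.1):
    """Threshold rule with hypothetical, hand-picked thresholds."""
    if energy < energy_floor:
        return "silence"
    return "low-zcr" if zcr < zcr_ceiling else "high-zcr"


def label_frames(signal, frame_len=256, hop=128):
    """Label every frame of the signal using the two features above."""
    return [
        coarse_label(short_time_energy(f), zero_crossing_rate(f))
        for f in frames(signal, frame_len, hop)
    ]


if __name__ == "__main__":
    # Synthetic 100 Hz tone sampled at 8 kHz (assumed values for the demo).
    tone = [math.sin(2 * math.pi * 100 * n / 8000) for n in range(512)]
    print(label_frames(tone))
```

In a fuller system, thresholds of this kind would be tuned empirically on labelled recordings, and consecutive frames with the same label would be merged into segments.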
Keywords: Fundamental Frequency, Environmental Sound, Speech Segment, Music Background, Music Piece