Robust Radio Broadcast Monitoring Using a Multi-Band Spectral Entropy Signature
Monitoring broadcast media content has received considerable attention lately from both academia and industry, owing to the technical challenge involved and its economic importance (e.g., in advertising). The problem poses a unique challenge from the pattern recognition point of view because a very high recognition rate is needed under non-ideal conditions. It consists of comparing a short audio sequence (the commercial ad) with a long audio stream (the broadcast), searching for matches.
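The matching task described above can be sketched as a sliding-window search. This is a minimal illustration only, not the authors' implementation: it assumes that both the ad and the broadcast have already been reduced to binary signatures (lists of 0/1 bits), and the function names and the `max_bit_error_rate` threshold are hypothetical.

```python
def hamming(a, b):
    """Count the positions at which two equal-length bit sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def find_matches(stream_sig, ad_sig, max_bit_error_rate=0.2):
    """Slide ad_sig along stream_sig and report every close alignment.

    Returns a list of (offset, bit_error_rate) pairs for each offset whose
    fraction of differing bits is at most max_bit_error_rate.
    The threshold is an assumption chosen for illustration.
    """
    n = len(ad_sig)
    matches = []
    # One candidate alignment per possible start offset in the stream.
    for start in range(len(stream_sig) - n + 1):
        window = stream_sig[start:start + n]
        ber = hamming(window, ad_sig) / n
        if ber <= max_bit_error_rate:
            matches.append((start, ber))
    return matches


# Toy signatures: the ad occurs exactly at offset 4 of the stream.
stream = [0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0]
ad = [1, 0, 0, 1]
exact = find_matches(stream, ad, max_bit_error_rate=0.0)
tolerant = find_matches(stream, ad, max_bit_error_rate=0.25)
```

Tolerating a nonzero bit error rate is what lets a degraded broadcast copy still match the clean ad, at the cost of possible false positives if the threshold is set too loosely.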
In this paper we present a solution based on the Multi-Band Spectral Entropy Signature (MBSES), which is very robust to the degradations commonly found in amplitude-modulated (AM) radio. Using the MBSES we obtained perfect recall (all occurrences of the audio ads were accurately found, with no false positives) on 95 hours of audio from five different AM radio broadcasts. Our system scans one hour of audio in 40 seconds if the audio is already fingerprinted (e.g., by a separate slave computer), and takes a total of five minutes per hour of audio when fingerprint extraction is included, using a single-core off-the-shelf desktop computer with no parallelization.
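The abstract does not say how the MBSES is computed. As a rough illustration of the general idea suggested by its name, and not of the authors' algorithm, the sketch below computes a Shannon entropy of the normalized power spectrum separately in several frequency bands of one audio frame. The equal-width bands, the naive DFT, and all function names are assumptions made for this example.

```python
import cmath
import math


def power_spectrum(frame):
    """Power of the first len(frame)//2 DFT bins (naive O(n^2) transform)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def band_entropies(frame, n_bands):
    """Shannon entropy (in bits) of the power distribution inside each band.

    Bands are equal-width groups of DFT bins; this is an assumption for
    illustration, since the abstract does not specify the band layout.
    """
    power = power_spectrum(frame)
    width = len(power) // n_bands
    if width == 0:
        raise ValueError("frame too short for the requested number of bands")
    entropies = []
    for b in range(n_bands):
        band = power[b * width:(b + 1) * width]
        total = sum(band)
        if total == 0:
            entropies.append(0.0)  # silent band: define entropy as zero
            continue
        probs = [p / total for p in band]
        entropies.append(-sum(p * math.log2(p) for p in probs if p > 0))
    return entropies


# An impulse has a flat spectrum, so every band reaches maximum entropy.
impulse = [1.0] + [0.0] * 63
flat = band_entropies(impulse, 4)

# A pure tone concentrates its power in one bin, so its band has entropy near zero.
tone = [math.sin(2 * math.pi * 3 * t / 64) for t in range(64)]
peaky = band_entropies(tone, 4)
```

The two toy frames show the contrast such a feature captures: noise-like content spreads power evenly within a band and yields high entropy, while tonal content yields low entropy.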
Keywords: Graphic Processing Unit, Recognition Rate, Lossy Compression, High Recognition Rate, Audio Stream