Music Outlier Detection Using Multiple Sequence Alignment and Independent Ensembles

  • Dimitrios Bountouridis
  • Hendrik Vincent Koops
  • Frans Wiering
  • Remco C. Veltkamp
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9939)


The automated retrieval of related music documents, such as cover songs or folk melodies belonging to the same tune, has been an important task in the field of Music Information Retrieval (MIR). Yet outlier detection, the process of identifying documents that deviate significantly from the norm, remains largely unexplored. Pairwise comparison of music sequences (e.g. chord transcriptions, melodies), from which outlier detection can potentially emerge, has always been at the center of MIR research, but the connection between the two has not been investigated. In this paper we first argue that, for the analysis of musical collections of sequential data, outlier detection can benefit immensely from Multiple Sequence Alignment (MSA). We show that certain MSA-based similarity methods separate inliers from outliers better than the typical similarity based on pairwise comparisons. Second, aiming at an unsupervised outlier detection method that is data-driven and robust enough to generalize across different music datasets, we show that ensemble approaches using an entropy-based diversity measure can outperform supervised alternatives.
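As a rough illustration of why a multiple alignment can expose outliers, the sketch below scores each sequence by how often it disagrees with the column-wise consensus of an alignment. It is a toy example under stated assumptions, not the similarity methods evaluated in the paper: the chord sequences are invented, they are assumed to be already aligned (with "-" marking a gap), and the function names `consensus_profile` and `outlier_scores` are hypothetical.

```python
from collections import Counter


def consensus_profile(alignment):
    """Return the most common symbol in each column of an alignment."""
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    # zip(*alignment) iterates over columns; Counter picks the majority symbol.
    return [Counter(col).most_common(1)[0][0] for col in zip(*alignment)]


def outlier_scores(alignment):
    """Score each row by its fraction of disagreements with the consensus."""
    consensus = consensus_profile(alignment)
    return [
        sum(a != c for a, c in zip(row, consensus)) / len(consensus)
        for row in alignment
    ]


# Invented chord sequences, assumed to be already aligned ("-" is a gap).
alignment = [
    ["C", "G", "Am", "F", "C"],
    ["C", "G", "Am", "F", "-"],
    ["C", "G", "A", "F", "C"],
    ["D", "A", "Bm", "G", "E"],
]
scores = outlier_scores(alignment)
most_outlying = max(range(len(scores)), key=scores.__getitem__)
```

Here the last sequence disagrees with the consensus in every column and receives the highest score, while the near-identical variants stay close to zero. A pairwise approach would instead need a separate comparison for every pair of sequences before any such group-level view becomes available.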


Multiple Sequence Alignment · Outlier Detection · Pairwise Alignment · Adjusted Rand Index · Profile Hidden Markov Model
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  • Dimitrios Bountouridis (1, corresponding author)
  • Hendrik Vincent Koops (1)
  • Frans Wiering (1)
  • Remco C. Veltkamp (1)

  1. Department of Information and Computing Sciences, Utrecht University, Utrecht, Netherlands
