Multimedia Tools and Applications, Volume 31, Issue 3, pp 249–267

Active learning in very large databases

  • Navneet Panda
  • King-Shy Goh
  • Edward Y. Chang


Query-by-example and query-by-keyword both suffer from the problem of “aliasing,” meaning that example-images and keywords potentially have variable interpretations or multiple semantics. For discerning which semantic is appropriate for a given query, we have established that combining active learning with kernel methods is a very effective approach. In this work, we first examine active-learning strategies, and then focus on addressing the challenges of two scalability issues: scalability in concept complexity and in dataset size. We present remedies, explain limitations, and discuss future directions that research might take.
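As a rough illustration of what combining active learning with a kernel method can look like, the sketch below runs a pool-based loop in which the learner repeatedly asks for labels on the items it is least sure about. It is a minimal sketch under stated assumptions, not the authors' system: a kernel perceptron with an RBF kernel stands in for the support vector machine, "least sure" is taken to mean the smallest absolute decision value, and every name (`rbf`, `train_kernel_perceptron`, `score`, `active_learning`, `oracle`) is hypothetical. The `oracle` function simulates a user giving relevance feedback.

```python
import math
import random


def rbf(a, b, gamma=1.0):
    """Gaussian (RBF) kernel between two equal-length feature vectors."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))


def score(x, points, labels, alpha):
    """Signed decision value of the kernel classifier at x."""
    return sum(a * y * rbf(p, x) for a, y, p in zip(alpha, labels, points))


def train_kernel_perceptron(points, labels, epochs=20):
    """Fit dual coefficients; labels are +1 (relevant) or -1 (irrelevant).

    Assumption: a kernel perceptron is used here only as a simple
    stand-in for an SVM so the example needs no third-party library.
    """
    alpha = [0] * len(points)
    for _ in range(epochs):
        mistakes = 0
        for i, (x, y) in enumerate(zip(points, labels)):
            if y * score(x, points, labels, alpha) <= 0:
                alpha[i] += 1
                mistakes += 1
        if mistakes == 0:
            break
    return alpha


def active_learning(pool, oracle, seed_idx, rounds=5, batch=2):
    """Pool-based active learning with uncertainty sampling.

    Each round trains on the labeled items, then requests labels for the
    `batch` unlabeled items whose decision value is closest to zero.
    Returns the labeled indices and the final classifier.
    """
    labeled = list(seed_idx)
    for _ in range(rounds):
        pts = [pool[i] for i in labeled]
        lbl = [oracle(pool[i]) for i in labeled]
        alpha = train_kernel_perceptron(pts, lbl)
        unlabeled = [i for i in range(len(pool)) if i not in labeled]
        if not unlabeled:
            break
        # Most ambiguous items first: smallest |decision value|.
        unlabeled.sort(key=lambda i: abs(score(pool[i], pts, lbl, alpha)))
        labeled.extend(unlabeled[:batch])
    # Retrain once more so the returned classifier uses every label.
    pts = [pool[i] for i in labeled]
    lbl = [oracle(pool[i]) for i in labeled]
    alpha = train_kernel_perceptron(pts, lbl)
    return labeled, lambda x: score(x, pts, lbl, alpha)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical data: 2-D feature vectors; "relevant" means x > 0.5.
    pool = [(rng.random(), rng.random()) for _ in range(40)]
    oracle = lambda p: 1 if p[0] > 0.5 else -1
    pos = next(i for i, p in enumerate(pool) if oracle(p) == 1)
    neg = next(i for i, p in enumerate(pool) if oracle(p) == -1)
    labeled, classify = active_learning(pool, oracle, [pos, neg])
    print(len(labeled), "items labeled")
```

The loop starts from one relevant and one irrelevant example, mirroring a query session seeded with a little feedback. Rescoring the whole pool in every round costs time linear in the pool size, which is one way to see why dataset size becomes a scalability concern.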


Keywords: Active learning, Image retrieval, Relevance feedback, Support vector machines





Copyright information

© Springer Science+Business Media, LLC 2006

Authors and Affiliations

  1. University of California, Santa Barbara, USA
