Part of the book series: Advances in Computer Vision and Pattern Recognition (ACVPR)

Abstract

Semantic texton forests (STFs) are a form of random decision forest that can be employed to produce powerful low-level codewords for computer vision. Each decision tree acts directly on image pixels, resulting in a codebook that bypasses the expensive computation of filter-bank responses or local descriptors. Further, STFs are extremely fast to both train and test, especially when compared with k-means clustering and nearest-neighbor assignment of feature descriptors. The nodes in the STFs provide both an implicit hierarchical clustering into semantic textons and an explicit pixel-wise local classification estimate. In this chapter we (i) investigate STFs as learned visual dictionaries; (ii) show how STFs can be used for both image categorization and semantic segmentation by aggregating hierarchical bags of semantic textons; (iii) demonstrate that STFs allow us to exploit semantic context in segmentation; and (iv) show how a global image-level categorization can be used as a prior to improve the accuracy of semantic segmentation. We also see that the efficient tree structures of STFs allow at least a five-fold increase in execution speed over competing techniques.
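To make the idea of a tree acting directly on pixels more concrete, the following is a minimal sketch, not the chapter's implementation. It assumes a hand-built toy tree whose split tests compare the difference of two raw pixel values against a threshold; the names `TREE`, `pixel`, `descend`, and `bag_of_semantic_textons`, the offsets, and the thresholds are all hypothetical. Every node visited on the way from root to leaf is counted, which gives a hierarchical histogram rather than a leaf-only one.

```python
from collections import Counter

# Hypothetical toy tree (an assumption for illustration, not a learned forest).
# node id -> (offset_a, offset_b, threshold, left_child, right_child)
# Node ids that do not appear as keys are leaves.
TREE = {
    0: ((0, 0), (0, 1), 0, 1, 2),
    1: ((0, 0), (1, 0), 10, 3, 4),
}


def pixel(image, y, x):
    """Read a pixel value, clamping coordinates to the image border."""
    h, w = len(image), len(image[0])
    return image[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]


def descend(image, y, x, tree=TREE):
    """Return the node ids visited from root to leaf for pixel (y, x)."""
    node, path = 0, [0]
    while node in tree:
        (ay, ax), (by, bx), thresh, left, right = tree[node]
        # Split test on raw pixel values: no filter bank, no descriptor.
        diff = pixel(image, y + ay, x + ax) - pixel(image, y + by, x + bx)
        node = left if diff < thresh else right
        path.append(node)
    return path


def bag_of_semantic_textons(image, tree=TREE):
    """Count node visits over all pixels, at every depth of the tree."""
    counts = Counter()
    for y in range(len(image)):
        for x in range(len(image[0])):
            counts.update(descend(image, y, x, tree))
    return counts


# A tiny grayscale image with a vertical edge before the last column.
image = [
    [0, 0, 50],
    [0, 0, 50],
]
hist = bag_of_semantic_textons(image)
```

In this toy setting the root count equals the number of pixels, and deeper nodes split that mass into finer clusters. A real forest would learn the split tests from labelled data and combine the histograms of several trees.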

This work was undertaken while the first two authors were at the University of Cambridge and Toshiba Corporate Research and Development Center, respectively.

Notes

  1. At training time, we compute and store the distributions $p_j(c)$ for all nodes $j$ in the tree, not just for leaf nodes.

  2. This effect may be due to segmentation forest (b) being over-confident: looking at the five most likely classes inferred for each pixel, (b) achieves 87.6% while (d) achieves a better 88.0%.
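The first note, on storing a class distribution at every node, can be illustrated with a short sketch. It assumes that each labelled training pixel has already been passed down a tree and that its root-to-leaf path is available as a list of node ids. The function name `node_class_distributions` and the toy data are hypothetical, and the distributions here are plain normalized counts, which may differ from the weighting used in the chapter.

```python
from collections import defaultdict


def node_class_distributions(paths, labels, num_classes):
    """Estimate a class distribution for every node from labelled paths."""
    counts = defaultdict(lambda: [0] * num_classes)
    for path, label in zip(paths, labels):
        for node in path:  # every visited node, not just the leaf
            counts[node][label] += 1
    # Normalize each node's class histogram so it sums to one.
    return {
        node: [n / sum(hist) for n in hist]
        for node, hist in counts.items()
    }


# Hypothetical training data: three pixels, two classes, one small tree.
paths = [[0, 1], [0, 1], [0, 2]]
labels = [0, 1, 1]
p = node_class_distributions(paths, labels, num_classes=2)
```

Here `p[0]` is the class distribution at the root, estimated from all three samples, while `p[1]` and `p[2]` are the more specific distributions at its children.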

Acknowledgements

We would like to thank J. Winn, B. Wenger, O. Yamaguchi, and V. Viitaniemi for helpful conversations and insights contributing to the work in this paper.

Copyright information

© 2013 Springer-Verlag London

About this chapter

Cite this chapter

Johnson, M., Shotton, J., Cipolla, R. (2013). Semantic Texton Forests for Image Categorization and Segmentation. In: Criminisi, A., Shotton, J. (eds) Decision Forests for Computer Vision and Medical Image Analysis. Advances in Computer Vision and Pattern Recognition. Springer, London. https://doi.org/10.1007/978-1-4471-4929-3_15

  • DOI: https://doi.org/10.1007/978-1-4471-4929-3_15

  • Publisher Name: Springer, London

  • Print ISBN: 978-1-4471-4928-6

  • Online ISBN: 978-1-4471-4929-3

  • eBook Packages: Computer Science (R0)