Visual Distinctiveness Detection of Pedestrian based on Statistically Weighting PLSA for Intelligent Systems

  • Hyun Chul Song
  • Gyun Hyuk Lee
  • Duk-Sun Shim
  • Kwang Nam Choi
Regular Paper · Robot and Applications

Abstract

Intelligent systems for autonomous vehicles, including drones, robot vision, and video surveillance, need to distinguish pedestrians from other objects. Pedestrian detection is an essential and significant research topic due to its diverse applications. In this paper, a new visual distinctiveness detection method for pedestrians is proposed based on statistically weighting probabilistic latent semantic analysis. We detect distinctiveness by integrating three steps: first, representing the co-occurrence matrix of images, which are vectorized using the bag of visual words (BoVW) framework; then, calculating weights from the histograms of visual words of each class; and finally, applying the weights to the test images as the distinctiveness of the visual words. Probabilistic latent semantic analysis (PLSA) was used as the classification method in our system. We extracted the weighted visual words by sampling patches from the current image. The proposed method was compared to standard PLSA on the Caltech-256 dataset. The classes used include pedestrians, cars, motorbikes, airplanes, and horses. The experimental results show that the proposed method outperforms current methods in predicting pedestrians and transportation objects.
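The weighting step described above can be pictured with a small sketch. The following is a minimal, illustrative Python example of deriving per-class visual-word weights from BoVW training histograms and applying them to a test image's histogram. The weighting formula, the function and variable names, and the toy data are assumptions made for illustration only; they do not reproduce the paper's statistical weighting scheme or its PLSA classifier.

```python
# Hypothetical sketch: class-conditional weighting of bag-of-visual-words (BoVW)
# histograms. The exact statistical weighting and the PLSA classification used
# in the paper are not reproduced here; this only illustrates the general idea
# of computing per-class visual-word weights from training histograms and
# applying them to a test image's histogram.
import numpy as np

def class_word_weights(train_histograms, labels, n_classes, eps=1e-8):
    """Per-class weights for each visual word.

    train_histograms: (n_images, n_words) BoVW count matrix.
    labels:           (n_images,) class index per training image.
    Returns a (n_classes, n_words) weight matrix where each entry is the
    word's relative frequency within the class divided by its relative
    frequency over all classes (an assumed, TF-IDF-like weighting).
    """
    n_words = train_histograms.shape[1]
    class_freq = np.zeros((n_classes, n_words))
    for c in range(n_classes):
        counts = train_histograms[labels == c].sum(axis=0)
        class_freq[c] = counts / (counts.sum() + eps)
    global_freq = train_histograms.sum(axis=0).astype(float)
    global_freq /= (global_freq.sum() + eps)
    return class_freq / (global_freq + eps)

def weight_test_histogram(test_histogram, weights, class_idx):
    """Re-weight a test image's BoVW histogram with one class's word weights."""
    return test_histogram * weights[class_idx]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.integers(0, 5, size=(40, 100))   # toy BoVW counts, 100 visual words
    y = rng.integers(0, 5, size=40)          # 5 classes: pedestrian, car, ...
    W = class_word_weights(X, y, n_classes=5)
    z = weight_test_histogram(X[0], W, class_idx=0)
    print(W.shape, z.shape)                  # (5, 100) (100,)
```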

Keywords

Bag of visual words · intelligent systems · pedestrian detection · probabilistic latent semantic analysis · weighting scheme

Copyright information

© Institute of Control, Robotics and Systems and The Korean Institute of Electrical Engineers and Springer-Verlag GmbH Germany, part of Springer Nature 2018

Authors and Affiliations

  • Hyun Chul Song (1)
  • Gyun Hyuk Lee (1)
  • Duk-Sun Shim (2)
  • Kwang Nam Choi (1)
  1. School of Computer Science and Engineering, Chung-Ang University, Seoul, Korea
  2. School of Electrical and Electronics Engineering, Chung-Ang University, Seoul, Korea
