A Bag-of-Features Algorithm for Applications Using a NoSQL Database

  • Conference paper
Information and Software Technologies (ICIST 2016)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 639)

Abstract

In this paper we present a Bag-of-Words (also known as Bag-of-Features) method designed for implementation in NoSQL databases. In developing the algorithm, particular attention was paid to keeping the implementation simple and the number of computations to a minimum, so that as much of the work as possible is done by the database engine itself. The algorithm is presented using the example of image storage and retrieval. This case requires an additional preprocessing step, in which characteristic image features are extracted, and a clustering algorithm to create a dictionary. We present our own variant of the k-means algorithm, which selects the number of clusters automatically. The algorithm does not rely on any computationally complex classification algorithm; it uses majority voting instead. This significantly simplifies the computations and makes it possible to use JavaScript, the language available in a common NoSQL database.
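
The two steps named in the abstract, mapping image features onto a dictionary and classifying by majority vote, can be illustrated with a minimal sketch. It is written in Python for brevity rather than in the JavaScript the paper targets, and it is not the author's implementation. It assumes that the dictionary (cluster centres) has already been built, that each dictionary word carries a single class label, and that each query descriptor votes for the label of its nearest word. The names nearest_word, histogram, majority_vote and the toy data are all hypothetical. The automatic selection of the number of clusters is not shown.

```python
from collections import Counter
from math import dist


def nearest_word(descriptor, dictionary):
    """Return the index of the dictionary centre closest to the descriptor."""
    return min(range(len(dictionary)),
               key=lambda i: dist(descriptor, dictionary[i]))


def histogram(descriptors, dictionary):
    """Count how many descriptors fall on each visual word."""
    counts = [0] * len(dictionary)
    for d in descriptors:
        counts[nearest_word(d, dictionary)] += 1
    return counts


def majority_vote(descriptors, dictionary, word_labels):
    """Each descriptor votes for the label of its nearest word (assumed scheme)."""
    votes = Counter(word_labels[nearest_word(d, dictionary)]
                    for d in descriptors)
    return votes.most_common(1)[0][0]


# Toy 2-D data; real descriptors (e.g. SURF or SIFT) have far more dimensions.
dictionary = [(0.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
word_labels = ["cat", "dog", "dog"]
query = [(0.5, 0.2), (9.0, 9.5), (1.0, 9.0), (0.1, 0.1), (9.5, 10.0)]

print(histogram(query, dictionary))
print(majority_vote(query, dictionary, word_labels))
```

Because the vote only needs nearest-centre lookups and counting, the same logic maps naturally onto the grouping and counting operations a database engine already provides, which is the simplification the abstract points to.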

Corresponding author

Correspondence to Marcin Gabryel.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Gabryel, M. (2016). A Bag-of-Features Algorithm for Applications Using a NoSQL Database. In: Dregvaite, G., Damasevicius, R. (eds) Information and Software Technologies. ICIST 2016. Communications in Computer and Information Science, vol 639. Springer, Cham. https://doi.org/10.1007/978-3-319-46254-7_26

  • DOI: https://doi.org/10.1007/978-3-319-46254-7_26

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-46253-0

  • Online ISBN: 978-3-319-46254-7

  • eBook Packages: Computer Science, Computer Science (R0)