An Incremental Approach to Semantic Clustering Designed for Software Visualization

  • Conference paper
Proceedings of the 2015 Federated Conference on Software Development and Object Technologies (SDOT 2015)

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 511)

Abstract

In this paper, we introduce an incremental approach to semantic clustering, designed for software visualization and inspired by the behavior of fire ant colonies. Our technique focuses on identifying equally sized yet natural clusters that give development participants better insight into the structure of a software system. We also address the performance issues of existing approaches by incrementally maintaining similarities based on global weights, using subspaces and a covariance matrix. The effectiveness of the visualization is improved by representing multiple documents with a precise medoid approximation.
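
To make the notion of a medoid concrete: the medoid of a cluster is the member whose total dissimilarity to all other members is smallest, so it can stand in for the whole cluster in a visualization. The following minimal sketch computes an exact medoid by brute force; it is not the approximation described in the paper. The function names `cosine_distance` and `medoid_index`, the choice of cosine distance, and the sample vectors are all assumptions made for illustration.

```python
from math import sqrt


def cosine_distance(a, b):
    """Cosine distance between two equal-length term-weight vectors.

    Assumption: documents are represented as plain lists of floats.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A zero vector shares no terms with anything: treat as maximally distant.
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def medoid_index(vectors, distance=cosine_distance):
    """Return the index of the exact medoid of a non-empty cluster.

    Brute force, O(n^2) distance evaluations; an approximation scheme
    would avoid this full pairwise pass.
    """
    if not vectors:
        raise ValueError("cluster must contain at least one vector")
    totals = [sum(distance(v, w) for w in vectors) for v in vectors]
    return min(range(len(vectors)), key=totals.__getitem__)


# Hypothetical cluster: three similar documents and one outlier.
cluster = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 0.0, 1.0],
    [0.8, 0.2, 0.0],
]
representative = medoid_index(cluster)
```

In this hypothetical cluster the second vector lies between the other two similar documents, so it is selected as the representative. The quadratic cost of the exact computation is one reason an approximate medoid is attractive for larger clusters.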

Acknowledgments

This work was supported by the Scientific Grant Agency of the Slovak Republic (VEGA) under grant No. VG 1/1221/12. This contribution is also a partial result of the Research & Development Operational Programme for the project Research of Methods for Acquisition, Analysis and Personalized Conveying of Information and Knowledge, ITMS 26240220039, co-funded by the ERDF.

Corresponding author

Correspondence to Juraj Vincúr.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Vincúr, J., Polášek, I. (2017). An Incremental Approach to Semantic Clustering Designed for Software Visualization. In: Janech, J., Kostolny, J., Gratkowski, T. (eds) Proceedings of the 2015 Federated Conference on Software Development and Object Technologies. SDOT 2015. Advances in Intelligent Systems and Computing, vol 511. Springer, Cham. https://doi.org/10.1007/978-3-319-46535-7_26

  • DOI: https://doi.org/10.1007/978-3-319-46535-7_26

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-46534-0

  • Online ISBN: 978-3-319-46535-7

  • eBook Packages: Engineering, Engineering (R0)
