Abstract
To developers, software often appears as a large number of modules, each containing hundreds of lines of code. It is generally not obvious which parts of the source code implement a given concern. Typically, existing documentation is outdated (if it exists at all), the system’s original architects are no longer available, or their view is outdated due to changes made by others. However, even out-of-date documentation has value, particularly if the high-level abstraction has remained valid. If it is complemented with an up-to-date, more detailed overview of the concerns in the source code, the two information sources, documentation and code, can together be used for everyday maintenance tasks. A technique that has been proposed to provide such a complementary view is Latent Semantic Indexing (LSI). LSI arose from the problem of finding relevant documents from search words and assumes that there is some underlying, or latent, structure in word usage across documents. In our case, LSI is used to identify architectural concerns, which reveal the intention of the source code, based on the words occurring in that source code. However, the multitude of levers and knobs in the various steps of the approach makes it hard to use LSI as an off-the-shelf tool. In this chapter, we describe the steps of the approach together with appropriate settings. These settings proved successful when the approach was applied to the source code of Philips Healthcare. Using the optimal settings for our case, we conducted two case studies at Philips Healthcare, in which we used the approach to complement the existing toolset for managing the system’s documentation, in order to improve the documentation of and communication about the software. Using our approach, we were able to identify multiple concerns that were not yet documented.
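The core idea of LSI described above can be illustrated with a minimal sketch. It is not the chapter's tooling or its tuned settings: the module names and their words are hypothetical, raw term counts are used without any weighting or stop-word removal, and the truncated singular value decomposition is approximated with plain power iteration so that the example needs only the Python standard library.

```python
import math
from collections import Counter

# Hypothetical modules, each reduced to the words found in its source code.
MODULES = {
    "ScanController": "scan image acquire scan protocol",
    "ImageViewer": "image render display scan",
    "PatientStore": "patient database query record",
    "ReportArchive": "database record archive query",
}

def term_document_matrix(docs):
    """Rows are terms, columns are documents, entries are raw term counts."""
    counts = [Counter(text.split()) for text in docs]
    vocab = sorted(set().union(*counts))
    return vocab, [[c[term] for c in counts] for term in vocab]

def top_eigenpairs(gram, k, iterations=500):
    """Top-k eigenpairs of a symmetric PSD matrix (power iteration + deflation).

    Assumes k does not exceed the rank of the matrix.
    """
    n = len(gram)
    work = [row[:] for row in gram]
    pairs = []
    for _ in range(k):
        v = [1.0 / math.sqrt(n)] * n
        for _ in range(iterations):
            w = [sum(work[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue for the unit vector v.
        lam = sum(v[i] * sum(work[i][j] * v[j] for j in range(n)) for i in range(n))
        pairs.append((lam, v))
        # Deflate: remove the component just found before the next round.
        for i in range(n):
            for j in range(n):
                work[i][j] -= lam * v[i] * v[j]
    return pairs

def lsi_document_vectors(docs, k):
    """Coordinates of each document in a k-dimensional latent space.

    For A = U S V^T, the Gram matrix A^T A has eigenvectors V and
    eigenvalues S^2, so document j maps to (s_c * V[j][c]) for c < k.
    """
    _, a = term_document_matrix(docs)
    n = len(docs)
    gram = [[sum(row[i] * row[j] for row in a) for j in range(n)] for i in range(n)]
    pairs = top_eigenpairs(gram, k)
    return [[math.sqrt(max(lam, 0.0)) * v[j] for lam, v in pairs] for j in range(n)]

def cosine(x, y):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    return dot / (math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y)))

if __name__ == "__main__":
    names = list(MODULES)
    vectors = lsi_document_vectors(list(MODULES.values()), k=2)
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            print(f"{names[i]} ~ {names[j]}: {cosine(vectors[i], vectors[j]):.2f}")
```

In this toy corpus the two scan-related modules share no words with the two storage-related modules, so with two latent dimensions the modules within each group should come out as highly similar and modules across groups as unrelated. Real source code is far less clean, which is why choices such as identifier splitting, term weighting, and the number of dimensions matter.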
Notes
- 1.
In computer science, a concern is a particular set of behaviours needed by a computer program. A concern can be as general as database interaction or as specific as performing a calculation. LSI makes it possible to identify the more general, architectural concerns.
- 2.
An introduction to these topics can be found at http://en.wikipedia.org/wiki/MRI.
- 3.
Acknowledgements
We would like to thank Yolanda van Dinther, Linda van Sinten, Amit Ray and Gert Jan Kamstra (Philips Healthcare - PII) for their help in the ADE case study. We would also like to thank Eric Meijer, Marcin Grodek, Mathijs Visser and Ronald Holthuizen (Philips Healthcare - MR) for their help in the Diffusion processing study.
Copyright information
© 2010 Springer Science+Business Media B.V.
About this chapter
Cite this chapter
van der Spek, P., Klusener, S., van de Laar, P. (2010). Complementing Software Documentation. In: Van de Laar, P., Punter, T. (eds) Views on Evolvability of Embedded Systems. Embedded Systems. Springer, Dordrecht. https://doi.org/10.1007/978-90-481-9849-8_3
DOI: https://doi.org/10.1007/978-90-481-9849-8_3
Publisher Name: Springer, Dordrecht
Print ISBN: 978-90-481-9848-1
Online ISBN: 978-90-481-9849-8
eBook Packages: Engineering, Engineering (R0)