Document recognition, semantics, and symbolic reasoning in reverse engineering of software
The SoftDocs project at Concordia University investigates knowledge acquisition from software documents and the analysis of that knowledge for reverse engineering of legacy systems. It focusses on the recognition and analysis of diagrams rather than natural language processing of textual components of a software document. Rigorous analysis of diagrams requires a formal semantics for them, and utilises tools for symbolic reasoning.
Data flow diagrams (DFDs) are one of many kinds of diagrams that software engineers use to help them understand complex systems. A data flow diagram represents a system as a network of processes connected by data flows. DFDs provide a useful and intuitive way of representing a system and they are easily interpreted by people. Without a formal semantics, however, DFDs cannot be used for automatic software understanding or reverse engineering. The goal of our research is to abstract meaning from existing diagrams, thereby enabling software tools, such as reverse engineering tools, to make use of existing diagrams.
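As an illustration of the "network of processes connected by data flows" view, a DFD can be held in memory as a small labelled directed graph. The sketch below is only an assumption about one possible representation, not the data model used by SoftDocs. The names `Flow`, `DFD`, `build_order_dfd`, and the order-handling example are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Flow:
    """A named data flow from one node to another (hypothetical type)."""
    name: str
    source: str
    target: str


@dataclass
class DFD:
    """A data flow diagram as a set of nodes plus a list of labelled flows."""
    processes: set = field(default_factory=set)
    flows: list = field(default_factory=list)

    def add_flow(self, name, source, target):
        # Adding a flow implicitly registers both of its end points.
        self.processes.update((source, target))
        self.flows.append(Flow(name, source, target))

    def inputs(self, process):
        """Names of the flows arriving at `process`, sorted."""
        return sorted(f.name for f in self.flows if f.target == process)

    def outputs(self, process):
        """Names of the flows leaving `process`, sorted."""
        return sorted(f.name for f in self.flows if f.source == process)


def build_order_dfd():
    """Build a tiny, invented order-handling diagram for illustration."""
    dfd = DFD()
    dfd.add_flow("order", "Customer", "ValidateOrder")
    dfd.add_flow("valid_order", "ValidateOrder", "ShipOrder")
    dfd.add_flow("rejection", "ValidateOrder", "Customer")
    return dfd
```

A graph like this captures only the structure of a diagram. It says nothing about when a process consumes or emits data, which is the gap a formal semantics is meant to fill.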
We have previously described a formal semantics for DFDs based on Milner's Calculus of Communicating Systems (CCS). The resulting formal description of a DFD can be analyzed with the aid of the Edinburgh Concurrency Workbench (CWB).
A prototype tool, II-DFD, hides the details of the formal semantic notation and the commands of CWB. II-DFD lets engineers analyze the structure and semantics of a DFD, run simulations of its behavior, and display the results graphically. Semantic analysis includes computing a DFD's state space, finding a minimal representation of that state space, deciding whether two DFDs are equivalent, and deciding whether one DFD is an abstraction of another, more detailed DFD. All of these operations are potentially useful in reverse engineering and software understanding.
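The kinds of analysis listed above can be illustrated on a small finite labelled transition system. The sketch below is not the II-DFD or CWB implementation. It assumes a plain dictionary encoding of transitions and uses strong bisimulation, computed by naive partition refinement, as the notion of equivalence. The function names and `EXAMPLE_LTS` are hypothetical.

```python
from collections import deque

# Hypothetical behaviours: A loops in/out, B is the same loop unrolled
# twice, and C performs in/out once and then stops.
EXAMPLE_LTS = {
    "a0": [("in", "a1")], "a1": [("out", "a0")],
    "b0": [("in", "b1")], "b1": [("out", "b2")],
    "b2": [("in", "b3")], "b3": [("out", "b0")],
    "c0": [("in", "c1")], "c1": [("out", "c2")], "c2": [],
}


def reachable(lts, start):
    """State space: all states reachable from `start` (breadth-first)."""
    seen = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for _action, nxt in lts.get(state, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def bisimulation_classes(lts, states):
    """Map each state to a block id of the coarsest strong bisimulation."""
    block_of = {s: 0 for s in states}
    while True:
        ids = {}
        new_block_of = {}
        for s in states:
            # A state's signature is its current block plus the set of
            # (action, successor block) pairs it can perform.
            moves = frozenset((a, block_of[t]) for a, t in lts.get(s, []))
            new_block_of[s] = ids.setdefault((block_of[s], moves), len(ids))
        if len(ids) == len(set(block_of.values())):
            return new_block_of  # no block was split: partition is stable
        block_of = new_block_of


def equivalent(lts, p, q):
    """True if states p and q are strongly bisimilar."""
    states = reachable(lts, p) | reachable(lts, q)
    classes = bisimulation_classes(lts, states)
    return classes[p] == classes[q]


def minimal_size(lts, start):
    """Number of states in the minimal representation reachable from start."""
    classes = bisimulation_classes(lts, reachable(lts, start))
    return len(set(classes.values()))
```

In this example, `equivalent(EXAMPLE_LTS, "a0", "b0")` holds because unrolling a loop does not change observable behaviour, while `"a0"` and `"c0"` differ because C eventually stops. Abstraction checks would generally need a weaker equivalence or a preorder that ignores internal actions, which this sketch does not attempt.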
Keywords: Reverse Engineering, Formal Semantic, Observable Action, Prototype Tool, Case Tool