Document recognition, semantics, and symbolic reasoning in reverse engineering of software

  • G. Butler
  • P. Grogono
  • R. Shinghal
  • I. Tjandra
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1138)


The SoftDocs project at Concordia University investigates knowledge acquisition from software documents and the analysis of that knowledge for reverse engineering of legacy systems. It focusses on the recognition and analysis of diagrams rather than on natural language processing of the textual components of a software document. Rigorous analysis of diagrams requires a formal semantics for them and utilises tools for symbolic reasoning.

Data flow diagrams (DFDs) are one of many kinds of diagrams that software engineers use to help them understand complex systems. A data flow diagram represents a system as a network of processes connected by data flows. DFDs provide a useful and intuitive way of representing a system, and they are easily interpreted by people. Without a formal semantics, however, DFDs cannot be used for automatic software understanding or reverse engineering. The goal of our research is to abstract meaning from existing diagrams, thereby enabling software tools, such as reverse engineering tools, to make use of them.
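To make the "network of processes connected by data flows" concrete, the following is a minimal sketch of a DFD held as a labelled directed graph. It is not the SoftDocs representation: the class names `Flow` and `DFD`, the method names, and the order-handling example are all hypothetical, and the sketch treats processes, external entities, and data stores uniformly as nodes.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass(frozen=True)
class Flow:
    """One data flow: a named arrow from a source node to a target node."""
    name: str
    source: str
    target: str


@dataclass
class DFD:
    """Hypothetical in-memory DFD: a set of nodes plus the flows between them."""
    nodes: Set[str] = field(default_factory=set)
    flows: List[Flow] = field(default_factory=list)

    def add_flow(self, name: str, source: str, target: str) -> None:
        # Adding a flow implicitly registers both of its end points.
        self.nodes.update((source, target))
        self.flows.append(Flow(name, source, target))

    def inputs(self, node: str) -> List[str]:
        # Names of the flows arriving at the node.
        return sorted(f.name for f in self.flows if f.target == node)

    def outputs(self, node: str) -> List[str]:
        # Names of the flows leaving the node.
        return sorted(f.name for f in self.flows if f.source == node)

    def boundary(self) -> Tuple[List[str], List[str]]:
        # Nodes with no incoming flows (sources) and no outgoing flows (sinks).
        sources = sorted(n for n in self.nodes if not self.inputs(n))
        sinks = sorted(n for n in self.nodes if not self.outputs(n))
        return sources, sinks


# Illustrative diagram (an assumption, not taken from the paper).
dfd = DFD()
dfd.add_flow("order", "Customer", "ValidateOrder")
dfd.add_flow("valid_order", "ValidateOrder", "ShipGoods")
dfd.add_flow("shipment", "ShipGoods", "Warehouse")

print(dfd.inputs("ShipGoods"))   # ['valid_order']
print(dfd.boundary())            # (['Customer'], ['Warehouse'])
```

A structure like this supports only structural questions, such as which flows enter a node. Behavioural questions require a semantics to be attached to the nodes and flows.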

We have previously described a formal semantics for DFDs based on Milner's Calculus of Communicating Systems (CCS). The resulting formal description of a DFD can be analyzed with the aid of the Edinburgh Concurrency Workbench (CWB).

A prototype tool, II-DFD, hides the details of the formal semantic notation and the commands of CWB. II-DFD provides engineers with the capability to analyze the structure and semantics of a DFD, to run simulations of the behavior of a DFD, and to display the results graphically. Semantic analysis includes computing a DFD's state space; finding a minimal representation for the state space; deciding whether two DFDs are equivalent; and deciding whether one DFD is an abstraction of another, more detailed, DFD. All of these operations are potentially useful in reverse engineering and software understanding.
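The semantic operations listed above can be illustrated on a small labelled transition system. The sketch below is a generic textbook-style illustration, not the algorithm used by II-DFD or CWB. It assumes that behaviour is given as a dictionary from states to `(action, next_state)` pairs and that "equivalent" means strong bisimilarity, which the abstract does not specify. The function names and the example systems are hypothetical.

```python
from collections import deque


def reachable(lts, start):
    """State space: every state reachable from `start` (breadth-first search)."""
    seen = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for _action, nxt in lts.get(state, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def bisimulation_classes(lts, states):
    """Naive partition refinement: map each state to the id of its class.

    `states` must be closed under transitions, as the result of `reachable` is.
    """
    block_of = {s: 0 for s in states}  # start with a single block
    while True:
        ids = {}
        refined = {}
        for s in sorted(states):
            # Signature: the current block plus the (action, target block) pairs.
            moves = frozenset((a, block_of[t]) for a, t in lts.get(s, []))
            refined[s] = ids.setdefault((block_of[s], moves), len(ids))
        if len(ids) == len(set(block_of.values())):
            return refined  # no block was split, so the partition is stable
        block_of = refined


def equivalent(lts, start_a, start_b):
    """True if the two start states are strongly bisimilar."""
    states = reachable(lts, start_a) | reachable(lts, start_b)
    blocks = bisimulation_classes(lts, states)
    return blocks[start_a] == blocks[start_b]


# Three illustrative systems in one transition table (all names are assumptions).
lts = {
    "p0": [("in", "p1")], "p1": [("out", "p0")],   # two-state in/out loop
    "q0": [("in", "q1")], "q1": [("out", "q2")],   # the same loop, unrolled once
    "q2": [("in", "q3")], "q3": [("out", "q0")],
    "r0": [("in", "r1")], "r1": [("out", "r1")],   # keeps emitting after one input
}

print(equivalent(lts, "p0", "q0"))  # True
print(equivalent(lts, "p0", "r0"))  # False
```

The number of distinct classes returned by `bisimulation_classes` is the size of a minimal representation under this notion of equivalence. For the four-state system starting at `q0`, two classes remain.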


Keywords (added by machine, not by the authors): Reverse Engineering, Formal Semantic, Observable Action, Prototype Tool, Case Tool





Copyright information

© Springer-Verlag Berlin Heidelberg 1996

Authors and Affiliations

  • G. Butler (1)
  • P. Grogono (1)
  • R. Shinghal (1)
  • I. Tjandra (1)
  1. Department of Computer Science, Concordia University, Montreal, Canada
