A Static Birthmark of Binary Executables Based on API Call Structure

  • Seokwoo Choi
  • Heewan Park
  • Hyun-il Lim
  • Taisook Han
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4846)


A software birthmark is a unique characteristic of a program that can be used to detect software theft. In this paper we propose and empirically evaluate a static birthmark of binary executables based on API call structure. The program properties employed in this birthmark are functions and the standard API calls made when those functions are executed. The API calls of a function include the API calls found explicitly in the function and in its descendants within a limited depth in the call graph. To statically identify functions, call graphs and API calls, we utilize the IDA Pro disassembler and its plug-ins. We define the similarity between two functions as the ratio of the number of common API calls to the number of all API calls. The similarity between two programs is obtained by maximum weight bipartite matching between the two programs using the function similarity matrix. To show the credibility of the proposed technique, we compare different versions of the same applications and various types of applications, including text editors, picture viewers, multimedia players, P2P applications and FTP clients. To show the resilience, we compare binary executables compiled by various compilers. The empirical results show that the similarities obtained using our birthmark sufficiently indicate the functional and structural similarities among programs.
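The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several details are assumptions made for illustration: the call graph and per-function API calls are given as plain dictionaries (standing in for what a disassembler would export), API calls are treated as sets so that function similarity becomes common calls over all calls, the matching is found by exhaustive search rather than a polynomial-time assignment algorithm, and the program score is normalised by the larger function count. All function and variable names are hypothetical.

```python
from itertools import permutations


def api_calls_within_depth(func, call_graph, direct_apis, depth):
    """Collect the API calls of `func` and of its descendants
    up to `depth` levels down the call graph."""
    apis = set(direct_apis.get(func, ()))
    if depth > 0:
        for callee in call_graph.get(func, ()):
            apis |= api_calls_within_depth(callee, call_graph, direct_apis, depth - 1)
    return apis


def function_similarity(apis_a, apis_b):
    """Ratio of common API calls to all API calls (assumption: set semantics)."""
    union = apis_a | apis_b
    if not union:
        return 0.0
    return len(apis_a & apis_b) / len(union)


def program_similarity(funcs_a, funcs_b):
    """Maximum weight bipartite matching between two lists of API-call sets,
    found by exhaustive search (only practical for a handful of functions).
    Assumption: the total weight is divided by the larger function count."""
    if not funcs_a or not funcs_b:
        return 0.0
    small, large = sorted((funcs_a, funcs_b), key=len)
    best = 0.0
    # Try every assignment of the smaller program's functions to distinct
    # functions of the larger program and keep the heaviest one.
    for perm in permutations(range(len(large)), len(small)):
        total = sum(function_similarity(small[i], large[j]) for i, j in enumerate(perm))
        best = max(best, total)
    return best / len(large)


# Hypothetical input: a tiny call graph and the API calls made directly by each function.
call_graph = {"main": ["load", "save"], "load": ["parse"]}
direct_apis = {
    "main": {"MessageBoxA"},
    "load": {"CreateFileA", "ReadFile"},
    "parse": {"lstrlenA"},
    "save": {"CreateFileA", "WriteFile"},
}

main_apis = api_calls_within_depth("main", call_graph, direct_apis, depth=1)
program = [api_calls_within_depth(f, call_graph, direct_apis, depth=1) for f in direct_apis]
```

With `depth=1`, `main_apis` contains the calls of `main`, `load` and `save` but not those of `parse`, which sits two levels down. For realistic function counts the exhaustive search would be replaced by an assignment algorithm such as the Hungarian method.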


Keywords: software piracy, software birthmark, binary analysis





Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Seokwoo Choi (1)
  • Heewan Park (1)
  • Hyun-il Lim (1)
  • Taisook Han (1)
  1. Division of Computer Science and Advanced Information Technology Research Center (AITrc), Korea Advanced Institute of Science and Technology
