Multi-domain Hierarchical Free-Sketch Recognition Using Graphical Models

Sketch-based Interfaces and Modeling

Abstract

In recent years there has been an increasing interest in sketch-based user interfaces, but the problem of robust free-sketch recognition remains largely unsolved. This chapter presents a graphical-model-based approach to free-sketch recognition that uses context to improve recognition accuracy without placing unnatural constraints on the way the user draws. Our approach uses context to guide the search for possible interpretations and uses a novel form of dynamically constructed Bayesian networks to evaluate these interpretations. An evaluation of this approach on two domains—family trees and circuit diagrams—reveals that in both domains the use of context to reclassify low-level shapes significantly reduces recognition error over a baseline system that does not reinterpret low-level classifications. Finally, we discuss an emerging technique to solve a major remaining challenge for multi-domain sketch recognition revealed by our evaluation: the problem of grouping strokes into individual symbols reliably and efficiently, without placing unnatural constraints on the user’s drawing style.
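As a rough illustration of how context can overturn a low-level shape classification, the sketch below applies Bayes' rule to a single stroke. It is not the chapter's dynamically constructed network: the hypothesis names, the `posterior` helper, and every probability are assumptions invented for this example.

```python
# Hypothetical single-stroke example; all names and numbers are invented
# for illustration and do not come from the chapter.

def posterior(prior, likelihood, observation):
    """Return P(hypothesis | observation) via Bayes' rule over discrete hypotheses."""
    unnormalized = {h: prior[h] * likelihood[h][observation] for h in prior}
    total = sum(unnormalized.values())
    return {h: p / total for h, p in unnormalized.items()}

# Assumed likelihood of the low-level classifier's output given the true shape.
likelihood = {
    "line": {"looks_straight": 0.7, "looks_curved": 0.3},
    "arc": {"looks_straight": 0.2, "looks_curved": 0.8},
}

# Without context, both shapes are taken to be equally likely a priori.
no_context = posterior({"line": 0.5, "arc": 0.5}, likelihood, "looks_curved")

# With context (say, a partially recognized arrow that needs a line here),
# the prior shifts toward "line" before the same observation is applied.
with_context = posterior({"line": 0.9, "arc": 0.1}, likelihood, "looks_curved")

best_without = max(no_context, key=no_context.get)
best_with = max(with_context, key=with_context.get)
print(best_without, best_with)
```

Under these assumed numbers the same noisy observation is read as an arc in isolation and as a line once the surrounding context is taken into account, which is the kind of reinterpretation the abstract describes.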

Notes

  1. We provide enough background on Bayesian networks to give the reader a high-level understanding of our model. To understand the details, those unfamiliar with Bayesian networks are referred to [10] for an intuitive introduction and [30] for more details.

  2. Throughout this section, t means true, and f means false.

  3. To collect these sketches we asked users to perform synthesis tasks (i.e. not to copy pre-existing diagrams) and performed no recognition while they were sketching.

References

  1. Alvarado, C.: Multi-domain sketch understanding. PhD thesis, MIT (2004)

  2. Alvarado, C., Davis, R.: Resolving ambiguities to create a natural sketch based interface. In: Proceedings of IJCAI-2001 (2001)

  3. Alvarado, C., Davis, R.: SketchREAD: A multi-domain sketch recognition engine. In: Proc. UIST (2004)

  4. Alvarado, C., Davis, R.: Dynamically constructed Bayes nets for sketch understanding. In: Proceedings of IJCAI ’05 (2005)

  5. Alvarado, C., Lazzareschi, M.: Properties of real-world digital logic diagrams. In: Proc. of the 1st International Workshop on Pen-Based Learning Technologies (PLT-07) (2007)

  6. Bishop, C.M., Svensen, M., Hinton, G.E.: Distinguishing text from graphics in on-line handwritten ink. In: IWFHR ’04: Proceedings of the Ninth International Workshop on Frontiers in Handwriting Recognition, pp. 142–147. IEEE Computer Society, Washington (2004)

  7. Blostein, D., Haken, L.: Using diagram generation software to improve diagram recognition: A case study of music notation. IEEE Transactions on Pattern Analysis and Machine Intelligence 21(11) (1999)

  8. Buxton, B.: Sketching User Experiences: Getting the Design Right and the Right Design. Morgan Kaufmann, San Mateo (2007)

  9. Caetano, A., Goulart, N., Fonseca, M., Jorge, J.: Sketching user interfaces with visual patterns. In: Proceedings of the 1st Ibero-American Symposium in Computer Graphics (SIACG02), pp. 271–279 (2002)

  10. Charniak, E.: Bayesian networks without tears: making Bayesian networks more accessible to the probabilistically unsophisticated. AI Magazine 12(4), 50–63 (1991)

  11. Cohen, P.R., Johnston, M., McGee, D., Oviatt, S., Pittman, J., Smith, I., Chen, L., Clow, J.: Quickset: Multimodal interaction for distributed applications. In: ACM Multimedia’97, pp. 31–40. ACM Press, New York (1997)

  12. Do, E.Y.L., Gross, M.D.: Drawing as a means to design reasoning. AI and Design (1996)

  13. Forbus, K.D., Usher, J., Chapman, V.: Sketching for military course of action diagrams. In: Proceedings of IUI (2003)

  14. Forsberg, A.S., Dieterich, M.K., Zeleznik, R.C.: The music notepad. In: Proceedings of UIST ’98. ACM SIGGRAPH. ACM, New York (1998)

  15. Futrelle, R.P., Nikolakis, N.: Efficient analysis of complex diagrams using constraint-based parsing. In: ICDAR-95 (International Conference on Document Analysis and Recognition), Montreal, Canada, pp. 782–790 (1995)

  16. Gennari, L., Kara, L.B., Stahovich, T.F.: Combining geometry and domain knowledge to interpret hand-drawn diagrams. Computers and Graphics: Special Issue on Pen-Based User Interfaces (2005)

  17. Getoor, L., Friedman, N., Koller, D., Pfeffer, A.: Learning probabilistic relational models. In: IJCAI, pp. 1300–1309 (1999). http://citeseer.nj.nec.com/friedman99learning.html

  18. Glessner, S., Koller, D.: Constructing flexible dynamic belief networks from first-order probabilistic knowledge bases. In: Symbolic and Quantitative Approaches to Reasoning and Uncertainty, pp. 217–226 (1995)

  19. Goldman, R.P., Charniak, E.: A language for construction of belief networks. IEEE Transactions on Pattern Analysis and Machine Intelligence 13(3) (1993)

  20. Grimson, W.E.L.: The combinatorics of heuristic search termination for object recognition in cluttered environments. IEEE Transactions on PAMI 13(9), 920–935 (1991)

  21. Gross, M.D.: The electronic cocktail napkin—a computational environment for working with design diagrams. Design Studies 17, 53–69 (1996)

  22. Gross, M., Do, E.Y.L.: Ambiguous intentions: A paper-like interface for creative design. In: Proceedings of UIST 96, pp. 183–192 (1996)

  23. Haddawy, P.: Generating Bayesian networks from probability logic knowledge bases. In: Proceedings of UAI ’94 (1994)

  24. Hammond, T., Davis, R.: Tahuti: A geometrical sketch recognition system for UML class diagrams. In: AAAI Spring Symposium on Sketch Understanding, pp. 59–68 (2002)

  25. Hammond, T., Davis, R.: LADDER: A language to describe drawing, display, and editing in sketch recognition. In: Proceedings of the 2003 International Joint Conference on Artificial Intelligence (IJCAI) (2003)

  26. Hammond, T., Davis, R.: Automatically transforming symbolic shape descriptions for use in sketch recognition. In: Proceedings of the Nineteenth National Conference on Artificial Intelligence (AAAI-04) (2004)

  27. Hammond, T., Davis, R.: Interactive learning of structural shape descriptions from automatically generated near-miss examples. In: IUI ’06: Proceedings of the 11th International Conference on Intelligent User Interfaces, pp. 210–217. ACM, New York (2006)

  28. Hse, H., Newton, A.R.: Recognition and beautification of multi-stroke symbols in digital ink. Computers and Graphics (2005)

  29. Jensen, F.V., Lauritzen, S.L., Olesen, K.G.: Bayesian updating in causal probabilistic networks by local computations. Computational Statistics Quarterly 4, 269–282 (1990)

  30. Jensen, F.V.: Bayesian Networks and Decision Graphs. Statistics for Engineering and Information Science. Springer, Berlin (2001)

  31. Kara, L.B., Stahovich, T.F.: Hierarchical parsing and recognition of hand-sketched diagrams. In: Proc. of UIST ’04 (2004)

  32. Koller, D., Pfeffer, A.: Object-oriented Bayesian networks. In: Proceedings of the Thirteenth Annual Conference on Uncertainty, Providence, RI, pp. 302–313 (1997)

  33. Labahn, G., MacLean, S., Marzouk, M., Rutherford, I., Tausky, D.: Mathbrush: An experimental pen-based math system. In: Dagstuhl Seminar Proceedings, Challenges in Symbolic Computation Software (2006)

  34. Landay, J.A., Myers, B.A.: Interactive sketching for the early stages of user interface design. In: Proceedings of CHI ’95: Human Factors in Computing Systems, pp. 43–50 (1995)

  35. Lank, E.H.: A retargetable framework for interactive diagram recognition. In: Proceedings of the Seventh International Conference on Document Analysis and Recognition (ICDAR’03) (2003)

  36. Lank, E., Thorley, J.S., Chen, S.J.S.: An interactive system for recognizing hand drawn UML diagrams. In: Proceedings for CASCON (2000)

  37. Laskey, K.B., Mahoney, S.M.: Network fragments: Representing knowledge for constructing probabilistic models. In: Proceedings of UAI ’97 (1997)

  38. Lauritzen, S.L., Spiegelhalter, D.J.: Local computations with probabilities on graphical structures and their application to expert systems. Journal of the Royal Statistical Society 50(2), 157–224 (1988)

  39. LaViola, J., Zeleznik, R.: Mathpad2: A system for the creation and exploration of mathematical sketches. ACM Transactions on Graphics (Proceedings of SIGGRAPH 2004) 23(3) (2004)

  40. Lu, W., Wu, W., Sakauchi, M.: A drawing recognition system with rule acquisition ability. In: Proceedings of the Third International Conference on Document Analysis and Recognition, vol. 1, pp. 512–515 (1995)

  41. Matsakis, N.: Recognition of handwritten mathematical expressions. Master’s thesis, Massachusetts Institute of Technology (1999)

  42. Newman, M.W., Lin, J., Hong, J.I., Landay, J.A.: DENIM: An informal Web site design tool inspired by observations of practice. Human-Computer Interaction 18(3), 259–324 (2003)

  43. Pearl, J.: Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann, San Mateo (1988)

  44. Pfeffer, A., Koller, D., Milch, B., Takusagawa, K.: SPOOK: A system for probabilistic object-oriented knowledge representation. In: Proceedings of UAI ’99, pp. 541–550 (1999)

  45. Poole, D.: Probabilistic horn abduction and Bayesian networks. Artificial Intelligence (1993)

  46. Saund, E., Fleet, D., Larner, D., Mahoney, J.: Perceptually supported image editing of text and graphics. In: Proceedings of UIST ’03 (2003)

  47. Sezgin, T.M., Davis, R.: Sketch interpretation using multiscale models of temporal patterns. IEEE Computer Graphics and Applications 27(1), 28–37 (2007). doi:10.1109/MCG.2007.17

  48. Sezgin, T.M., Stahovich, T., Davis, R.: Sketch based interfaces: Early processing for sketch understanding. In: The Proceedings of 2001 Perceptive User Interfaces Workshop (PUI’01), Orlando, FL (2001)

  49. Shilman, M., Pasula, H., Russell, S., Newton, R.: Statistical visual language models for ink parsing. In: Sketch Understanding, Papers from the 2002 AAAI Spring Symposium, pp. 126–132. AAAI Press, Stanford (2002)

  50. Shilman, M., Viola, P., Chellapilla, K.: Recognition and grouping of handwritten text in diagrams and equations. In: Proceedings of the International Workshop on Frontiers in Handwriting Recognition (IWFHR) (2004)

  51. Stahovich, T., Davis, R., Shrobe, H.: Generating multiple new designs from a sketch. Artificial Intelligence 104(1–2), 211–264 (1998)

  52. Strat, T.M., Fischler, M.A.: Context-based vision: Recognizing objects using information from both 2-d and 3-d imagery. IEEE Transactions on Pattern Analysis and Machine Intelligence 13(10), 1050–1065 (1991)

  53. Szummer, M., Qi, Y.: Contextual recognition of hand-drawn diagrams with conditional random fields. In: Proceedings of the 9th Int. Workshop on Frontiers in Handwriting Recognition (IWFHR), pp. 32–37 (2004)

  54. Tenneson, D.: Technical report on the design and algorithms of chempad. Technical report, Brown University (2005)

  55. Torralba, A., Sinha, P.: Statistical context priming for object detection. In: Proceedings of ICCV ’01, pp. 763–770 (2001)

  56. Ullman, D.G., Wood, S., Craig, D.: The importance of drawing in the mechanical design process. Computers and Graphics 14(2), 263–274 (1990)

  57. Veselova, O., Davis, R.: Perceptually based learning of shape descriptions. In: Proceedings of the Nineteenth National Conference on Artificial Intelligence (AAAI-04) (2004)

  58. Wang, X., Biswas, M., Raghupathy, S.: Addressing class distribution issues of the drawing vs writing classification in an ink stroke sequence. In: SBIM ’07: Proceedings of the 4th Eurographics Workshop on Sketch-based Interfaces and Modeling, pp. 139–146. ACM, New York (2007). doi:10.1145/1384429.1384458

  59. Weiss, Y.: Belief propagation and revision in networks with loops. Technical report, AI Memo No. 1616, CBCL Paper No. 155, Massachusetts Institute of Technology (1997)

  60. Wolin, A., Hammond, T.: Shortstraw: A simple and effective corner finder for polylines. In: Alvarado, C., Cani, M.P. (eds.) Eurographics Workshop on Sketch-Based Interfaces and Modeling (SBIM) (2008)

  61. Bayesian Network Tools in Java (BNJ). http://bnj.sourceforge.net

Acknowledgements

This work is based on my PhD thesis, supervised by Randall Davis at the Massachusetts Institute of Technology. Recent work is funded by an NSF CAREER award (IIS-0546809).

Author information

Correspondence to Christine Alvarado.

Copyright information

© 2011 Springer-Verlag London Limited

About this chapter

Cite this chapter

Alvarado, C. (2011). Multi-domain Hierarchical Free-Sketch Recognition Using Graphical Models. In: Jorge, J., Samavati, F. (eds) Sketch-based Interfaces and Modeling. Springer, London. https://doi.org/10.1007/978-1-84882-812-4_2

  • DOI: https://doi.org/10.1007/978-1-84882-812-4_2

  • Publisher Name: Springer, London

  • Print ISBN: 978-1-84882-811-7

  • Online ISBN: 978-1-84882-812-4

  • eBook Packages: Computer Science, Computer Science (R0)
