Multi-domain Hierarchical Free-Sketch Recognition Using Graphical Models

Alvarado, Christine

doi:10.1007/978-1-84882-812-4_2

Christine Alvarado

Abstract

In recent years there has been an increasing interest in sketch-based user interfaces, but the problem of robust free-sketch recognition remains largely unsolved. This chapter presents a graphical-model-based approach to free-sketch recognition that uses context to improve recognition accuracy without placing unnatural constraints on the way the user draws. Our approach uses context to guide the search for possible interpretations and uses a novel form of dynamically constructed Bayesian networks to evaluate these interpretations. An evaluation of this approach on two domains—family trees and circuit diagrams—reveals that in both domains the use of context to reclassify low-level shapes significantly reduces recognition error over a baseline system that does not reinterpret low-level classifications. Finally, we discuss an emerging technique to solve a major remaining challenge for multi-domain sketch recognition revealed by our evaluation: the problem of grouping strokes into individual symbols reliably and efficiently, without placing unnatural constraints on the user’s drawing style.
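The idea that context can overturn a low-level shape classification can be illustrated with a toy Bayesian network. The sketch below is only an illustration under invented assumptions: the three variables (a hypothesized higher-level symbol, a low-level shape, and a piece of contextual evidence), all probability values, and the names `joint` and `posterior_line` are hypothetical and are not taken from the chapter, whose networks are constructed dynamically and are considerably richer. Inference here is done by brute-force enumeration.

```python
from itertools import product

# All numbers are invented for illustration; none come from the chapter.
P_SYMBOL = {True: 0.5, False: 0.5}                # prior on a hypothesized higher-level symbol
P_LINE_GIVEN_SYMBOL = {True: 0.9, False: 0.3}     # P(shape = "line" | symbol)
P_CONTEXT_GIVEN_SYMBOL = {True: 0.8, False: 0.1}  # P(context evidence = t | symbol)
STROKE_LIKELIHOOD = {"line": 0.4, "arc": 0.6}     # P(observed stroke data | shape)
SHAPES = ("line", "arc")


def joint(symbol, shape, context):
    """Joint probability of one full assignment, with the stroke data held fixed."""
    p_line = P_LINE_GIVEN_SYMBOL[symbol]
    p_shape = p_line if shape == "line" else 1.0 - p_line
    p_ctx_true = P_CONTEXT_GIVEN_SYMBOL[symbol]
    p_ctx = p_ctx_true if context else 1.0 - p_ctx_true
    return P_SYMBOL[symbol] * p_shape * p_ctx * STROKE_LIKELIHOOD[shape]


def posterior_line(context=None):
    """P(shape = line | stroke data, optional context) by enumeration.

    With context=None the context variable is summed out, i.e. no
    contextual evidence has been observed.
    """
    contexts = (True, False) if context is None else (context,)
    scores = {shape: 0.0 for shape in SHAPES}
    for symbol, shape, ctx in product((True, False), SHAPES, contexts):
        scores[shape] += joint(symbol, shape, ctx)
    return scores["line"] / sum(scores.values())


if __name__ == "__main__":
    print("stroke data only:     ", round(posterior_line(), 3))
    print("with context observed:", round(posterior_line(True), 3))
```

With these made-up numbers, the stroke likelihood alone favors "arc" (0.6 versus 0.4), and the posterior for "line" without contextual evidence is exactly 0.5. Observing the contextual evidence raises it to 10/13, so the stroke would be reinterpreted as a line. A baseline that commits to the low-level classification could not make that correction.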

Notes

1. We provide enough background on Bayesian networks to give the reader a high-level understanding of our model. To understand the details, those unfamiliar with Bayesian networks are referred to [10] for an intuitive introduction and [30] for more details.
2. Throughout this section, t means true, and f means false.
3. To collect these sketches we asked users to perform synthesis tasks (i.e. not to copy pre-existing diagrams) and performed no recognition while they were sketching.

References

1. Alvarado, C.: Multi-domain sketch understanding. PhD thesis, MIT (2004)
2. Alvarado, C., Davis, R.: Resolving ambiguities to create a natural sketch based interface. In: Proceedings of IJCAI-2001 (2001)
3. Alvarado, C., Davis, R.: Sketchread: A multi-domain sketch recognition engine. In: Proc. UIST (2004)
4. Alvarado, C., Davis, R.: Dynamically constructed Bayes nets for sketch understanding. In: Proceedings of IJCAI ’05 (2005)
5. Alvarado, C., Lazzareschi, M.: Properties of real-world digital logic diagrams. In: Proc. of the 1st International Workshop on Pen-Based Learning Technologies (PLT-07) (2007)
6. Bishop, C.M., Svensen, M., Hinton, G.E.: Distinguishing text from graphics in on-line handwritten ink. In: IWFHR ’04: Proceedings of the Ninth International Workshop on Frontiers in Handwriting Recognition, pp. 142–147. IEEE Computer Society, Washington (2004)
7. Blostein, D., Haken, L.: Using diagram generation software to improve diagram recognition: A case study of music notation. IEEE Transactions on Pattern Analysis and Machine Intelligence 21(11) (1999)
8. Buxton, B.: Sketching User Experiences: Getting the Design Right and the Right Design. Morgan Kaufmann, San Mateo (2007)
9. Caetano, A., Goulart, N., Fonseca, M., Jorge, J.: Sketching user interfaces with visual patterns. In: Proceedings of the 1st Ibero-American Symposium in Computer Graphics (SIACG02), pp. 271–279 (2002)
10. Charniak, E.: Bayesian networks without tears: making Bayesian networks more accessible to the probabilistically unsophisticated. Artificial Intelligence 12(4), 50–63 (1991)
11. Cohen, P.R., Johnston, M., McGee, D., Oviatt, S., Pittman, J., Smith, I., Chen, L., Clow, J.: Quickset: Multimodal interaction for distributed applications. In: ACM Multimedia’97, pp. 31–40. ACM Press, New York (1997)
12. Do, E.Y.L., Gross, M.D.: Drawing as a means to design reasoning. AI and Design (1996)
13. Forbus, K.D., Usher, J., Chapman, V.: Sketching for military course of action diagrams. In: Proceedings of IUI (2003)
14. Forsberg, A.S., Dieterich, M.K., Zeleznik, R.C.: The music notepad. In: Proceedings of UIST ’98. ACM SIGGRAPH. ACM, New York (1998)
15. Futrelle, R.P., Nikolakis, N.: Efficient analysis of complex diagrams using constraint-based parsing. In: ICDAR-95 (International Conference on Document Analysis and Recognition), Montreal, Canada, pp. 782–790 (1995)
16. Gennari, L., Kara, L.B., Stahovich, T.F.: Combining geometry and domain knowledge to interpret hand-drawn diagrams. Computers and Graphics: Special Issue on Pen-Based User Interfaces (2005)
17. Getoor, L., Friedman, N., Koller, D., Pfeffer, A.: Learning probabilistic relational models. In: IJCAI, pp. 1300–1309 (1999). http://citeseer.nj.nec.com/friedman99learning.html
18. Glessner, S., Koller, D.: Constructing flexible dynamic belief networks from first-order probabilistic knowledge bases. In: Symbolic and Quantitative Approaches to Reasoning and Uncertainty, pp. 217–226 (1995)
19. Goldman, R.P., Charniak, E.: A language for construction of belief networks. IEEE Transactions on Pattern Analysis and Machine Intelligence 13(3) (1993)
20. Grimson, W.E.L.: The combinatorics of heuristic search termination for object recognition in cluttered environments. IEEE Transactions on PAMI 13(9), 920–935 (1991)
21. Gross, M.D.: The electronic cocktail napkin—a computational environment for working with design diagrams. Design Studies 17, 53–69 (1996)
22. Gross, M., Do, E.Y.L.: Ambiguous intentions: A paper-like interface for creative design. In: Proceedings of UIST 96, pp. 183–192 (1996)
23. Haddawy, P.: Generating Bayesian networks from probability logic knowledge bases. In: Proceedings of UAI ’94 (1994)
24. Hammond, T., Davis, R.: Tahuti: A geometrical sketch recognition system for UML class diagrams. In: AAAI Spring Symposium on Sketch Understanding, pp. 59–68 (2002)
25. Hammond, T., Davis, R.: LADDER: A language to describe drawing, display, and editing in sketch recognition. In: Proceedings of the 2003 International Joint Conference on Artificial Intelligence (IJCAI) (2003)
26. Hammond, T., Davis, R.: Automatically transforming symbolic shape descriptions for use in sketch recognition. In: Proceedings of the Nineteenth National Conference on Artificial Intelligence (AAAI-04) (2004)
27. Hammond, T., Davis, R.: Interactive learning of structural shape descriptions from automatically generated near-miss examples. In: IUI ’06: Proceedings of the 11th International Conference on Intelligent User Interfaces, pp. 210–217. ACM, New York (2006)
28. Hse, H., Newton, A.R.: Recognition and beautification of multi-stroke symbols in digital ink. Computers and Graphics (2005)
29. Jensen, F.V., Lauritzen, S.L., Olesen, K.G.: Bayesian updating in causal probabilistic networks by local computations. Computational Statistics Quarterly 4, 269–282 (1990)
30. Jensen, F.V.: Bayesian Networks and Decision Graphs. Statistics for Engineering and Information Science. Springer, Berlin (2001)
31. Kara, L.B., Stahovich, T.F.: Hierarchical parsing and recognition of hand-sketched diagrams. In: Proc. of UIST ’04 (2004)
32. Koller, D., Pfeffer, A.: Object-oriented Bayesian networks. In: Proceedings of the Thirteenth Annual Conference on Uncertainty, Providence, RI, pp. 302–313 (1997)
33. Labahn, G., MacLean, S., Marzouk, M., Rutherford, I., Tausky, D.: Mathbrush: An experimental pen-based math system. In: Dagstuhl Seminar Proceedings, Challenges in Symbolic Computation Software (2006)
34. Landay, J.A., Myers, B.A.: Interactive sketching for the early stages of user interface design. In: Proceedings of CHI ’95: Human Factors in Computing Systems, pp. 43–50 (1995)
35. Lank, E.H.: A retargetable framework for interactive diagram recognition. In: Proceedings of the Seventh International Conference on Document Analysis and Recognition (ICDAR’03) (2003)
36. Lank, E., Thorley, J.S., Chen, S.J.S.: An interactive system for recognizing hand drawn UML diagrams. In: Proceedings for CASCON (2000)
37. Laskey, K.B., Mahoney, S.M.: Network fragments: Representing knowledge for constructing probabilistic models. In: Proceedings of UAI ’97 (1997)
38. Lauritzen, S.L., Spiegelhalter, D.J.: Local computations with probabilities on graphical structures and their application to expert systems. Journal of the Royal Statistical Society 50(2), 157–224 (1988)
39. LaViola, J., Zeleznik, R.: Mathpad2: A system for the creation and exploration of mathematical sketches. ACM Transactions on Graphics (Proceedings of SIGGRAPH 2004) 23(3) (2004)
40. Lu, W., Wu, W., Sakauchi, M.: A drawing recognition system with rule acquisition ability. In: Proceedings of the Third International Conference on Document Analysis and Recognition, vol. 1, pp. 512–515 (1995)
41. Matsakis, N.: Recognition of handwritten mathematical expressions. Master’s thesis, Massachusetts Institute of Technology (1999)
42. Newman, M.W., Lin, J., Hong, J.I., Landay, J.A.: DENIM: An informal Web site design tool inspired by observations of practice. Human-Computer Interaction 18(3), 259–324 (2003)
43. Pearl, J.: Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann, San Mateo (1988)
44. Pfeffer, A., Koller, D., Milch, B., Takusagawa, K.: SPOOK: A system for probabilistic object-oriented knowledge representation. In: Proceedings of UAI ’99, pp. 541–550 (1999)
45. Poole, D.: Probabilistic horn abduction and Bayesian networks. Artificial Intelligence (1993)
46. Saund, E., Fleet, D., Larner, D., Mahoney, J.: Perceptually supported image editing of text and graphics. In: Proceedings of UIST ’03 (2003)
47. Sezgin, T.M., Davis, R.: Sketch interpretation using multiscale models of temporal patterns. IEEE Computer Graphics and Applications 27(1), 28–37 (2007). doi:10.1109/MCG.2007.17
48. Sezgin, T.M., Stahovich, T., Davis, R.: Sketch based interfaces: Early processing for sketch understanding. In: The Proceedings of 2001 Perceptive User Interfaces Workshop (PUI’01), Orlando, FL (2001)
49. Shilman, M., Pasula, H., Russell, S., Newton, R.: Statistical visual language models for ink parsing. In: Sketch Understanding, Papers from the 2002 AAAI Spring Symposium, pp. 126–132. AAAI Press, Stanford (2002)
50. Shilman, M., Viola, P., Chellapilla, K.: Recognition and grouping of handwritten text in diagrams and equations. In: Proceedings of the International Workshop on Frontiers in Handwriting Recognition (IWFHR) (2004)
51. Stahovich, T., Davis, R., Shrobe, H.: Generating multiple new designs from a sketch. Artificial Intelligence 104(1–2), 211–264 (1998)
52. Strat, T.M., Fischler, M.A.: Context-based vision: Recognizing objects using information from both 2-d and 3-d imagery. IEEE Transactions on Pattern Analysis and Machine Intelligence 13(10), 1050–1065 (1991)
53. Szummer, M., Qi, Y.: Contextual recognition of hand-drawn diagrams with conditional random fields. In: Proceedings of the 9th Int. Workshop on Frontiers in Handwriting Recognition (IWFHR), pp. 32–37 (2004)
54. Tenneson, D.: Technical report on the design and algorithms of chempad. Technical report, Brown University (2005)
55. Torralba, A., Sinha, P.: Statistical context priming for object detection. In: Proceedings of ICCV ’01, pp. 763–770 (2001)
56. Ullman, D.G., Wood, S., Craig, D.: The importance of drawing in the mechanical design process. Computers and Graphics 14(2), 263–274 (1990)
57. Veselova, O., Davis, R.: Perceptually based learning of shape descriptions. In: Proceedings of the Nineteenth National Conference on Artificial Intelligence (AAAI-04) (2004)
58. Wang, X., Biswas, M., Raghupathy, S.: Addressing class distribution issues of the drawing vs writing classification in an ink stroke sequence. In: SBIM ’07: Proceedings of the 4th Eurographics Workshop on Sketch-based Interfaces and Modeling, pp. 139–146. ACM, New York (2007). doi:10.1145/1384429.1384458
59. Weiss, Y.: Belief propagation and revision in networks with loops. Technical report, AI Memo No. 1616, CBCL Paper No. 155, Massachusetts Institute of Technology (1997)
60. Wolin, A., Hammond, T.: Shortstraw: A simple and effective corner finder for polylines. In: Alvarado, C., Cani, M.P. (eds.) Eurographics Workshop on Sketch-Based Interfaces and Modeling (SBIM) (2008)
61. Bayesian network tools in Java (BNJ). http://bnj.sourceforge.net

Acknowledgements

This work is based on my PhD thesis, supervised by Randall Davis at the Massachusetts Institute of Technology. Recent work is funded by an NSF CAREER award (IIS-0546809).

Author information

Authors and Affiliations

Authors

Christine Alvarado

Corresponding author

Correspondence to Christine Alvarado.

Editor information

Editors and Affiliations

Instituto Superior Técnico, Depto. Engenharia Informática, Universidade Técnica de Lisboa, Avenida Rovisco Pais, Lisboa, 1049-001, Portugal
Joaquim Jorge
Dept. Computer Science, University of Calgary, University Drive NW 2500, Calgary, T2N 1N4, Alberta, Canada
Faramarz Samavati

About this chapter

Cite this chapter

Alvarado, C. (2011). Multi-domain Hierarchical Free-Sketch Recognition Using Graphical Models. In: Jorge, J., Samavati, F. (eds) Sketch-based Interfaces and Modeling. Springer, London. https://doi.org/10.1007/978-1-84882-812-4_2

DOI: https://doi.org/10.1007/978-1-84882-812-4_2
Publisher Name: Springer, London
Print ISBN: 978-1-84882-811-7
Online ISBN: 978-1-84882-812-4
eBook Packages: Computer Science, Computer Science (R0)