Breaking reCAPTCHA: A Holistic Approach via Shape Recognition

Baecher, Paul; Büscher, Niklas; Fischlin, Marc; Milde, Benjamin

doi:10.1007/978-3-642-21424-0_5

Part of the book series: IFIP Advances in Information and Communication Technology (IFIPAICT, volume 354)

Included in the following conference series:

IFIP International Information Security Conference

Abstract

CAPTCHAs are small puzzles that should be easy for humans to solve but hard for computers. They form a security cornerstone of the modern Internet service landscape and are deployed in essentially any kind of login service to distinguish legitimate human users from automated attacks. One of the most popular and successful systems today is reCAPTCHA. Like many other systems, reCAPTCHA is based on distorted images of words; the distortion scheme evolves over time, and each change defines a new generation of the system. In this work, we analyze three recent generations of reCAPTCHA and present an algorithm that solves at least 5% of the challenges generated by these versions. We achieve this by applying a specialized variant of the shape contexts proposed by Belongie et al. to match entire words at once. To handle the ellipse-shaped distortions employed in one of the generations, we propose a machine learning algorithm that virtually eliminates the distortion. Finally, an improved shape matching strategy allows us to use word dictionaries of a reasonable size (approximately 20,000 entries).
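To make the idea of holistic matching with shape contexts more concrete, the following is a minimal sketch of the generic descriptor of Belongie et al., not the specialized variant used in the chapter. It assumes each word is already given as a list of 2D contour sample points. The function names, the bin counts (5 radial, 12 angular), the radius limits, and the nearest-descriptor cost are illustrative assumptions; the original descriptor is usually combined with an optimal point assignment, which is omitted here.

```python
import math

def shape_contexts(points, n_r=5, n_theta=12, r_min=0.125, r_max=2.0):
    """Return one normalised log-polar histogram per point.

    Assumes at least two distinct points. Bin counts and radius limits
    are illustrative defaults, not values taken from the chapter.
    """
    pair_d = [math.dist(p, q) for i, p in enumerate(points) for q in points[i + 1:]]
    mean_d = sum(pair_d) / len(pair_d)  # mean pairwise distance gives scale invariance
    log_min, log_span = math.log(r_min), math.log(r_max / r_min)
    descriptors = []
    for i, (px, py) in enumerate(points):
        hist = [0.0] * (n_r * n_theta)
        for j, (qx, qy) in enumerate(points):
            if i == j:
                continue
            # Normalised distance, clamped into the covered radius range.
            r = math.hypot(qx - px, qy - py) / mean_d
            r = min(max(r, r_min), r_max)
            r_bin = min(int((math.log(r) - log_min) / log_span * n_r), n_r - 1)
            # Angle of the other point relative to this one, in [0, 2*pi).
            theta = math.atan2(qy - py, qx - px) % (2 * math.pi)
            t_bin = min(int(theta / (2 * math.pi) * n_theta), n_theta - 1)
            hist[r_bin * n_theta + t_bin] += 1.0
        total = sum(hist)
        descriptors.append([h / total for h in hist])
    return descriptors

def chi2_cost(h1, h2):
    """Chi-squared distance between two normalised histograms."""
    return 0.5 * sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)

def shape_distance(points_a, points_b):
    """Simplified symmetric cost: mean best chi-squared match in both directions.

    This replaces the optimal assignment step with a nearest-descriptor
    search to keep the sketch short.
    """
    da, db = shape_contexts(points_a), shape_contexts(points_b)
    ab = sum(min(chi2_cost(x, y) for y in db) for x in da) / len(da)
    ba = sum(min(chi2_cost(y, x) for x in da) for y in db) / len(db)
    return 0.5 * (ab + ba)

def best_word(query_points, templates):
    """Return the dictionary key whose template shape is closest to the query."""
    return min(templates, key=lambda w: shape_distance(query_points, templates[w]))

# Toy stand-ins for contour samples of two rendered dictionary words.
templates = {
    "square": [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)],
    "line": [(x, 0) for x in range(8)],
}
# A scaled and shifted copy of the first template plays the role of the challenge.
query = [(10 + 3 * x, 5 + 3 * y) for x, y in templates["square"]]
print(best_word(query, templates))
```

Because the histograms are built from relative positions and normalised by the mean pairwise distance, the sketch is insensitive to translation and uniform scaling. It does nothing about the ellipse-shaped distortions mentioned in the abstract, which the chapter handles in a separate step.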

References

von Ahn, L., Maurer, B., McMillen, C., Abraham, D., Blum, M.: reCAPTCHA: Human-based character recognition via web security measures. Science 321(5895), 1465–1468 (2008) Cited on page 1
Belongie, S., Malik, J., Puzicha, J.: Shape context: A new descriptor for shape matching and object recognition. In: Leen, T.K., Dietterich, T.G., Tresp, V. (eds.) NIPS, pp. 831–837. MIT Press, Cambridge (2000) Cited on pages 2 and 4
Canny, J.: A computational approach to edge detection. IEEE Trans. Pattern Anal. Mach. Intell. 8, 679–698 (1986), http://portal.acm.org/citation.cfm?id=11274.11275 Cited on page 6
Chellapilla, K., Larson, K., Simard, P.Y., Czerwinski, M.: Building segmentation based human-friendly human interaction proofs (HIPs). In: Baird, H.S., Lopresti, D.P. (eds.) HIP 2005. LNCS, vol. 3517, pp. 1–26. Springer, Heidelberg (2005) Cited on page 4
Chellapilla, K., Larson, K., Simard, P.Y., Czerwinski, M.: Computers beat humans at single character recognition in reading based human interaction proofs (HIPs). In: CEAS (2005) Cited on page 4
Govindaraju, V., Krishnamurthy, R.K.: Holistic handwritten word recognition using temporal features derived from off-line images. Pattern Recognition Letters 17(5), 537–540 (1996) Cited on page 5
Houck, C.W.: Decoding recaptcha (2010), http://www.n3on.org/projects/reCAPTCHA/docs/reCAPTCHA.docx Cited on pages 3 and 6
Lavrenko, V., Rath, T.M., Manmatha, R.: Holistic word recognition for handwritten historical documents. In: DIAL, pp. 278–287. IEEE Computer Society Press, Los Alamitos (2004) Cited on page 5
Lladós, J., Roy, P.P., Rodríguez, J.A., Sánchez, G.: Word spotting in archive documents using shape contexts. In: Martí, J., Benedí, J.M., Mendonça, A.M., Serrat, J. (eds.) IbPRIA 2007. LNCS, vol. 4478, pp. 290–297. Springer, Heidelberg (2007) Cited on page 4
Madhvanath, S., Govindaraju, V.: Contour-based image preprocessing for holistic handwritten word recognition. In: ICDAR, pp. 536–539. IEEE Computer Society Press, Los Alamitos (1997) Cited on page 5
Madhvanath, S., Govindaraju, V.: The role of holistic paradigms in handwritten word recognition. IEEE Trans. Pattern Anal. Mach. Intell. 23(2), 149–164 (2001) Cited on page 5
Mori, G., Belongie, S., Malik, J.: Shape contexts enable efficient retrieval of similar shapes. In: CVPR, vol. 1, pp. 723–730. IEEE Computer Society Press, Los Alamitos (2001) Cited on page 4
Mori, G., Belongie, S.J., Malik, J.: Efficient shape matching using shape contexts. IEEE Trans. Pattern Anal. Mach. Intell. 27(11), 1832–1837 (2005) Cited on page 9
Mori, G., Malik, J.: Recognizing objects in adversarial clutter: Breaking a visual CAPTCHA. In: CVPR, vol. 1, pp. 134–144. IEEE Computer Society Press, Los Alamitos (2003) Cited on page 4
Vertanen, K.: Words in 10 lists (2010), http://www.keithv.com/software/ Cited on page 10
Wilkins, J.: Strong CAPTCHA guidelines v1.2 (2009), http://www.bitland.net/ Cited on page 3

Author information

Authors and Affiliations

Darmstadt University of Technology, Germany
Paul Baecher, Niklas Büscher, Marc Fischlin & Benjamin Milde

Editor information

Editors and Affiliations

IBM Zurich Research Laboratory, Säumerstr. 4, 8803, Rüschlikon, Switzerland
Jan Camenisch
Department of Computer Science, Karlstad University, Universitetsgatan 1, 65188, Karlstad, Sweden
Simone Fischer-Hübner
Faculty of Software and Information Science, Iwate Prefectural University, 152-52 Sugo, Takizawa, 020-0173, Takizawa-mura, Iwate, Japan
Yuko Murayama
Lucerne University of Applied Sciences and Arts, Zentralstr. 9, 6002, Lucerne, Switzerland
Armand Portmann & Carlos Rieder

Cite this paper

Baecher, P., Büscher, N., Fischlin, M., Milde, B. (2011). Breaking reCAPTCHA: A Holistic Approach via Shape Recognition. In: Camenisch, J., Fischer-Hübner, S., Murayama, Y., Portmann, A., Rieder, C. (eds) Future Challenges in Security and Privacy for Academia and Industry. SEC 2011. IFIP Advances in Information and Communication Technology, vol 354. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21424-0_5

DOI: https://doi.org/10.1007/978-3-642-21424-0_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-21423-3
Online ISBN: 978-3-642-21424-0
eBook Packages: Computer Science, Computer Science (R0)
