Segmentation of Hanacaraka Characters Using Double Projection Profile and Hough Transform

  • Conference paper
  • Published in: Big Data Technologies and Applications (BDTA 2017)

Abstract

Segmenting Hanacaraka, the ancient Javanese script and one of Indonesia's ethnic scripts from the island of Java, is difficult because the spacing between lines, the size of the characters, and the stroke thickness are all inconsistent. The inconsistencies in line spacing and character size are caused by pair letters, final vowels, and consonant letters that occur within a single phoneme. The inconsistent thickness comes from the writing style of Hanacaraka itself.

Image preprocessing is needed to obtain input without skew. To correct skewed text documents, we use the Hough transform to estimate the edges of the text area. A horizontal projection profile then segments the page into lines, and a vertical projection profile segments each line into characters.

This segmentation method gives good results on printed documents. Handwritten documents are harder to segment because their rows are uneven and very tightly spaced, which causes neighboring rows to overlap. When a line is segmented incorrectly, none of the characters on that line are segmented correctly either. This problem can be eliminated with a connectivity test. The line is first segmented together with its overlap area; any part of a character from the line above or below can then be removed because it is not connected to the main character.
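One way to picture the connectivity test is the sketch below. It is an illustration under stated assumptions rather than the paper's procedure: the line strip is assumed to be a binary image (1 = ink) that includes the overlap rows, `core_top` and `core_bottom` are hypothetical parameters marking the rows that belong unambiguously to the current line, and 8-connectivity is assumed. Ink that cannot be reached from the core rows is treated as a stray fragment from a neighboring line and erased.

```python
from collections import deque
from typing import List

Image = List[List[int]]  # assumed representation: 1 = ink, 0 = background


def keep_connected_to_core(strip: Image, core_top: int, core_bottom: int) -> Image:
    """Keep only ink that is 8-connected to ink inside rows [core_top, core_bottom)."""
    height = len(strip)
    width = len(strip[0]) if strip else 0
    keep = [[0] * width for _ in range(height)]

    # Seed the flood fill with every ink pixel in the core rows of the line.
    queue = deque(
        (y, x)
        for y in range(max(core_top, 0), min(core_bottom, height))
        for x in range(width)
        if strip[y][x]
    )
    for y, x in queue:
        keep[y][x] = 1

    # Breadth-first search over the 8 neighbors of each kept pixel.
    while queue:
        y, x = queue.popleft()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if (
                    0 <= ny < height
                    and 0 <= nx < width
                    and strip[ny][nx]
                    and not keep[ny][nx]
                ):
                    keep[ny][nx] = 1
                    queue.append((ny, nx))
    return keep
```

Strokes that extend from the core into the overlap rows survive because they are connected to the character body. A fragment that intrudes far enough to touch the core rows would also survive, and a legitimately detached mark that lies entirely in the overlap rows would be lost, so this rule is only a rough approximation.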

Acknowledgment

This research was funded by DIPA Directorate General of Research and Development Reinforcement (Direktorat Jenderal Penguatan Riset dan Pengembangan) no. SP DIPA-042.06.1.401516/2017, fiscal year 2017.

Author information

Corresponding author

Correspondence to Liliana Liliana.

Copyright information

© 2018 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering

About this paper

Cite this paper

Liliana, L., Soephomo, S.M., Budhi, G.S., Adipranata, R. (2018). Segmentation of Hanacaraka Characters Using Double Projection Profile and Hough Transform. In: Jung, J., Kim, P., Choi, K. (eds) Big Data Technologies and Applications. BDTA 2017. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 248. Springer, Cham. https://doi.org/10.1007/978-3-319-98752-1_4

  • DOI: https://doi.org/10.1007/978-3-319-98752-1_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-98751-4

  • Online ISBN: 978-3-319-98752-1

  • eBook Packages: Computer Science, Computer Science (R0)
