Improved Identification of Tweets that Mention Books: Selection of Effective Features

  • Conference paper
Digital Libraries: Knowledge, Information, and Data in an Open Access Society (ICADL 2016)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10075)

Abstract

In this paper, we assessed the effectiveness of different types of features for identifying tweets that mention books among tweets that contain strings identical to full book titles. In previous work, bag-of-words features were taken from the context of individual tweets. While performance was reasonable, we identified room for improvement in feature extraction. We proposed additional types of features: words appearing in the profiles of tweet authors, POS tags of mentioned book titles, and bibliographic elements within tweets, e.g. authors and publishers. We conducted a grid search over all combinations of these feature sets and observed performance improvements suitable for practical applications.
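As a rough illustration of what a grid search over feature-set combinations involves, the sketch below enumerates every non-empty combination of four feature sets and picks the best one under a scoring function supplied by the caller. The feature-set labels, the function names, and the decision to treat the bag-of-words baseline as one of the combinable sets are assumptions made for illustration; the abstract does not specify the classifier, the evaluation metric, or the exact search space.

```python
from itertools import combinations

# Hypothetical labels for the feature sets named in the abstract.
FEATURE_SETS = (
    "tweet_bow",               # bag-of-words from the tweet context
    "profile_words",           # words in the tweet author's profile
    "title_pos_tags",          # POS tags of the mentioned book title
    "bibliographic_elements",  # e.g. authors and publishers in the tweet
)


def feature_set_combinations(names):
    """Yield every non-empty subset of names, smallest subsets first."""
    for size in range(1, len(names) + 1):
        yield from combinations(names, size)


def grid_search(names, evaluate):
    """Return (best_combination, best_score).

    `evaluate` is a caller-supplied function (an assumption here) that
    trains and scores a classifier on the given feature-set combination.
    """
    scored = [(combo, evaluate(combo)) for combo in feature_set_combinations(names)]
    return max(scored, key=lambda pair: pair[1])


all_combos = list(feature_set_combinations(FEATURE_SETS))
```

With four feature sets this yields 2^4 - 1 = 15 combinations, so an exhaustive search stays cheap; in practice `evaluate` would wrap cross-validated training of whichever classifier is in use.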

Notes

  1. The number of local bookstores is rapidly decreasing in Japan. In 1999, there were 22,296 bookstores in Japan, and the number had fallen to 13,488 by 2015.

  2. This is the only verb among the five keywords; all the others are nouns.

  3. We utilised the readability algorithm of arc90.

  4. Taking into account the fact that the number of TMBs is not so great, recall is important. However, the lack of precision greatly hampers the mission of the system.

Acknowledgement

This work was supported by JSPS KAKENHI Grant Number JP 16K12542.

Author information

Corresponding author

Correspondence to Shuntaro Yada.

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Yada, S., Kageura, K. (2016). Improved Identification of Tweets that Mention Books: Selection of Effective Features. In: Morishima, A., Rauber, A., Liew, C. (eds) Digital Libraries: Knowledge, Information, and Data in an Open Access Society. ICADL 2016. Lecture Notes in Computer Science, vol 10075. Springer, Cham. https://doi.org/10.1007/978-3-319-49304-6_19

  • DOI: https://doi.org/10.1007/978-3-319-49304-6_19

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-49303-9

  • Online ISBN: 978-3-319-49304-6

  • eBook Packages: Computer Science, Computer Science (R0)
