Measuring Peculiarity of Text Using Relation between Words on the Web

Nakabayashi, Takeru; Yumoto, Takayuki; Nii, Manabu; Takahashi, Yutaka; Sumiya, Kazutoshi

doi:10.1007/978-3-642-13654-2_13

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6102)

Included in the following conference series:

International Conference on Asian Digital Libraries

Abstract

We define the peculiarity of text as a metric of information credibility: higher peculiarity means lower credibility. We extract the theme word and the characteristic words from a text and check whether there is a subject-description relation between them. The peculiarity is defined using the ratio of characteristic words that have a subject-description relation with the theme word. We evaluate the extent to which peculiarity can be used to judge credibility by classifying texts from Wikipedia and Uncyclopedia according to their peculiarity.
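
The abstract does not give the exact formula, so the following is only a minimal sketch of one way such a ratio-based score could be computed. It assumes that peculiarity is the fraction of characteristic words for which no subject-description relation with the theme word is found, so that a higher score corresponds to lower credibility. The function names `peculiarity` and `toy_has_relation`, and the hand-written relation table, are hypothetical stand-ins; in particular, the relation check here is a simple lookup and not the authors' Web-based method.

```python
from typing import Callable, Iterable

# Hypothetical toy table of (theme word, characteristic word) pairs that are
# treated as having a subject-description relation. This is illustrative data
# only; it stands in for whatever relation check is actually used.
TOY_RELATIONS = {
    ("cat", "mammal"),
    ("cat", "pet"),
}


def toy_has_relation(theme: str, word: str) -> bool:
    """Return True if the toy table lists a relation between theme and word."""
    return (theme, word) in TOY_RELATIONS


def peculiarity(
    theme: str,
    characteristic_words: Iterable[str],
    has_relation: Callable[[str, str], bool],
) -> float:
    """Assumed definition: share of characteristic words NOT related to the theme.

    Returns a value in [0, 1]; 0 means every characteristic word is related
    to the theme word, 1 means none is.
    """
    words = list(characteristic_words)
    if not words:
        raise ValueError("at least one characteristic word is required")
    related = sum(1 for w in words if has_relation(theme, w))
    return 1.0 - related / len(words)


# Two of the four characteristic words are related to the theme word.
score = peculiarity("cat", ["mammal", "pet", "spaceship", "tax"], toy_has_relation)
```

Passing the relation check in as a function keeps the scoring step separate from how the relation is detected, so a different detector could be swapped in without changing the ratio computation.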

Author information

Authors and Affiliations

School of Engineering, University of Hyogo, 2167 Shosha, Himeji, Hyogo, 671-2280, Japan
Takeru Nakabayashi
Graduate School of Engineering, University of Hyogo, 2167 Shosha, Himeji, Hyogo, 671-2280, Japan
Takayuki Yumoto, Manabu Nii & Yutaka Takahashi
School of Human Science and Environment, University of Hyogo, 1-1-12 Shinzaike-honcho, Himeji, Hyogo, 670-0092, Japan
Kazutoshi Sumiya

Editor information

Editors and Affiliations

University of Technology, Sydney, PO Box 123, 2007, Broadway, NSW, Australia
Gobinda Chowdhury
Nanyang Technological University, 31 Nanyang Link, 637718, Singapore
Chris Koo
The University of Queensland, Brisbane, QLD 4072, Australia
Jane Hunter

About this paper

Cite this paper

Nakabayashi, T., Yumoto, T., Nii, M., Takahashi, Y., Sumiya, K. (2010). Measuring Peculiarity of Text Using Relation between Words on the Web. In: Chowdhury, G., Koo, C., Hunter, J. (eds) The Role of Digital Libraries in a Time of Global Change. ICADL 2010. Lecture Notes in Computer Science, vol 6102. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13654-2_13

DOI: https://doi.org/10.1007/978-3-642-13654-2_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-13653-5
Online ISBN: 978-3-642-13654-2
eBook Packages: Computer Science, Computer Science (R0)
