Abstract
We define the peculiarity of text as a metric of information credibility: the higher the peculiarity, the lower the credibility. We extract a theme word and characteristic words from the text and check whether a subject-description relation holds between the theme word and each characteristic word. Peculiarity is defined using the ratio of subject-description relations that hold between the theme word and the characteristic words. We evaluate the extent to which peculiarity can be used to judge credibility by classifying text from Wikipedia and Uncyclopedia according to its peculiarity.
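The abstract does not give the exact formula, so the following is only a minimal sketch of one way such a ratio could be turned into a score. It assumes that peculiarity is one minus the fraction of characteristic words that have a subject-description relation with the theme word, which matches the stated direction (fewer relations, higher peculiarity) but is not confirmed by the source. The names `peculiarity`, `toy_relation`, and `KNOWN_RELATIONS` are hypothetical, and the lookup table merely stands in for whatever relation check the paper actually performs.

```python
from typing import Callable, Iterable


def peculiarity(
    theme_word: str,
    characteristic_words: Iterable[str],
    has_relation: Callable[[str, str], bool],
) -> float:
    """Share of characteristic words with no subject-description relation
    to the theme word (assumed form: 1 - relation ratio)."""
    words = list(characteristic_words)
    if not words:
        raise ValueError("at least one characteristic word is required")
    # Count the characteristic words for which the relation holds.
    related = sum(1 for w in words if has_relation(theme_word, w))
    return 1.0 - related / len(words)


# Hypothetical toy relation table; a real system would need its own
# way of checking the subject-description relation.
KNOWN_RELATIONS = {("cat", "mammal"), ("cat", "whiskers"), ("cat", "pet")}


def toy_relation(theme: str, word: str) -> bool:
    """Return True if the (theme, word) pair is in the toy table."""
    return (theme, word) in KNOWN_RELATIONS


score = peculiarity("cat", ["mammal", "whiskers", "spaceship", "pet"], toy_relation)
print(score)  # 0.25: one of four characteristic words is unrelated
```

Under this assumed definition, text could then be classified by comparing the score against a threshold; the abstract does not say how the classification is actually carried out.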
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Nakabayashi, T., Yumoto, T., Nii, M., Takahashi, Y., Sumiya, K. (2010). Measuring Peculiarity of Text Using Relation between Words on the Web. In: Chowdhury, G., Koo, C., Hunter, J. (eds) The Role of Digital Libraries in a Time of Global Change. ICADL 2010. Lecture Notes in Computer Science, vol 6102. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13654-2_13
DOI: https://doi.org/10.1007/978-3-642-13654-2_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-13653-5
Online ISBN: 978-3-642-13654-2
eBook Packages: Computer Science, Computer Science (R0)