Abstract
With the spread of the Internet, information about our daily life and residential regions is increasingly available on the WWW (World Wide Web). That is to say, there are many Web pages whose content is ‘local’ and may interest only the residents of a narrow region. Conventional information retrieval systems and search engines, such as Google[1] and Yahoo[2], are very useful in helping users find interesting information. However, it is not yet easy to find or exclude ‘local’ information about our daily life and residential region. In this paper, we propose a localness-filter for searched Web pages, which can discover and exclude information about our daily life and residential region from the searched Web pages. We compute the localness degree of a Web page by 1) estimating its region dependence, that is, the frequency of geographical words and the content coverage of the Web page, and 2) estimating the ubiquitousness of its topic, in other words, whether it is usual information that appears every day and everywhere in our daily life.
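The two estimates named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy gazetteer, the bounding-box measure of content coverage, the Jaccard-based ubiquitousness score, the `max_area` parameter, and the averaging used to combine the scores are all assumptions made only for illustration.

```python
import re

# Hypothetical gazetteer: place name -> (latitude, longitude).
# The coordinates are rough illustrative values, not data from the paper.
GAZETTEER = {
    "kobe": (34.69, 135.20),
    "osaka": (34.69, 135.50),
    "kyoto": (35.01, 135.77),
}


def tokenize(text):
    """Lower-case the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def geo_word_frequency(tokens, gazetteer):
    """Fraction of tokens that are known geographical words."""
    if not tokens:
        return 0.0
    hits = sum(1 for t in tokens if t in gazetteer)
    return hits / len(tokens)


def content_coverage(tokens, gazetteer):
    """Bounding-box area (squared degrees) of the places mentioned.

    Returns None when no known place is mentioned. Using a bounding box
    is an assumption of this sketch.
    """
    points = [gazetteer[t] for t in tokens if t in gazetteer]
    if not points:
        return None
    lats = [p[0] for p in points]
    lons = [p[1] for p in points]
    return (max(lats) - min(lats)) * (max(lons) - min(lons))


def region_dependence(text, gazetteer, max_area=1.0):
    """Higher when geographical words are frequent and cover a narrow area."""
    tokens = tokenize(text)
    area = content_coverage(tokens, gazetteer)
    if area is None:
        return 0.0
    narrowness = max(0.0, 1.0 - area / max_area)
    return geo_word_frequency(tokens, gazetteer) * narrowness


def ubiquitousness(text, reference_pages):
    """Mean Jaccard similarity between the page and a reference collection.

    A page whose vocabulary resembles many other pages is treated as
    'usual' information. This measure is an assumption of the sketch.
    """
    words = set(tokenize(text))
    if not words or not reference_pages:
        return 0.0
    sims = []
    for page in reference_pages:
        other = set(tokenize(page))
        sims.append(len(words & other) / len(words | other))
    return sum(sims) / len(sims)


def localness_degree(text, gazetteer, reference_pages):
    """Combine both estimates by simple averaging (an assumed combination)."""
    return 0.5 * (region_dependence(text, gazetteer)
                  + ubiquitousness(text, reference_pages))
```

Under these assumptions, a page that mentions a single town scores higher on region dependence than one whose place names span several cities, and a filter could drop pages whose `localness_degree` exceeds a chosen threshold.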
This research is partly supported by the Japanese Ministry of Education, Culture, Sports, Science and Technology under Grant-in-Aid for Scientific Research on “New Web Retrieval Services Based on Discovery of Web Semantic Structures”, No. 14019048, and “Multimodal Information Retrieval, Presentation, and Generation of Broadcast Contents for Mobile Environments”, No. 14208036.
References
Google. http://www.google.com/, 2002.
Yahoo! Japan. http://www.yahoo.co.jp/, 2002.
Qiang Ma, Kazutoshi Sumiya, and Katsumi Tanaka. Information filtering based on time-series features for data dissemination systems (in Japanese). IPSJ TOD7, 41(SIG6(TOD7)):46–57, 2000.
Qiang Ma, Shinya Miyazaki, and Katsumi Tanaka. Webscan: Discovering and notifying important changes of web sites. Proc. of DEXA 2001, LNCS 2113, pages 587–598, 2001.
Shinya Miyazaki, Qiang Ma, and Katsumi Tanaka. Webscan: Content-based change discovery and broadcast-notification of web sites (in Japanese). IPSJ TOD10, 42(SIG8(TOD10)):96–107, 2001.
Yahoo!regional. http://local.yahoo.co.jp/, 2002.
MACHIgoo. http://machi.goo.ne.jp/, 2002.
Chiyako Matsumoto, Qiang Ma, and Katsumi Tanaka. Web information retrieval based on the localness degree. Proc. of DEXA 2002, LNCS 2453, pages 172–181, 2002.
Antonin Guttman. R-trees: A dynamic index structure for spatial searching. Proc. ACM SIGMOD Conference on Management of Data, 14(2):47–57, 1984.
Carlo Zaniolo, Stefano Ceri, Christos Faloutsos, Richard T. Snodgrass, V. S. Subrahmanian, and Roberto Zicari. Advanced Database Systems. Morgan Kaufmann, 1997.
Nobuyuki Miura, Katsumi Takahashi, Seiji Yokoji, and Kenichi Shima. Location oriented information integration — mobile info search 2 experiment — (in Japanese). The 57th National Convention of IPSJ, 3: 637–638, 1998.
KOKONONET. http://www.kokono.net/, 2002.
Orkut Buyukkokten, Junghoo Cho, Hector Garcia-Molina, Luis Gravano, and Narayanan Shivakumar. Exploiting geographical location information of web pages. Proc. of WebDB (Informal Proceedings), pages 91–96, 1999.
Daniel Egnor. Google programming contest, 2002.
T. Takeda. The latitude / longitude position database of all-prefectures cities, towns and villages in Japan, 2000.
Google Web API. http://www.google.com/apis/, 2002.
Microsoft.net. http://www.microsoft.com/net/, 2002.
Chasen. http://chasen.aist-nara.ac.jp/chasen/whatis.html.en, 2002.
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ma, Q., Matsumoto, C., Tanaka, K. (2003). A Localness-Filter for Searched Web Pages. In: Zhou, X., Orlowska, M.E., Zhang, Y. (eds) Web Technologies and Applications. APWeb 2003. Lecture Notes in Computer Science, vol 2642. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36901-5_53
DOI: https://doi.org/10.1007/3-540-36901-5_53
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-02354-8
Online ISBN: 978-3-540-36901-1
eBook Packages: Springer Book Archive