
Computing Geographical Serving Area Based on Search Logs and Website Categorization

  • Conference paper
Database and Expert Systems Applications (DEXA 2007)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4653)


Abstract

Knowing the geographical serving area of web resources is important for many web applications. Here, the serving area is the geographical distribution of the online users who are interested in a given web site. In this paper, we propose a set of novel methods to detect the serving area of web resources by analyzing search engine logs. We use the search logs in two ways. First, we extract the IP locations of users to generate the geographical distribution of the users who share an interest in a web site. Second, we treat the query terms entered by users as their knowledge about a web site. To increase confidence and to cover new sites in real-time applications, we also propose a categorization system for local web sites, together with a novel method that detects the serving area by categorizing web content: each category is assigned a radius derived from previous logs. In our experiments, we evaluate all three algorithms. The results show that the approach based on query terms is superior to the one based on IP locations, since search queries for local sites tend to include location words while IP locations are sometimes erroneous. The approach based on categorization is efficient for sites of known categories and is useful for small sites that lack a sufficient number of query log entries.
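As a rough illustration of the first, IP-based idea, the sketch below groups search-log clicks by site and summarizes where the clicking users are located. It is not the authors' algorithm: the input format, the function names `haversine_km` and `serving_area`, the use of a plain lat/lon average as the center, and the `coverage` parameter are all assumptions made for this example. The averaging step is only reasonable for areas that are small and do not cross the antimeridian.

```python
import math
from collections import defaultdict

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def serving_area(clicks, coverage=0.8):
    """Estimate a (center, radius_km) pair per site.

    clicks: iterable of (site, (lat, lon)) pairs, where the location is
    assumed to come from an IP-to-location lookup (hypothetical input).
    coverage: fraction of users the radius should enclose (assumed knob).
    """
    by_site = defaultdict(list)
    for site, loc in clicks:
        by_site[site].append(loc)

    result = {}
    for site, locs in by_site.items():
        # Simple average of coordinates as the center of the user distribution.
        center = (
            sum(lat for lat, _ in locs) / len(locs),
            sum(lon for _, lon in locs) / len(locs),
        )
        # Radius = distance to the k-th nearest user, k chosen by coverage.
        dists = sorted(haversine_km(center, p) for p in locs)
        k = max(1, math.ceil(coverage * len(dists)))
        result[site] = (center, dists[k - 1])
    return result


if __name__ == "__main__":
    log = [
        ("example-local-site.test", (47.6, -122.3)),
        ("example-local-site.test", (47.7, -122.2)),
        ("example-local-site.test", (47.5, -122.4)),
    ]
    for site, (center, radius) in serving_area(log).items():
        print(site, center, round(radius, 1))
```

A small radius suggests a locally oriented site, while a large one suggests a broad audience. Because IP locations can be wrong, a percentile-based radius like this one is less sensitive to a few misplaced users than the maximum distance would be.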

This work was done when the first author was visiting Microsoft Research Asia.



Editor information

Roland Wagner, Norman Revell, Günther Pernul

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Zhang, Q., Xie, X., Wang, L., Yue, L., Ma, WY. (2007). Computing Geographical Serving Area Based on Search Logs and Website Categorization. In: Wagner, R., Revell, N., Pernul, G. (eds) Database and Expert Systems Applications. DEXA 2007. Lecture Notes in Computer Science, vol 4653. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74469-6_79


  • DOI: https://doi.org/10.1007/978-3-540-74469-6_79

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-74467-2

  • Online ISBN: 978-3-540-74469-6

  • eBook Packages: Computer Science, Computer Science (R0)
