FCHC: A Social Semantic Focused Crawler

  • Conference paper
Advances in Computing and Communications (ACC 2011)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 191)

Abstract

The World Wide Web is a huge collection of web pages to which new information is added every second. Searching for and retrieving relevant web resources is a protracted task, and finding resources relevant to a given topic without any explicit or implicit feedback makes the process more intricate. In such scenarios, focused crawling provides a better alternative to generic crawling, especially when topic-specific or personalized information is required. This paper presents FCHC, a crawling approach that uses human cognition for a focused crawl on a social bookmarking site, initiated with seeds retrieved from a search engine. It uses social bookmark tags as implicit feedback to compute eResource relevance and the Vector Space Model to rank the retrieved eResources. A well-established metric, harvest ratio, is used to compare the results of the proposed approach with those of a semantic focused crawler and a classic focused crawler. The analysis of the results shows that the social semantic focused crawler performs better than the semantic and classic focused crawlers.
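
The two quantities named in the abstract, Vector Space Model ranking and harvest ratio, can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not give the tag weighting scheme or the relevance cut-off, so raw tag counts, cosine similarity, the threshold value, the function names, and the example URLs and tags are all assumptions made for illustration. Harvest ratio is taken here in its usual sense, the fraction of crawled pages judged relevant to the topic.

```python
import math
from collections import Counter


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_tags(topic_terms, resources):
    """Rank resources by similarity between the topic and their bookmark tags.

    Assumption: each resource is represented only by its social bookmark
    tags, weighted by raw count (no TF-IDF).
    """
    topic_vec = Counter(topic_terms)
    scored = [
        (url, cosine_similarity(topic_vec, Counter(tags)))
        for url, tags in resources.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def harvest_ratio(scores, threshold):
    """Fraction of crawled pages whose relevance score meets the threshold."""
    if not scores:
        return 0.0
    relevant = sum(1 for s in scores if s >= threshold)
    return relevant / len(scores)


# Hypothetical crawl result: URL -> tags attached by bookmarking users.
resources = {
    "http://example.org/a": ["crawler", "focused", "web"],
    "http://example.org/b": ["cooking", "recipes"],
    "http://example.org/c": ["crawler", "search"],
}
ranked = rank_by_tags(["focused", "crawler"], resources)
ratio = harvest_ratio([score for _, score in ranked], threshold=0.4)

for url, score in ranked:
    print(f"{score:.3f}  {url}")
print(f"harvest ratio: {ratio:.3f}")
```

In this toy data, two of the three pages share tags with the topic, so the harvest ratio at the assumed threshold of 0.4 is two thirds. A higher threshold makes the metric stricter, which is why crawlers should be compared at the same cut-off.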

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Thukral, A., Mendiratta, V., Behl, A., Banati, H., Bedi, P. (2011). FCHC: A Social Semantic Focused Crawler. In: Abraham, A., Lloret Mauri, J., Buford, J.F., Suzuki, J., Thampi, S.M. (eds) Advances in Computing and Communications. ACC 2011. Communications in Computer and Information Science, vol 191. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-22714-1_29

  • DOI: https://doi.org/10.1007/978-3-642-22714-1_29

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-22713-4

  • Online ISBN: 978-3-642-22714-1

  • eBook Packages: Computer Science, Computer Science (R0)
