SemCrawl: Framework for Crawling Ontology Annotated Web Documents for Intelligent Information Retrieval

  • Conference paper

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 321)

Abstract

The Web is regarded as the largest pool of information, and the search engine as the tool for extracting information from it. Because the Web is unstructured, however, it is becoming difficult to find relevant information with search engines. Future search engines will not rely merely on keyword search; instead, they will interpret the meaning of web content to produce relevant results. Designing such tools requires extracting, from that content, information that supports logic and inference. This paper discusses the conceptual differences between the traditional Web and the Semantic Web and explains the need for crawling Semantic Web documents. It proposes a framework for crawling ontologies and Semantic Web documents. The framework is implemented and validated on different collections of web pages. The system extracts heterogeneous documents from the Web, filters the ontology-annotated web pages, and extracts triples from them, which supports better inferential capability.
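
The filter-then-extract pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation, which the abstract does not detail. It assumes that documents have already been fetched (the in-memory `DOCS` dictionary with example.org URLs is a hypothetical stand-in for the crawl step), that "ontology annotated" means a well-formed RDF/XML document with an `rdf:RDF` root, and that triples appear only as flat `rdf:Description` elements. The function names are illustrative.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hypothetical stand-in for fetched pages: URL -> document text.
DOCS = {
    "http://example.org/a.rdf": (
        '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" '
        'xmlns:dc="http://purl.org/dc/elements/1.1/">'
        '<rdf:Description rdf:about="http://example.org/paper">'
        "<dc:title>SemCrawl</dc:title>"
        '<dc:creator rdf:resource="http://example.org/people/vd"/>'
        "</rdf:Description>"
        "</rdf:RDF>"
    ),
    "http://example.org/b.html": "<html><body><p>No annotations.</p></body></html>",
}


def is_ontology_annotated(text):
    """Filter step (assumption): keep documents whose XML root is rdf:RDF."""
    try:
        root = ET.fromstring(text)
    except ET.ParseError:
        return False
    return root.tag == f"{{{RDF_NS}}}RDF"


def expand(tag):
    """Turn ElementTree's '{namespace}local' tag into a full IRI string."""
    if tag.startswith("{"):
        namespace, local = tag[1:].split("}", 1)
        return namespace + local
    return tag


def extract_triples(text):
    """Return (subject, predicate, object) tuples from flat rdf:Description elements."""
    root = ET.fromstring(text)
    triples = []
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # The object is either a resource IRI or a literal text value.
            obj = prop.get(f"{{{RDF_NS}}}resource") or (prop.text or "").strip()
            triples.append((subject, expand(prop.tag), obj))
    return triples


def crawl(docs):
    """Filter the documents, then extract triples from those that pass."""
    return {
        url: extract_triples(text)
        for url, text in docs.items()
        if is_ontology_annotated(text)
    }


if __name__ == "__main__":
    for url, triples in crawl(DOCS).items():
        for triple in triples:
            print(url, triple)
```

A production crawler would more likely rely on a full RDF parser, since real RDF/XML allows nested descriptions, typed nodes, and other serializations that this sketch does not handle.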



Corresponding author

Correspondence to Vandana Dhingra.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Dhingra, V., Bhatia, K.K. (2015). SemCrawl: Framework for Crawling Ontology Annotated Web Documents for Intelligent Information Retrieval. In: Buyya, R., Thampi, S. (eds) Intelligent Distributed Computing. Advances in Intelligent Systems and Computing, vol 321. Springer, Cham. https://doi.org/10.1007/978-3-319-11227-5_19

  • DOI: https://doi.org/10.1007/978-3-319-11227-5_19

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-11226-8

  • Online ISBN: 978-3-319-11227-5

  • eBook Packages: Engineering, Engineering (R0)
