All Domain Hidden Web Exposer Ontologies: A Unified Approach for Excavating the Web to Unhide Deep Web

  • Manpreet Singh Sehgal
  • Jay Shankar Prasad
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 851)

Abstract

Knowledge experts have stored vast amounts of knowledge in repositories of many kinds. Paper repositories are being digitized, and most already have been under initiatives taken by government agencies from time to time. Various search engines have been designed and implemented to search and index such information from web pages. For information held inside databases, search query interfaces have been designed, and a user must fill these interfaces with the criteria specific to their query. This route to database-driven data is time-consuming because one needs to know the address of each such query interface. This paper addresses the issue by proposing a domain-independent algorithm that fills these interfaces automatically, resulting in improved performance and a better digital experience for the knowledge seeker. The proposed algorithm is evaluated against performance metrics, and encouraging results have been obtained.

Keywords

Hidden web, Ontologies, Information extraction, Domain independent

Copyright information

© Springer Nature Singapore Pte Ltd. 2019

Authors and Affiliations

  1. MVN University, Palwal, India