Finding, Extracting, and Building Academic Linked Data

  • Conference paper
Semantic Web and Web Science

Part of the book series: Springer Proceedings in Complexity (SPCOM)

Abstract

This paper addresses the problem of finding and extracting academic information from conference Web pages, organizing that information as ontologies, and generating academic linked data by matching these ontologies. The main contributions are as follows. (1) A topic-crawling method and a lightweight crawling method based on search engines are presented; crawling seeds, the filtering of relevant websites, and the crawl update strategy are discussed. (2) A new vision-based approach for extracting academic information is proposed. It first segments Web pages into text blocks and then classifies these text blocks into predefined categories. The initial classification results are improved by post-processing. Finally, academic information is extracted from the classified text blocks. (3) A global ontology is used to describe the background domain knowledge, and the academic information extracted from each website is organized as a local ontology. Finally, academic linked data is generated by matching all local ontologies.
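The segment, classify, and post-process steps of contribution (2) can be illustrated with a minimal sketch. It is not the authors' method: the abstract gives no features, categories, or post-processing rules, so everything below is an assumption. Blank-line splitting stands in for vision-based segmentation, the category names and cue words in `CUES` are hypothetical, and the neighbour-agreement rule in `smooth` is just one plausible form of post-processing.

```python
import re
from typing import List, Tuple

# Hypothetical categories and cue words; the paper's predefined
# categories and classification features are not given in the abstract.
CUES = {
    "date": re.compile(r"\b(deadline|submission|notification)\b", re.I),
    "committee": re.compile(r"\b(chair|committee|organi[sz]er)\b", re.I),
    "topic": re.compile(r"\b(topics?|tracks?)\b", re.I),
}


def segment(page_text: str) -> List[str]:
    """Split plain text into blocks at blank lines.

    Assumption: this stands in for visual (layout-based) segmentation.
    """
    blocks = re.split(r"\n\s*\n", page_text)
    return [b.strip() for b in blocks if b.strip()]


def classify(block: str) -> str:
    """Return the first category whose cue pattern matches, else 'other'."""
    for label, pattern in CUES.items():
        if pattern.search(block):
            return label
    return "other"


def smooth(labels: List[str]) -> List[str]:
    """Post-process: relabel an 'other' block whose two neighbours agree."""
    out = list(labels)
    for i in range(1, len(labels) - 1):
        if labels[i] == "other" and labels[i - 1] == labels[i + 1] != "other":
            out[i] = labels[i - 1]
    return out


def extract(page_text: str) -> List[Tuple[str, str]]:
    """Run segment -> classify -> smooth and pair each block with its label."""
    blocks = segment(page_text)
    labels = smooth([classify(b) for b in blocks])
    return list(zip(labels, blocks))
```

Calling `extract` on a page's text returns `(label, block)` pairs. A block with no cue words, such as a bare list of names between two committee blocks, inherits the label of its neighbours, which is the kind of correction a post-processing pass is meant to make.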

Acknowledgments

This work is supported by the NSF of China (61003156 and 61003055) and the Natural Science Foundation of Jiangsu Province (BK2009136 and BK2011335).

Author information

Corresponding author

Correspondence to Peng Wang.

Copyright information

© 2013 Springer Science+Business Media New York

About this paper

Cite this paper

Wang, P., Zhang, X. (2013). Finding, Extracting, and Building Academic Linked Data. In: Li, J., Qi, G., Zhao, D., Nejdl, W., Zheng, HT. (eds) Semantic Web and Web Science. Springer Proceedings in Complexity. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-6880-6_3

  • DOI: https://doi.org/10.1007/978-1-4614-6880-6_3

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-1-4614-6879-0

  • Online ISBN: 978-1-4614-6880-6

  • eBook Packages: Computer Science, Computer Science (R0)
