Analysing Structured Scholarly Data Embedded in Web Pages

  • Conference paper
Semantics, Analytics, Visualization. Enhancing Scholarly Data (SAVE-SD 2016)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9792)

Abstract

Web pages increasingly embed structured data in the form of microdata, microformats and RDFa. Through efforts such as schema.org, such embedded markup has become prevalent, with current studies estimating adoption by about 26% of all web pages. As with their early adoption of Linked Data principles, publishers, libraries and other providers of bibliographic data have been among the early adopters of embedded markup, providing an unprecedented source of structured data about scholarly works. Such data, however, is fundamentally different from traditional Linked Data: it is very sparsely linked and contains a large number of coreferences and redundant statements. So far, the scale and nature of embedded scholarly data on the Web have not been investigated. In this work, we provide a study of embedded scholarly data to answer research questions about the depth, the syntactic and semantic characteristics, and the distribution of the extracted data, thereby investigating challenges and opportunities for using embedded data as a structured knowledge graph of scholarly information.
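
To make the notion of embedded markup concrete, the following is a minimal sketch of how schema.org microdata might be read from a page using only the Python standard library. It is an illustration under stated assumptions, not the extraction pipeline used in the paper: the HTML fragment, the class `MicrodataParser` and the function `extract_microdata` are all hypothetical, and the parser handles only flat, non-nested items.

```python
from html.parser import HTMLParser

# Hypothetical page fragment carrying schema.org microdata (assumption,
# not taken from the paper's dataset).
SAMPLE_HTML = """
<div itemscope itemtype="http://schema.org/ScholarlyArticle">
  <h1 itemprop="name">An Example Paper Title</h1>
  <span itemprop="author">A. Author</span>
  <meta itemprop="datePublished" content="2016">
</div>
"""


class MicrodataParser(HTMLParser):
    """Collects item types and (itemprop, value) pairs from flat microdata."""

    def __init__(self):
        super().__init__()
        self.item_types = []
        self.properties = []
        self._open_prop = None  # itemprop still waiting for its text content

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "itemtype" in attrs:
            self.item_types.append(attrs["itemtype"])
        prop = attrs.get("itemprop")
        if prop is None:
            return
        if "content" in attrs:
            # Value is given directly in the attribute (e.g. <meta>).
            self.properties.append((prop, attrs["content"]))
        else:
            # Value is the text that follows inside the element.
            self._open_prop = prop

    def handle_data(self, data):
        text = data.strip()
        if self._open_prop and text:
            self.properties.append((self._open_prop, text))
            self._open_prop = None


def extract_microdata(html):
    """Return (item types, property/value pairs) found in the HTML string."""
    parser = MicrodataParser()
    parser.feed(html)
    parser.close()
    return parser.item_types, parser.properties


if __name__ == "__main__":
    types, props = extract_microdata(SAMPLE_HTML)
    print(types)
    for name, value in props:
        print(name, "=", value)
```

Each extracted pair corresponds to one statement about the described item. Pages that describe the same work independently would yield overlapping statements of this kind, which is one way the coreferences and redundancy mentioned above can arise.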

Notes

  1. RDFa W3C recommendation: http://www.w3.org/TR/xhtml-rdfa-primer/
  2. http://www.w3.org/TR/microdata
  3. http://microformats.org
  4. http://schema.org
  5. http://www.webdatacommons.org
  6. http://webdatacommons.org/structureddata/2014-12/stats/stats.html
  7. https://en.wikipedia.org/wiki/DBpedia
  8. https://en.wikipedia.org/wiki/Freebase

Author information

Corresponding author

Correspondence to Stefan Dietze.

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Sahoo, P., Gadiraju, U., Yu, R., Saha, S., Dietze, S. (2016). Analysing Structured Scholarly Data Embedded in Web Pages. In: González-Beltrán, A., Osborne, F., Peroni, S. (eds) Semantics, Analytics, Visualization. Enhancing Scholarly Data. SAVE-SD 2016. Lecture Notes in Computer Science, vol 9792. Springer, Cham. https://doi.org/10.1007/978-3-319-53637-8_10

  • DOI: https://doi.org/10.1007/978-3-319-53637-8_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-53636-1

  • Online ISBN: 978-3-319-53637-8

  • eBook Packages: Computer Science, Computer Science (R0)
