Challenges, Approaches and Solutions in Data Integration for Research and Innovation

Springer Handbook of Science and Technology Indicators

Abstract

To be implemented by policy makers, science, technology, and innovation (STI) policies and indicator building need data. Whenever we need data, we need a method for data management, and in the era of big data, a crucial role is played by data integration. Therefore, STI policies and indicator development need data integration. Two main approaches to data integration exist, namely procedural and declarative. In this chapter, we follow the latter approach and focus our attention on the ontology-based data integration (OBDI) paradigm. The main principles of OBDI are:

  (i) Leave the data where they are.
  (ii) Build a conceptual specification of the domain of interest (ontology), in terms of knowledge structures.
  (iii) Map such knowledge structures to concrete data sources.
  (iv) Express all services over the abstract representation.
  (v) Automatically translate knowledge services to data services.
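
The five principles can be illustrated with a minimal sketch, which is not the chapter's implementation. It assumes a single in-memory SQLite database standing in for an existing source; the table `pubs_src`, the ontology predicates `authoredBy` and `publishedIn`, and the helper functions `unfold` and `answer` are all hypothetical names introduced only for this example.

```python
import sqlite3

# Principle (i): the data stay in their source. An in-memory SQLite table
# stands in for an existing source (hypothetical table and column names).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pubs_src (doi TEXT, author_name TEXT, yr INTEGER);
    INSERT INTO pubs_src VALUES
        ('10.1000/a', 'Rossi',   2015),
        ('10.1000/b', 'Bianchi', 2016),
        ('10.1000/c', 'Rossi',   2016);
""")

# Principle (ii): a toy ontology, given as predicate names with their arguments.
ONTOLOGY = {
    "authoredBy": ("doi", "author"),
    "publishedIn": ("doi", "year"),
}

# Principle (iii): each ontology predicate is mapped to a query over the source.
MAPPINGS = {
    "authoredBy": "SELECT doi, author_name AS author FROM pubs_src",
    "publishedIn": "SELECT doi, yr AS year FROM pubs_src",
}


def unfold(predicates):
    """Principle (v): replace ontology predicates by their source queries.

    The predicates are joined on their shared argument names (here `doi`).
    """
    for p in predicates:
        if p not in ONTOLOGY:
            raise KeyError(f"unknown ontology predicate: {p}")
    return " NATURAL JOIN ".join(f"({MAPPINGS[p]}) AS {p}" for p in predicates)


def answer(select, predicates, where="1=1", params=()):
    """Principle (iv): the caller speaks only in ontology terms."""
    sql = f"SELECT {select} FROM {unfold(predicates)} WHERE {where}"
    return conn.execute(sql, params).fetchall()


# A query phrased over the ontology: which publications did Rossi author in 2016?
rows = answer("doi", ["authoredBy", "publishedIn"],
              "author = ? AND year = ?", ("Rossi", 2016))
print(rows)
```

The caller never mentions `pubs_src`, `author_name`, or `yr`. If the source schema changes, only `MAPPINGS` has to be updated, which is the practical benefit of keeping services at the ontology level.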

We introduce the main challenges of data integration for research and innovation (R&I) and show that reasoning over an ontology connected to data may be very helpful for the study of R&I. We also provide examples by using Sapientia, an ontology specifically defined for multidimensional research assessment.
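
As a rough illustration of why reasoning over an ontology connected to data can help, the sketch below answers a query about a general concept even though no source stores that concept directly. The concept names, the subclass axioms, and the sample identifiers are hypothetical and are not taken from Sapientia; each concept is also restricted to a single superclass to keep the example short.

```python
# Hypothetical subclass axioms: each key is a subclass of its value.
SUBCLASS_OF = {
    "JournalArticle": "Publication",
    "ConferencePaper": "Publication",
    "Publication": "ResearchOutput",
    "Dataset": "ResearchOutput",
}

# Facts as they might be retrieved from mapped sources. Note that no source
# asserts membership in "Publication" or "ResearchOutput" directly.
SOURCE_FACTS = {
    "JournalArticle": {"doi:a", "doi:b"},
    "ConferencePaper": {"doi:c"},
    "Dataset": {"doi:d"},
}


def subsumed_by(concept):
    """Return the concept together with all of its direct and indirect subclasses."""
    result = {concept}
    changed = True
    while changed:
        changed = False
        for sub, sup in SUBCLASS_OF.items():
            if sup in result and sub not in result:
                result.add(sub)
                changed = True
    return result


def instances(concept):
    """Answer 'which individuals are instances of concept?' using the axioms."""
    found = set()
    for c in subsumed_by(concept):
        found |= SOURCE_FACTS.get(c, set())
    return found


print(sorted(instances("Publication")))
print(sorted(instances("ResearchOutput")))
```

Without the axioms, a query for `Publication` would return nothing, because the sources only record the more specific kinds of output.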

Acknowledgements

Financial support from the Project Sapienza Awards 2015 n. C26H15XNFS and the Project FILAS RU 2014-1186 is gratefully acknowledged.

Author information

Corresponding author

Correspondence to Maurizio Lenzerini.

Copyright information

© 2019 Springer International Publishing AG, part of Springer Nature

About this chapter

Cite this chapter

Lenzerini, M., Daraio, C. (2019). Challenges, Approaches and Solutions in Data Integration for Research and Innovation. In: Glänzel, W., Moed, H.F., Schmoch, U., Thelwall, M. (eds) Springer Handbook of Science and Technology Indicators. Springer Handbooks. Springer, Cham. https://doi.org/10.1007/978-3-030-02511-3_15
