Abstract
This paper presents a novel method of generating and evaluating linguistic summaries of content stored in distributed graph datasets, like LinkedData. Linguistic summarization is a well known data mining technique, aimed to discover patterns in data and present them in natural language. So far, this method has been researched only for relational databases. In our recent paper we have presented how to adapt this method for graph datasets. We have solved the problems of subject definition (further extended in this paper), retrieval of the attributes for summarization, generalization of summarizers and qualifiers. In this paper we extend that research by adapting proposed method to distributed interlinked graph datasets, which results in obtaining new summaries, and therefore new knowledge. We discuss how to follow different types of equivalence links that may exists between graph datasets. In order to measure characteristics specific for summaries of distributed graph data we propose new truth values (degree of subject appropriateness, degree of summarizer order and degree of linkage), and adapt existing ones (degree of covering). We run several experiments on Linked Data and discuss the results.
This is a preview of subscription content, log in via an institution.
Buying options
Tax calculation will be finalised at checkout
Purchases are for personal use only
Learn about institutional subscriptionsReferences
Angles, R., Gutierrez, C.: Survey of graph database models. ACM Comput. Surv. 40(1), 1:1–1:39 (2008)
Hausenblas, M., Halb, W., Raimond, Y., Heath, T.: What is the size of the semantic web. In: Proceedings of the International Conference on Semantic Systems. ISemantics 2008 (2008)
Yager, R.R.: A new approach to the summarization of data. Inf. Sci. 28(1), 69–86 (1982)
Kacprzyk, J., Yager, R.R., Zadrożny, S.: A fuzzy logic based approach to linguistic summaries of databases. Int. J. Appl. Math. Comput. Sci. 10(4), 813–834 (2000)
Kacprzyk, J., Wilbik, A., Zadrozny, S.: An approach to the linguistic summarization of time series using a fuzzy quantifier driven aggregation. Int. J. Intell. Syst. 25(5), 411–439 (2010)
Srivastava, J., Cooley, R., Deshpande, M., Tan, P.N.: Web usage mining: discovery and applications of usage patterns from web data. SIGKDD Explor. Newsl. 1(2), 12–23 (2000)
Kosala, R., Blockeel, H.: Web mining research: a survey. SIGKDD Explor. Newsl. 2(1), 1–15 (2000)
Stumme, G., Hotho, A., Berendt, B.: Semantic web mining: state of the art and future directions. Web Seman. Sci. Serv. Agents World Wide Web 4(2), 124–143 (2006). Semantic Grid - The Convergence of Technologies
Aggarwal, C.C., Wang, H.: Managing and Mining Graph Data, 1st edn. Springer Publishing Company, Incorporated, US (2010)
Cook, D.J., Holder, L.B.: Mining Graph Data. John Wiley & Sons, Hoboken (2006)
Kuramochi, M., Karypis, G.: Frequent subgraph discovery. In: Proceedings of the 2001 IEEE International Conference on Data Mining. ICDM 2001, Computer Society, pp. 313–320. IEEE, Washington, DC (2001)
Yan, X., Han, J.: gspan: graph-based substructure pattern mining. In: Proceedings of the 2002 IEEE International Conference on Data Mining. ICDM 2002, Computer Society 721-724. IEEE, Washington, DC (2002)
Castelltort, A., Laurent, A.: Fuzzy queries over NoSQL graph databases: perspectives for extending the cypher language. In: Laurent, A., Strauss, O., Bouchon-Meunier, B., Yager, R.R. (eds.) IPMU 2014, Part III. CCIS, vol. 444, pp. 384–395. Springer, Heidelberg (2014)
Strobin, L., Niewiadomski, A.: Linguistic summaries of graph datasets using ontologies: an application to semantic web. In: Núñez, M., Nguyen, N.T., Camacho, D., Trawinski, B. (eds.) ICCCI 2015. LNCS, vol. 9329, pp. 380–389. Springer, Heidelberg (2015). doi:10.1007/978-3-319-24069-5_36
Castelltort, A., Laurent, A.: Extracting fuzzy summaries from nosql graph databases. In: Andreasen, T., Christiansen, H., Kacprzyk, J., Larsen, H., Pasi, G., Pivert, O., De Tré, G., Vila, M.A., Yazici, A., Zadrozny, S. (eds.) Flexible Query Answering Systems 2015. Advances in Intelligent Systems and Computing, vol. 400, pp. 189–200. Springer International Publishing, Switzerland (2016)
Lehmann, J.: DBpedia - a large-scale, multilingual knowledge base extracted from wikipedia. Semant. Web J. 6(2), 167–195 (2014)
Cingolani, P., Alcalá-Fdez, J.: jfuzzylogic: a robust and flexible fuzzy-logic inference system language implementation. In: FUZZ-IEEE, pp. 1–8. IEEE (2012)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Strobin, L., Niewiadomski, A. (2016). Integration of Multiple Graph Datasets and Their Linguistic Summaries: An Application to Linked Data. In: Rutkowski, L., Korytkowski, M., Scherer, R., Tadeusiewicz, R., Zadeh, L., Zurada, J. (eds) Artificial Intelligence and Soft Computing. ICAISC 2016. Lecture Notes in Computer Science(), vol 9692. Springer, Cham. https://doi.org/10.1007/978-3-319-39378-0_29
Download citation
DOI: https://doi.org/10.1007/978-3-319-39378-0_29
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-39377-3
Online ISBN: 978-3-319-39378-0
eBook Packages: Computer ScienceComputer Science (R0)