Abstract
In the Web of Data, many properties link resources to other resources, and also to literals that specify their attributes. However, given its scale and inherent nature, this setting is also characterized by a large amount of missing and incorrect information. To tackle these problems, models and rules learned for predicting unknown values of numeric features can be used to approximate those values and to enrich the schema of a knowledge base, increasing its expressiveness, e.g. by eliciting SWRL rules. In this work, we address the problem of predicting unknown values and deriving rules concerning numeric features expressed as datatype properties. The task can be cast as a regression problem, for which suitable solutions have been devised, for instance, in the related context of relational databases (RDBs). To this end, we adapted predictive clustering trees, which solve multi-target regression problems, to knowledge bases of the Web of Data. The approach has been evaluated experimentally, with interesting results.
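To make the multi-target regression idea concrete, the following is a minimal sketch of how a predictive clustering tree might choose a split and form a leaf prediction. It is an illustration under stated assumptions, not the authors' implementation: candidate tests are assumed to be given as precomputed boolean membership lists (standing in for concept membership checks on individuals), the heuristic is plain summed variance reduction without target normalization, and all function names are hypothetical.

```python
from statistics import pvariance


def total_variance(rows):
    """Sum of per-target population variances over a group of target vectors."""
    if len(rows) < 2:
        return 0.0  # an empty or single-element group has no spread
    return sum(pvariance(col) for col in zip(*rows))


def split_score(targets, membership):
    """Variance reduction obtained by splitting on a boolean test.

    targets: list of numeric target vectors, one per individual
    membership: list of bools, True if the individual satisfies the test
    (assumption: membership has already been decided, e.g. by a reasoner)
    """
    left = [t for t, m in zip(targets, membership) if m]
    right = [t for t, m in zip(targets, membership) if not m]
    n = len(targets)
    weighted = (len(left) / n) * total_variance(left) \
        + (len(right) / n) * total_variance(right)
    return total_variance(targets) - weighted


def best_test(targets, candidate_tests):
    """Return the name of the candidate test with the largest reduction."""
    return max(candidate_tests,
               key=lambda name: split_score(targets, candidate_tests[name]))


def leaf_prediction(rows):
    """A leaf predicts the mean target vector of the individuals it covers."""
    return [sum(col) / len(col) for col in zip(*rows)]
```

Each individual carries a vector of numeric targets (one per datatype property to be predicted), so a single tree predicts all of them at once. In practice the targets would usually be normalized before summing variances, so that properties on large scales do not dominate the choice of split.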
Notes
1. The source code and the benchmarks are available at: http://github.com/Giuseppe-Rizzo/DLPredictiveClustering.
2. The ontologies are also available at: http://www.inf.unibz.it/tones/index.php (BCO, monetary), https://datahub.io/dataset (geopolitical), https://github.com/AKSW/DL-Learner/tree/develop/examples/mutagenesis (mutagenesis).
Acknowledgments
This work fulfills the objectives of the PON 02005633489339 project “Puglia@Service - Internet-based Service Engineering enabling Smart Territory structural development” funded by the Italian Ministry of University and Research (MIUR).
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Rizzo, G., d’Amato, C., Fanizzi, N., Esposito, F. (2016). Approximating Numeric Role Fillers via Predictive Clustering Trees for Knowledge Base Enrichment in the Web of Data. In: Calders, T., Ceci, M., Malerba, D. (eds) Discovery Science. DS 2016. Lecture Notes in Computer Science, vol 9956. Springer, Cham. https://doi.org/10.1007/978-3-319-46307-0_7
DOI: https://doi.org/10.1007/978-3-319-46307-0_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-46306-3
Online ISBN: 978-3-319-46307-0
eBook Packages: Computer Science, Computer Science (R0)