Ontology Population from Web Product Information

  • Damir Vandic
  • Lennart J. Nederstigt
  • Steven S. Aanen
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8823)

Abstract

Due to the explosion of information on the Web, there is a need to structure Web data so that it is accessible to both users and machines. E-commerce is one of the areas in which the increasing volume of Web data has serious consequences. This paper proposes a framework that populates a product ontology with tabular product information from Web shops. Formalizing product information in this way enables better product comparison and recommender applications on the Web. Our approach uses lexical and syntactic matching techniques to map properties and instantiate values. Our evaluation shows that instantiating TVs and MP3 players from two popular Web shops, Best Buy and Newegg.com, yields an F1 score of 95.07% for property mapping and 76.60% for value instantiation.

Keywords

Product Information, Regular Expression, Product Class, Lexical Representation, Baseline Approach
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Damir Vandic (1)
  • Lennart J. Nederstigt (1)
  • Steven S. Aanen (1)
  1. Erasmus University Rotterdam, Rotterdam, The Netherlands
