From Web Data to Entities and Back

  • Zoltán Miklós
  • Nicolas Bonvin
  • Paolo Bouquet
  • Michele Catasta
  • Daniele Cordioli
  • Peter Fankhauser
  • Julien Gaugaz
  • Ekaterini Ioannou
  • Hristo Koshutanski
  • Antonio Maña
  • Claudia Niederée
  • Themis Palpanas
  • Heiko Stoermer
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6051)

Abstract

We present the Entity Name System (ENS), an enabling infrastructure that can host descriptions of named entities and provide unique identifiers for them at large scale. It thereby opens new perspectives for realizing entity-oriented, rather than keyword-oriented, Web information systems. We describe the architecture and functionality of the ENS, along with tools that together contribute to realizing the Web of entities.
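
As a rough illustration of what "hosting descriptions and providing unique identifiers" could look like, here is a minimal in-memory sketch. It is not the ENS architecture described in the paper: the class name `EntityNameSystem`, the `urn:example:entity:` identifier scheme, and the exact-match reuse rule are all assumptions made here for illustration. A real system at Web scale would need approximate entity matching and distributed storage.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class EntityNameSystem:
    """Toy registry (hypothetical): maps identifiers to attribute-value descriptions."""

    _store: dict = field(default_factory=dict)

    def register(self, description: dict) -> str:
        """Return the identifier for this description, minting a new one if unseen.

        Assumption: two descriptions denote the same entity only if they are
        identical; attribute values must be hashable (e.g. strings).
        """
        key = frozenset(description.items())
        for entity_id, known in self._store.items():
            if frozenset(known.items()) == key:
                return entity_id  # reuse the existing identifier
        entity_id = f"urn:example:entity:{uuid.uuid4()}"  # hypothetical scheme
        self._store[entity_id] = dict(description)
        return entity_id

    def lookup(self, **attributes) -> list:
        """Return identifiers whose description matches every given attribute."""
        return [
            entity_id
            for entity_id, desc in self._store.items()
            if all(desc.get(k) == v for k, v in attributes.items())
        ]


ens = EntityNameSystem()
first = ens.register({"name": "Trento", "type": "city"})
again = ens.register({"name": "Trento", "type": "city"})
other = ens.register({"name": "Hannover", "type": "city"})
```

In this sketch, registering the same description twice yields the same identifier, so different data sources describing the same entity can refer to it consistently; `ens.lookup(type="city")` returns both identifiers.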

Keywords

entity, Web, unique identifier

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Zoltán Miklós (1)
  • Nicolas Bonvin (1)
  • Paolo Bouquet (2)
  • Michele Catasta (1)
  • Daniele Cordioli (3)
  • Peter Fankhauser (4)
  • Julien Gaugaz (4)
  • Ekaterini Ioannou (4)
  • Hristo Koshutanski (5)
  • Antonio Maña (5)
  • Claudia Niederée (4)
  • Themis Palpanas (1)
  • Heiko Stoermer (1)

  1. Ecole Polytechnique Fédérale de Lausanne (EPFL)
  2. DISI, University of Trento
  3. ExpertSystem s.p.a, Modena, Italy
  4. L3S Research Center, Leibniz Universität Hannover
  5. Universidad de Málaga