
A Lexical Resource for Identifying Public Services Names on the Social Web

  • Islam A. Hassan
  • Adegboyega Ojo
  • Lukasz Porwol
Chapter

Abstract

Discovering government-related resources on the social web through mentions of government-related terms requires domain-specific lexical resources. This chapter describes an approach for developing a Lexical Resource for Public Services Names and shows how it could be exploited. Central to our technical approach is a Semantic Alignment Algorithm, which organizes a set of public service names, automatically captured from government websites, into a semantic network based on a semantic relatedness measure (Explicit Semantic Analysis, ESA). To demonstrate the use of the developed lexicon, we (1) clustered the United Kingdom and Irish Government public services catalogues for easier access to related services on citizen portals and (2) developed a Named Entity Recognizer (NER) to identify mentions of public-service-related information in a Twitter stream. Evaluation of the semantic relations that our semantic alignment algorithm computed for the lexical resource showed F-scores ranging from 0.65 to 0.93.
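The idea of organizing service names into a semantic network by ESA-style relatedness can be illustrated with a minimal sketch. It is not the chapter's Semantic Alignment Algorithm, whose details the abstract does not give. It assumes a tiny hand-written concept corpus (`CONCEPTS`, standing in for the Wikipedia articles that ESA normally uses), a smoothed TF-IDF weighting, and an arbitrary link threshold of 0.5. All function names and example service names are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations

# Assumption: a toy concept corpus standing in for Wikipedia articles.
CONCEPTS = {
    "Passport": "passport travel document identity citizen apply renew",
    "Driving licence": "driving licence vehicle driver apply renew test",
    "Taxation": "tax income revenue return payment vehicle",
}

def build_index(concepts):
    """Inverted index: word -> {concept: smoothed TF-IDF weight}."""
    n = len(concepts)
    tf = {name: Counter(text.lower().split()) for name, text in concepts.items()}
    df = Counter(word for counts in tf.values() for word in counts)
    index = {}
    for name, counts in tf.items():
        for word, count in counts.items():
            index.setdefault(word, {})[name] = count * math.log(1 + n / df[word])
    return index

def esa_vector(text, index):
    """Represent a text as summed concept weights of its words."""
    vec = Counter()
    for word in text.lower().split():
        for concept, weight in index.get(word, {}).items():
            vec[concept] += weight
    return vec

def cosine(a, b):
    """Cosine similarity of two sparse concept vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

def build_network(names, index, threshold=0.5):
    """Link every pair of names whose relatedness reaches the threshold."""
    vectors = {name: esa_vector(name, index) for name in names}
    edges = {}
    for a, b in combinations(names, 2):
        score = cosine(vectors[a], vectors[b])
        if score >= threshold:
            edges[(a, b)] = score
    return edges

index = build_index(CONCEPTS)
service_names = ["renew passport", "apply passport", "pay income tax"]
edges = build_network(service_names, index)
for (a, b), score in edges.items():
    print(f"{a} <-> {b}: {score:.2f}")
```

With this toy corpus only the two passport-related names are linked, because the tax-related name shares no concept with them. A real lexicon would need a full concept corpus and a threshold tuned against labeled relations.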

Keywords

Government 3.0; Lexical resources; Linguistic linked data resource; Public service catalogues; Core public service vocabulary; Explicit semantic analysis

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Islam A. Hassan (1), corresponding author
  • Adegboyega Ojo (1)
  • Lukasz Porwol (1)
  1. Insight Centre for Data Analytics, National University of Ireland Galway, Galway, Republic of Ireland
