Abstract
Discovering government-related resources on the social web through mentions of government-related terms requires domain-specific lexical resources. This chapter describes an approach for developing a Lexical Resource for Public Services Names and shows how it can be exploited. Central to our technical approach is a Semantic Alignment Algorithm, which organizes a set of public service names, automatically captured from government websites, into a semantic network based on a semantic relatedness measure (Explicit Semantic Analysis, ESA). To demonstrate the use of the developed lexicon, we (1) clustered the United Kingdom and Irish Government public services catalogue for easier access to related services on citizen portals and (2) developed a Named Entity Recognizer (NER) to identify mentions of public-service-related information in a Twitter stream. Evaluation of the semantic relations computed by our Semantic Alignment Algorithm in the developed lexical resource showed F-scores ranging from 0.65 to 0.93.
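As a rough illustration of the idea behind organizing service names into a semantic network, the following Python sketch links names whose concept vectors are sufficiently related and then reads clusters off the resulting graph. It is not the chapter's Semantic Alignment Algorithm: the concept vectors are hand-made toy values (a real ESA model would derive them from Wikipedia), the threshold of 0.5 is arbitrary, and treating connected components as clusters is an assumption made here for simplicity. All function and variable names are hypothetical.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two sparse concept vectors (dicts)."""
    dot = sum(w * v.get(concept, 0.0) for concept, w in u.items())
    norm_u = sqrt(sum(w * w for w in u.values()))
    norm_v = sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def build_network(vectors, threshold):
    """Link every pair of names whose relatedness meets the threshold."""
    names = sorted(vectors)
    edges = {name: set() for name in names}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if cosine(vectors[a], vectors[b]) >= threshold:
                edges[a].add(b)
                edges[b].add(a)
    return edges


def clusters(edges):
    """Return connected components of the network as sorted name lists."""
    seen, result = set(), []
    for start in sorted(edges):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(edges[node] - component)
        seen |= component
        result.append(sorted(component))
    return result


# Toy concept vectors (assumed values, NOT produced by a real ESA model).
TOY_VECTORS = {
    "passport renewal": {"Passport": 0.9, "Travel document": 0.6},
    "passport application": {"Passport": 0.8, "Travel document": 0.5,
                             "Citizenship": 0.3},
    "motor tax": {"Road tax": 0.9, "Vehicle": 0.5},
    "driving licence": {"Vehicle": 0.7, "Driver's license": 0.9},
}

network = build_network(TOY_VECTORS, threshold=0.5)
for group in clusters(network):
    print(group)
```

With these toy values the two passport services end up linked, while "motor tax" and "driving licence" share only one weakly weighted concept and stay separate. Lowering the threshold would merge them, which is the kind of trade-off an evaluation by F-score would expose.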
Copyright information
© 2015 Springer International Publishing Switzerland
About this chapter
Cite this chapter
A. Hassan, I., Ojo, A., Porwol, L. (2015). A Lexical Resource for Identifying Public Services Names on the Social Web. In: Nepal, S., Paris, C., Georgakopoulos, D. (eds) Social Media for Government Services. Springer, Cham. https://doi.org/10.1007/978-3-319-27237-5_14
DOI: https://doi.org/10.1007/978-3-319-27237-5_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-27235-1
Online ISBN: 978-3-319-27237-5
eBook Packages: Computer Science, Computer Science (R0)