Abstract
Recent research strongly indicates that most data on the Internet are unstructured, because almost none of the entities involved in entering, processing, collecting, and storing data keep them in a format that complies with a defined structure; this has a domino effect on information retrieval whenever a query is made. Data extraction is part and parcel of the Semantic Web and is crucial for linking questions to answers on the web. When a question is posed, it requires semantic analysis of both structured and unstructured data, and each part of the answer must be mapped to its relevance to the question. Information extraction is a crucial area of natural language processing, and without proper data acquisition from very large data sets (for instance, billions of alphanumeric words), the required data hardly ever reach the person asking. Practical applications, however, need answers that are succinct, correct, and to the point; readers often skim through each answer because they must decide for themselves which one best fits their question. This poses a unique challenge: the question may be incomplete, the answer is hidden under layers of data, and the language in which the data are written makes the query more complex still. For English, a great deal of research has been conducted, and owing to its exceptionally wide usage, English-language processing has overcome the initial issues and now yields results that are nearly ninety-nine percent accurate. That is not the case for Bengali, where semantic analysis and the derivation of meaningful information remain a challenge. This paper proposes a decisive algorithm to acquire meaningful and relevant data from unstructured data. The accuracy and efficiency of target data extraction depend on reasoning over and analysis of the unstructured data.
Here, Universal Networking Language (UNL) is applied in the proposed method to produce the desired output. In this method, exceptionally large unstructured data sets are categorized into prespecified relations with the help of UNL, and, based on these relations, the words of each sentence are compared pairwise as binary relations. Finally, the proposed method extracts information from these binary relations.
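To make the idea of extracting information from binary relations concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a sentence has already been converted into labeled word pairs; the labels `agt` (agent), `obj` (object), and `plc` (place) are standard UNL relation labels, while the `Relation` type, the `extract` function, the hand-written example relations, and the plain-word notation (real UNL uses annotated universal words) are hypothetical simplifications introduced only for illustration.

```python
from typing import List, NamedTuple


class Relation(NamedTuple):
    """One binary relation between two words of a sentence (simplified)."""
    label: str  # UNL relation label, e.g. "agt", "obj", "plc"
    head: str   # the word the relation starts from (here, the verb)
    tail: str   # the word the relation points to


# Assumption: these relations were written by hand for the sentence
# "John eats rice in the kitchen." A real system would obtain them
# from a UNL conversion step, which is not modeled here.
RELATIONS = [
    Relation("agt", "eat", "John"),
    Relation("obj", "eat", "rice"),
    Relation("plc", "eat", "kitchen"),
]


def extract(relations: List[Relation], label: str, head: str) -> List[str]:
    """Return the tail of every relation matching the given label and head."""
    return [r.tail for r in relations if r.label == label and r.head == head]


# "Who eats?" is treated as a lookup of the agent of "eat".
who = extract(RELATIONS, "agt", "eat")
# "Where does the eating happen?" is a lookup of the place of "eat".
where = extract(RELATIONS, "plc", "eat")
```

Under these assumptions, a question is answered by choosing the relation label that corresponds to the question word and looking up the missing side of the pair, which is why the quality of the relation categorization step determines the quality of the extracted answer.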
Copyright information
© 2019 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Saha, A.K., Mridha, M.F., Rafiq, J.I., Das, J.K. (2019). Information Extraction from Natural Language Using Universal Networking Language. In: Bhatia, S., Tiwari, S., Mishra, K., Trivedi, M. (eds) Advances in Computer Communication and Computational Sciences. Advances in Intelligent Systems and Computing, vol 924. Springer, Singapore. https://doi.org/10.1007/978-981-13-6861-5_24
DOI: https://doi.org/10.1007/978-981-13-6861-5_24
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-6860-8
Online ISBN: 978-981-13-6861-5
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)