Effectiveness of Recent Research Approaches in Natural Language Processing on Data Science-An Insight

  • Conference paper
Computational and Statistical Methods in Intelligent Systems (CoMeSySo 2018)

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 859)

Abstract

With the size and complexity of data growing exponentially, data quality has become a major concern for data analytics. The potential of Natural Language Processing (NLP) is well known, and researchers are already harnessing it to develop significant analytical processes. However, few research works focus on applying NLP to data of the complexity now encountered in big data. Therefore, the primary contribution of this manuscript is a review of the most recent work on NLP-based approaches to data analysis, where the input data may be textual or non-textual. The secondary contribution is to gauge how effective existing NLP-based research approaches are at improving data quality in data science.

Author information

Corresponding author

Correspondence to J. Shruthi.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Shruthi, J., Swamy, S. (2019). Effectiveness of Recent Research Approaches in Natural Language Processing on Data Science-An Insight. In: Silhavy, R., Silhavy, P., Prokopova, Z. (eds) Computational and Statistical Methods in Intelligent Systems. CoMeSySo 2018. Advances in Intelligent Systems and Computing, vol 859. Springer, Cham. https://doi.org/10.1007/978-3-030-00211-4_17
