Keyword Based Identification of Thrust Area Using MapReduce for Knowledge Discovery

  • Nirmal Kaur
  • Manmohan Sharma
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 712)


Keyword-based identification is widely used in applications such as Web pages, query processing, and search interfaces, where it draws on data mining algorithms to work effectively and efficiently on large datasets. Keywords are the most important terms in documents or text fields, and they yield the knowledge needed to meet a discovery goal. The goal of this paper is to identify, through the proposed interface, the Thrust Area of a searched keyword in the computer science field. The paper uses the MapReduce framework, with some modifications, to search for the keyword in a database and identify its Thrust Area. The interface is applied to the processed query, and the relevant information is extracted from the given datasets. MapReduce can perform keyword operations such as sorting and frequency counting on large datasets with high efficiency. Experiments were also carried out to analyse performance on several parameters, such as the time each input source takes to form clusters and identify Thrust Areas.
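The keyword frequency counting mentioned in the abstract can be illustrated with a small in-memory sketch of the map, shuffle, and reduce steps. This is not the authors' modified framework, which the abstract does not detail. The records, the area labels, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `identify_thrust_area`) are all hypothetical, and picking the area with the highest keyword count is an assumed selection rule.

```python
from collections import defaultdict

# Hypothetical records: (thrust area label, document text). Illustrative only.
RECORDS = [
    ("Data Mining", "frequent pattern mining and clustering of large datasets"),
    ("Data Mining", "clustering text documents with frequent terms"),
    ("Networking", "routing protocols for wireless sensor networks"),
]


def map_phase(area, text, keyword):
    """Emit (area, 1) for each occurrence of the keyword in the text."""
    for token in text.lower().split():
        if token == keyword:
            yield area, 1


def shuffle(pairs):
    """Group emitted values by key, as a framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(area, counts):
    """Sum the occurrence counts collected for one area."""
    return area, sum(counts)


def identify_thrust_area(records, keyword):
    """Return (best area or None, per-area keyword counts).

    Assumption: the thrust area is the label with the highest keyword count.
    """
    keyword = keyword.lower()
    pairs = [p for area, text in records for p in map_phase(area, text, keyword)]
    totals = dict(reduce_phase(a, c) for a, c in shuffle(pairs).items())
    if not totals:
        return None, totals
    return max(totals, key=totals.get), totals
```

Calling `identify_thrust_area(RECORDS, "clustering")` on these sample records groups both matches under the same label. In a real deployment the map and reduce functions would run on a cluster, with the shuffle handled by the framework rather than by a local dictionary.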


Knowledge discovery, Keyword matching, Keyword-based identification, MapReduce, Information retrieval



Copyright information

© Springer Nature Singapore Pte Ltd. 2017

Authors and Affiliations

  1. School of Computer Application, Lovely Professional University, Jalandhar, India
