Hashtag# Perspicacity of India Region Using Scalable Big Data Infrastructure Using Hadoop Environment

  • Chapter
Social Networks Science: Design, Implementation, Security, and Challenges

Abstract

Social media is rapidly setting new standards for interaction between individuals. It is a computer-mediated tool that allows people to create and share information, medical notes and reports, ideas, and pictures and videos through virtual communities. Online citizens (netizens) can build colossal networks of people to communicate with. Social media platforms such as Facebook, Twitter, YouTube, Instagram and Tinder are so widely used that they produce copious amounts of data, beyond the ability of conventional software tools to process within an acceptable elapsed time. The Apache Hadoop project is one of the most famous open-source frameworks for large-scale computation on commodity hardware, and Hadoop has become the kernel of a distributed operating system for big data. Hadoop has two core components: the Hadoop Distributed File System (HDFS) and MapReduce. MapReduce distributes tasks across multiple nodes in the cluster; the developer only has to write the processing code, and MapReduce takes care of the rest. The data generated from these social sources is real time and includes information about authors' daily activities, feelings and emotions. The messages often include images, geolocations and many other annotations. This vast data repository gives researchers opportunities to study individuals' behavior and emotions under different conditions. In this chapter we find the trending tweets from January 2017 to September 2017 based on the eight prominent themes that emerged from the data set, and the trending tweets by geolocation. For this, we divide India into six regions: North India, West India, South India, East India, Central India and Northeast India.
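The MapReduce flow described in the abstract can be illustrated with a minimal sketch that counts hashtags per region. This is not the chapter's implementation: it runs the map, shuffle and reduce steps in a single Python process rather than on a Hadoop cluster, and the `STATE_TO_REGION` lookup, the tweet record layout (`state`, `text`) and the sample tweets are all hypothetical, made up only for illustration.

```python
import re
from collections import defaultdict

# Hypothetical lookup from a tweet's state field to one of the six regions
# named in the abstract; the chapter's actual mapping is not given here.
STATE_TO_REGION = {
    "Delhi": "North India",
    "Maharashtra": "West India",
    "Tamil Nadu": "South India",
    "West Bengal": "East India",
    "Madhya Pradesh": "Central India",
    "Assam": "Northeast India",
}

HASHTAG = re.compile(r"#\w+")


def map_tweet(tweet):
    """Map step: emit ((region, hashtag), 1) for each hashtag in one tweet."""
    region = STATE_TO_REGION.get(tweet["state"])
    if region is None:
        return  # skip tweets whose location cannot be placed in a region
    for tag in HASHTAG.findall(tweet["text"]):
        yield (region, tag.lower()), 1


def shuffle(pairs):
    """Shuffle step: group emitted values by key, as the framework would."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(key, values):
    """Reduce step: sum the counts for one (region, hashtag) key."""
    return key, sum(values)


def trending_by_region(tweets, top_n=3):
    """Run map, shuffle and reduce in-process, then rank hashtags per region."""
    pairs = (pair for tweet in tweets for pair in map_tweet(tweet))
    totals = dict(reduce_counts(k, v) for k, v in shuffle(pairs).items())
    ranked = defaultdict(list)
    for (region, tag), count in totals.items():
        ranked[region].append((tag, count))
    # Sort by descending count, breaking ties alphabetically by hashtag.
    return {
        region: sorted(tags, key=lambda tc: (-tc[1], tc[0]))[:top_n]
        for region, tags in ranked.items()
    }


# Made-up sample tweets, for illustration only (not data from the chapter).
SAMPLE_TWEETS = [
    {"state": "Delhi", "text": "Long queues today #GST #Budget2017"},
    {"state": "Delhi", "text": "Still confused about #gst rates"},
    {"state": "Tamil Nadu", "text": "What a match! #IPL"},
    {"state": "Unknown", "text": "#GST everywhere"},
]

if __name__ == "__main__":
    for region, tags in trending_by_region(SAMPLE_TWEETS).items():
        print(region, tags)
```

On a real cluster the developer would supply only the map and reduce functions, and the framework would handle the grouping and the distribution across nodes; the `shuffle` helper here merely stands in for that step.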

Acknowledgements

We would like to thank Ambedkar Institute of Advanced Communication Technologies and Research for providing us with the infrastructure to carry out the research work efficiently.

Author information

Corresponding author

Correspondence to Arushi Jain.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this chapter

Cite this chapter

Jain, A., Bhatnagar, V. (2018). Hashtag# Perspicacity of India Region Using Scalable Big Data Infrastructure Using Hadoop Environment. In: Dey, N., Babo, R., Ashour, A., Bhatnagar, V., Bouhlel, M. (eds) Social Networks Science: Design, Implementation, Security, and Challenges. Springer, Cham. https://doi.org/10.1007/978-3-319-90059-9_4

  • DOI: https://doi.org/10.1007/978-3-319-90059-9_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-90058-2

  • Online ISBN: 978-3-319-90059-9

  • eBook Packages: Computer Science, Computer Science (R0)
