Hashtag# Perspicacity of India Region Using Scalable Big Data Infrastructure Using Hadoop Environment

Jain, Arushi; Bhatnagar, Vishal

doi:10.1007/978-3-319-90059-9_4

Abstract

Social media is rapidly setting new standards of interaction between individuals. It is a computer-mediated tool that allows people to create and share information, medical notes and reports, ideas, pictures, and videos through virtual communities. Online citizens (netizens) can build colossal networks of people to communicate with. Social media platforms such as Facebook, Twitter, YouTube, Instagram, and Tinder are so widely used that they produce copious amounts of data, beyond the ability of conventional software tools to process within an acceptable elapsed time. The Apache Hadoop project is one of the most famous open-source frameworks for large-scale computation on commodity hardware. Hadoop has become the kernel of a distributed operating system for big data. It has two core components: the Hadoop Distributed File System (HDFS) and MapReduce. MapReduce distributes tasks across multiple nodes in the cluster; the developer only has to write the code, and the rest is taken care of by MapReduce. The data generated from these social sources is real-time and includes information about authors' daily activities, feelings, and emotions. The messages often include images, geolocations, and many other annotations. This vast data repository gives researchers opportunities to study individuals' behavior and emotions under different conditions. In this chapter we find the trending tweets from January 2017 to September 2017 according to the eight prominent themes that emerged from the data set, and the trending tweets according to geolocation. For this we divide India by region: North India, West India, South India, East India, Central India, and Northeast India.
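The division of labour the abstract describes, where the developer writes only the map and reduce logic and the framework handles distribution, can be illustrated with a minimal single-process sketch. This is not the chapter's code and does not use the Hadoop API. The function names, the in-memory `shuffle` step standing in for the framework, and the sample tweets are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Simple hashtag pattern; an assumption, not the chapter's tokenisation.
HASHTAG = re.compile(r"#\w+")


def map_tweet(region, text):
    """Map step: emit ((region, hashtag), 1) for each hashtag in one tweet."""
    for tag in HASHTAG.findall(text):
        yield (region, tag.lower()), 1


def shuffle(pairs):
    """Group values by key. In Hadoop the framework does this between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(key, values):
    """Reduce step: sum the counts for one (region, hashtag) key."""
    return key, sum(values)


def hashtag_counts(tweets):
    """Run map, shuffle, and reduce in one process over (region, text) records."""
    pairs = (pair for region, text in tweets for pair in map_tweet(region, text))
    return dict(reduce_counts(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    # Hypothetical records; the region labels follow the abstract's six-way split.
    sample = [
        ("North India", "Loving #Diwali and #cricket"),
        ("North India", "#diwali lights everywhere"),
        ("South India", "#Cricket fever"),
    ]
    for (region, tag), count in sorted(hashtag_counts(sample).items()):
        print(region, tag, count)
```

On a real cluster the map and reduce functions would run on different nodes and the grouping would be done by the framework; only the two small functions would remain the developer's responsibility.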

Acknowledgements

We would like to thank Ambedkar Institute of Advanced Communication Technologies and Research for providing us with the infrastructure to carry out the research work efficiently.

Author information

Authors and Affiliations

Department of Computer Science and Engineering, Ambedkar Institute of Advanced Communication Technologies and Research, Geeta Colony, New Delhi, 110031, India
Arushi Jain & Vishal Bhatnagar

Corresponding author

Correspondence to Arushi Jain.

Editor information

Editors and Affiliations

Department of Information Technology, Techno India College of Technology, Kolkata, West Bengal, India
Nilanjan Dey
Department of Information Systems, ISCAP, Porto Polytechnic, Porto, Portugal
Rosalina Babo
Department of Electronics and Electrical Communications Engineering, Faculty of Engineering, Tanta University, Tanta, Egypt
Amira S. Ashour
Department of Computer Science and Engineering, Ambedkar Institute of Advanced Communication Technologies and Research, New Delhi, India
Vishal Bhatnagar
Research Lab Sciences and Technologies of Image and Telecommunication, Sfax University, Sfax, Tunisia
Med Salim Bouhlel

About this chapter

Cite this chapter

Jain, A., Bhatnagar, V. (2018). Hashtag# Perspicacity of India Region Using Scalable Big Data Infrastructure Using Hadoop Environment. In: Dey, N., Babo, R., Ashour, A., Bhatnagar, V., Bouhlel, M. (eds) Social Networks Science: Design, Implementation, Security, and Challenges. Springer, Cham. https://doi.org/10.1007/978-3-319-90059-9_4

DOI: https://doi.org/10.1007/978-3-319-90059-9_4
Published: 19 June 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-90058-2
Online ISBN: 978-3-319-90059-9
eBook Packages: Computer Science, Computer Science (R0)
