Abstract
Social media is rapidly establishing new standards of interaction between individuals. It can be described as a computer-mediated tool that allows people to create and share information, medical notes and reports, ideas, pictures, and videos through virtual communities. Online citizens (netizens) can build a colossal network of people to communicate with. Social media platforms such as Facebook, Twitter, YouTube, Instagram, and Tinder are so widely used that they produce copious amounts of data, beyond the ability of conventional software tools to process within an acceptable elapsed time. The Apache Hadoop project is one of the most famous open-source frameworks for large-scale computation on commodity hardware. Hadoop has become the kernel of a distributed operating system for big data. It has two core components: the Hadoop Distributed File System (HDFS) and MapReduce. MapReduce distributes tasks across multiple nodes in the cluster; the developer only has to write the code, and the rest is taken care of by MapReduce. The data generated from these social sources is real time and includes information about authors' daily activities, feelings, and emotions. The messages often include images, geolocations, and many other annotations. This vast data repository gives researchers opportunities to study individuals' behavior and emotions under different conditions. In this chapter we find the trending tweets from January 2017 to September 2017 according to the eight prominent themes that emerged from the data set, and the trending tweets according to geolocation. For this we divide India by region: North India, West India, South India, East India, Central India, and Northeast India.
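The division of labour between map and reduce described above can be illustrated with a small, single-process sketch. It is not the chapter's implementation and does not use Hadoop: it only imitates the map, shuffle, and reduce phases in plain Python to count hashtags per region. The function names, the tweet record layout (`region`, `text`), and the sample tweets are all hypothetical; only the region names come from the abstract.

```python
import re
from collections import defaultdict
from itertools import chain

HASHTAG = re.compile(r"#(\w+)")

def map_tweet(tweet):
    """Map step: emit ((region, hashtag), 1) for every hashtag in one tweet."""
    region = tweet["region"]
    for tag in HASHTAG.findall(tweet["text"]):
        yield (region, tag.lower()), 1

def shuffle(pairs):
    """Shuffle step: group emitted values by key (done by the framework in a real job)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_counts(key, values):
    """Reduce step: sum the counts for one (region, hashtag) key."""
    return key, sum(values)

def trending_by_region(tweets, top_n=3):
    """Run map, shuffle, and reduce, then keep the top_n hashtags per region."""
    pairs = chain.from_iterable(map_tweet(t) for t in tweets)
    reduced = [reduce_counts(k, v) for k, v in shuffle(pairs).items()]
    by_region = defaultdict(list)
    for (region, tag), count in reduced:
        by_region[region].append((tag, count))
    # Sort by descending count, breaking ties alphabetically for stable output.
    return {
        region: sorted(tags, key=lambda tc: (-tc[1], tc[0]))[:top_n]
        for region, tags in by_region.items()
    }

# Made-up sample records, for illustration only (not from the chapter's data set).
SAMPLE_TWEETS = [
    {"region": "North India", "text": "first post #topicA #topicB"},
    {"region": "North India", "text": "second post #topicA"},
    {"region": "South India", "text": "third post #topicC"},
    {"region": "South India", "text": "fourth post #TopicC and #topicA"},
]

if __name__ == "__main__":
    for region, tags in trending_by_region(SAMPLE_TWEETS).items():
        print(region, tags)
```

In a real Hadoop job only the map and reduce functions would be written by the developer; grouping by key and distributing work across nodes would be handled by the framework.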
Acknowledgements
We would like to thank Ambedkar Institute of Advanced Communication Technologies and Research for providing us with the infrastructure to carry out the research work efficiently.
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this chapter
Cite this chapter
Jain, A., Bhatnagar, V. (2018). Hashtag# Perspicacity of India Region Using Scalable Big Data Infrastructure Using Hadoop Environment. In: Dey, N., Babo, R., Ashour, A., Bhatnagar, V., Bouhlel, M. (eds) Social Networks Science: Design, Implementation, Security, and Challenges. Springer, Cham. https://doi.org/10.1007/978-3-319-90059-9_4
DOI: https://doi.org/10.1007/978-3-319-90059-9_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-90058-2
Online ISBN: 978-3-319-90059-9
eBook Packages: Computer Science, Computer Science (R0)