Abstract
This paper addresses the problem of detecting duplicate tweets on a social network, specifically Twitter. To screen tweets, we apply a clustering process to text-based classification, and the clustering is applied not only to the test set but also to the training set. To improve knowledge discovery and performance, we combine clustering with a classifier to detect spam or duplicate tweets on social media. We ran experiments with the TSVM/SVM and C-SVM classification approaches on tweets and demonstrated the efficiency of our approach. The tests indicate that the integrated cluster-based classification performs better than SVM (Support Vector Machine) classification.
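As a rough illustration of the clustering stage, the sketch below groups near-duplicate tweets and derives a cluster-size feature that a downstream classifier such as an SVM could consume. It is not the authors' method: the greedy Jaccard clustering, the 0.6 threshold, and the helper names `tokenize`, `jaccard`, `cluster_tweets`, and `cluster_size_feature` are all assumptions made for this example, and the SVM step itself is omitted.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase a tweet, drop URLs and @mentions, return its set of tokens."""
    text = re.sub(r"https?://\S+|@\w+", " ", text.lower())
    return set(re.findall(r"[a-z0-9#]+", text))


def jaccard(a, b):
    """Jaccard similarity of two token sets (1.0 when both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_tweets(tweets, threshold=0.6):
    """Greedy single-pass clustering (an assumed stand-in for the paper's
    clustering step): a tweet joins the first cluster whose seed tweet is
    at least `threshold` similar, otherwise it starts a new cluster."""
    seeds = []   # token set of the first tweet in each cluster
    labels = []  # cluster id assigned to each tweet, in input order
    for tweet in tweets:
        tokens = tokenize(tweet)
        for cluster_id, seed in enumerate(seeds):
            if jaccard(tokens, seed) >= threshold:
                labels.append(cluster_id)
                break
        else:
            seeds.append(tokens)
            labels.append(len(seeds) - 1)
    return labels


def cluster_size_feature(labels):
    """Size of each tweet's cluster; large clusters suggest duplicates or spam."""
    sizes = Counter(labels)
    return [sizes[label] for label in labels]


if __name__ == "__main__":
    sample = [
        "Win a FREE iPhone now http://spam.example/a",
        "win a free iphone now http://spam.example/b",
        "Lovely weather in Chennai today",
    ]
    labels = cluster_tweets(sample)
    print(labels, cluster_size_feature(labels))
```

The same clustering would be run on both the training and the test tweets, as the abstract describes, so that the classifier sees the cluster-derived feature in both phases.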
Abbreviations
- SVM: Support Vector Machine
- nu-SVM: Regression SVM type 2
- URL: Uniform Resource Locator
- CK: Chidambaram
- TSVM: Transductive Support Vector Machine
- C-SVM: Classification SVM type 1
Copyright information
© 2020 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Laxmi Narasamma, V., Sreedevi, M. (2020). A Comparative Approach for Classification and Combined Cluster Based Classification Method for Tweets Data Analysis. In: Satapathy, S., Bhateja, V., Mohanty, J., Udgata, S. (eds) Smart Intelligent Computing and Applications. Smart Innovation, Systems and Technologies, vol 160. Springer, Singapore. https://doi.org/10.1007/978-981-32-9690-9_33
DOI: https://doi.org/10.1007/978-981-32-9690-9_33
Publisher Name: Springer, Singapore
Print ISBN: 978-981-32-9689-3
Online ISBN: 978-981-32-9690-9
eBook Packages: Intelligent Technologies and Robotics (R0)