A Comparative Approach for Classification and Combined Cluster Based Classification Method for Tweets Data Analysis

  • Conference paper
Smart Intelligent Computing and Applications

Part of the book series: Smart Innovation, Systems and Technologies (SIST, volume 160)

Abstract

This paper addresses the problem of detecting duplicate tweets on a social network such as Twitter. To screen tweets, we apply a clustering process to text-based classification, and the clustering is applied not only to the test set but also to the training dataset. To assess knowledge discovery and performance, we use both a standalone classifier and a combined clustering-with-classification process to detect spam or duplicate tweets on social media. We ran experiments on tweets with the TSVM/SVM and C-SVM classification approaches to demonstrate the efficiency of our approach. The tests indicate that integrated cluster-based classification performs better than SVM (Support Vector Machine) classification.
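As a rough illustration of the clustering stage described above, the following minimal sketch groups near-duplicate tweets before any classifier is applied. It is not the authors' implementation: the abstract does not name the clustering algorithm, so greedy leader clustering with Jaccard similarity is swapped in as an assumption, and the function names, the 0.6 threshold, and the sample tweets are all hypothetical. Training and test tweets are clustered together, mirroring the statement that clustering is applied to both sets.

```python
import re


def tokens(tweet):
    """Lowercase a tweet and return its set of word-like tokens."""
    return set(re.findall(r"[a-z0-9#@']+", tweet.lower()))


def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_tweets(tweets, threshold=0.6):
    """Greedy leader clustering (an assumed stand-in for the paper's method).

    Each tweet joins the first cluster whose leader is at least `threshold`
    similar; otherwise it becomes the leader of a new cluster.
    Returns one cluster id per tweet, in input order.
    """
    leaders = []  # token set of the first tweet in each cluster
    labels = []
    for tweet in tweets:
        toks = tokens(tweet)
        for cid, leader in enumerate(leaders):
            if jaccard(toks, leader) >= threshold:
                labels.append(cid)
                break
        else:
            leaders.append(toks)
            labels.append(len(leaders) - 1)
    return labels


def cluster_size_feature(labels):
    """Size of the cluster each tweet belongs to; large values hint at duplicates."""
    counts = {}
    for cid in labels:
        counts[cid] = counts.get(cid, 0) + 1
    return [counts[cid] for cid in labels]


# Hypothetical sample data, not taken from the paper.
train = [
    "Win a free phone now, click here",
    "win a FREE phone now click here!",
    "Lovely weather in Chennai today",
]
test = [
    "Win a free phone now, click here today",
    "Reading a paper on support vector machines",
]

labels = cluster_tweets(train + test)
sizes = cluster_size_feature(labels)
```

In a combined setup, a feature such as `sizes` could be appended to each tweet's text features before training a classifier such as an SVM. That classification step is omitted here.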

Abbreviations

SVM:

Support Vector Machine

nu-SVM:

Regression SVM type 2

URL:

Uniform Resource Locator

CK:

Chidambaram

TSVM:

Transductive Support Vector Machine

C-SVM:

Classification SVM type 1

Author information

Corresponding author

Correspondence to V. Laxmi Narasamma.

Copyright information

© 2020 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Laxmi Narasamma, V., Sreedevi, M. (2020). A Comparative Approach for Classification and Combined Cluster Based Classification Method for Tweets Data Analysis. In: Satapathy, S., Bhateja, V., Mohanty, J., Udgata, S. (eds) Smart Intelligent Computing and Applications. Smart Innovation, Systems and Technologies, vol 160. Springer, Singapore. https://doi.org/10.1007/978-981-32-9690-9_33
