Implicit Group Membership Detection in Online Text: Analysis and Applications

Ellen, Jeffrey; Kaina, Joan; Parameswaran, Shibin

doi:10.1007/978-3-642-29047-3_27


Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7227)

Included in the following conference series:

International Conference on Social Computing, Behavioral-Cultural Modeling, and Prediction


Abstract

Our thesis is that members of the same group have shared tendencies and nuances in communication style and substance, particularly online. In this paper, we discuss some potential applications of accurate authorship affiliation technology. We also discuss related work in similar author identification efforts and the research issues that currently exist when trying to perform automated authorship affiliation. We provide quantitative results from our recent machine learning experimentation using Support Vector Machines as some initial validation of our theory. In this paper, we applied our work towards the task of classifying website forum posts by the affiliation of their author. We discuss in detail the stylometric features we used to perform the automated classification and split the original features into individual groups to isolate their respective contributions and/or discriminating capability. Our results show promise towards automating group representation, an important first step in studying group formation.
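The idea of splitting stylometric features into groups so that each group's contribution can be measured separately can be sketched as follows. This is a minimal illustration only: the abstract does not list the features used in the paper, so the three groups, the function-word list, and all function names below are assumptions chosen for the example.

```python
import re
import string
from collections import Counter

# Hypothetical function-word list; the paper's actual feature set
# is not given in the abstract.
FUNCTION_WORDS = ("the", "and", "of", "to", "in", "that", "is", "it")


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def lexical_features(text):
    """Word-level statistics: average word length and vocabulary richness."""
    words = tokenize(text)
    n = len(words) or 1  # avoid division by zero on empty posts
    return {
        "avg_word_len": sum(len(w) for w in words) / n,
        "type_token_ratio": len(set(words)) / n,
    }


def function_word_features(text):
    """Relative frequency of each function word in the post."""
    words = tokenize(text)
    n = len(words) or 1
    counts = Counter(words)
    return {f"fw_{w}": counts[w] / n for w in FUNCTION_WORDS}


def punctuation_features(text):
    """Character-level ratios for punctuation and upper-case letters."""
    n = len(text) or 1
    return {
        "punct_ratio": sum(c in string.punctuation for c in text) / n,
        "upper_ratio": sum(c.isupper() for c in text) / n,
    }


# Each group can be switched on or off independently (illustrative grouping).
FEATURE_GROUPS = {
    "lexical": lexical_features,
    "function_words": function_word_features,
    "punctuation": punctuation_features,
}


def extract(text, groups=None):
    """Return one flat feature dict built from the selected groups only."""
    selected = groups if groups is not None else list(FEATURE_GROUPS)
    features = {}
    for name in selected:
        features.update(FEATURE_GROUPS[name](text))
    return features
```

Calling `extract(post, groups=["lexical"])` yields only the lexical features, so a classifier (for example a Support Vector Machine) could be trained once per group and once on all groups combined to compare their discriminating capability.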



Author information

Authors and Affiliations

Space and Naval Warfare Systems Center Pacific, United States Navy, San Diego, CA, USA
Jeffrey Ellen, Joan Kaina & Shibin Parameswaran


Editor information

Editors and Affiliations

Department of Computer Engineering, Rochester Institute of Technology, 14623, Rochester, New York, USA
Shanchieh Jay Yang
Applied Physics Laboratory, Research and Exploratory Development Department, Johns Hopkins University, 20723, Laurel, MD, USA
Ariel M. Greenberg
SA Technologies, 3750 Paladian Village Drive, Building 600, 30066, Marietta, GA, USA
Mica Endsley


About this paper

Cite this paper

Ellen, J., Kaina, J., Parameswaran, S. (2012). Implicit Group Membership Detection in Online Text: Analysis and Applications. In: Yang, S.J., Greenberg, A.M., Endsley, M. (eds) Social Computing, Behavioral-Cultural Modeling and Prediction. SBP 2012. Lecture Notes in Computer Science, vol 7227. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29047-3_27


DOI: https://doi.org/10.1007/978-3-642-29047-3_27
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-29046-6
Online ISBN: 978-3-642-29047-3
eBook Packages: Computer Science, Computer Science (R0)
