Women’s Forums on the Dark Web

Dark Web

Part of the book series: Integrated Series in Information Systems (ISIS, volume 30)

Abstract

With the advent of Web 2.0, more and more women participate in and exchange opinions through community-based social media on the Internet, raising questions about gender differences in online communication. In this study, we develop a feature-based text classification framework to examine gender differences between female and male posters on web forums by analyzing their writing styles and topics of interest. We examine the performance of different feature sets in an experiment involving political opinions. The results of our experimental study on an Islamic women’s political forum show that feature sets containing both content-free and content-specific features perform significantly better than those consisting of only content-free features. In addition, feature subset selection can significantly improve the classification results. Female and male participants were found to have significantly different topics of interest in our study.
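
To make the distinction between content-free and content-specific features more concrete, the following is a minimal Python sketch, not the chapter's implementation. The function-word list, the feature names, the four toy posts, and the use of information gain as the subset selection criterion are all assumptions made for illustration; the abstract does not specify any of them.

```python
import math
import re
from collections import Counter

# Hypothetical function-word list; the chapter's actual lexicon is not given here.
FUNCTION_WORDS = ("the", "and", "of", "to", "i", "you", "we", "not")


def tokenize(text):
    """Lowercase word tokens (letters and apostrophes only)."""
    return re.findall(r"[a-z']+", text.lower())


def content_free_features(text):
    """Style features that do not depend on the topic of the message."""
    words = tokenize(text)
    n = max(len(words), 1)
    feats = {
        "cf:avg_word_len": sum(len(w) for w in words) / n,
        "cf:exclamations": text.count("!") / n,
        "cf:questions": text.count("?") / n,
    }
    counts = Counter(words)
    for fw in FUNCTION_WORDS:
        feats["cf:fw_" + fw] = counts[fw] / n
    return feats


def content_specific_features(text):
    """Topic features: presence of each word that is not a function word."""
    return {"cs:" + w: 1.0 for w in set(tokenize(text)) if w not in FUNCTION_WORDS}


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(values, labels):
    """Gain from splitting on whether a feature value is above its mean."""
    mean = sum(values) / len(values)
    above = [l for v, l in zip(values, labels) if v > mean]
    below = [l for v, l in zip(values, labels) if v <= mean]
    remainder = sum(
        len(part) / len(labels) * entropy(part) for part in (above, below) if part
    )
    return entropy(labels) - remainder


def select_features(rows, labels, k):
    """Return the k feature names with the highest information gain."""
    names = sorted({name for row in rows for name in row})
    scored = [
        (information_gain([row.get(name, 0.0) for row in rows], labels), name)
        for name in names
    ]
    # Highest gain first; ties are broken alphabetically by feature name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:k]]


# Invented toy posts with invented gender labels, for illustration only.
posts = [
    ("We must support our families and children!", "F"),
    ("My children and I pray for our families.", "F"),
    ("The government policy on war is wrong.", "M"),
    ("War and government policy are not the answer.", "M"),
]
rows = [{**content_free_features(t), **content_specific_features(t)} for t, _ in posts]
labels = [g for _, g in posts]
top = select_features(rows, labels, k=5)
print(top)
```

In this sketch the combined feature set is simply the union of the two dictionaries, and subset selection keeps only the highest-scoring features before any classifier is trained. A real experiment would use far larger feature sets and a held-out evaluation.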

Acknowledgments

This material is based upon work supported by the National Science Foundation under Grant No. CNS-0709338, “(CRI: CRD) Developing a Dark Web Collection and Infrastructure for Computational and Social Sciences.” We would also like to thank Dr. Katharina von Knop for her helpful suggestions and comments about our research test bed.

Author information

Corresponding author

Correspondence to Hsinchun Chen.

Copyright information

© 2012 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Chen, H. (2012). Women’s Forums on the Dark Web. In: Dark Web. Integrated Series in Information Systems, vol 30. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-1557-2_19

  • DOI: https://doi.org/10.1007/978-1-4614-1557-2_19

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-1-4614-1556-5

  • Online ISBN: 978-1-4614-1557-2

  • eBook Packages: Computer Science (R0)
