Dark Web, pp. 105–126

Interactional Coherence Analysis

  • Hsinchun Chen
Part of the Integrated Series in Information Systems book series (ISIS, volume 30)


Despite the rapid growth of text-based computer-mediated communication (CMC), its limitations have rendered the medium highly incoherent. The lack of coherence in CMC poses problems for content analysis of online discourse archives. Interactional coherence analysis (ICA) attempts to accurately identify and construct interaction networks of CMC messages. Although significant progress has been made, ICA research still has several limitations. Most previous ICA approaches used either system or linguistic features, but not both in conjunction, and also failed to address noise issues such as typos, misspellings, and idiosyncratic system usage behavior. Moreover, Web forums have seldom been studied for interactional coherence in spite of their prevalence. In this study, we propose the Hybrid Interactional Coherence (HIC) algorithm for identification of Web forum interaction. HIC utilizes both system features, such as header information and quotations, and linguistic features, such as direct address and lexical relation. Furthermore, several similarity-based methods, including a Lexical Match Algorithm (LMA) and a sliding window method, are utilized to account for interactional idiosyncrasies. Experiments were conducted on a large domestic extremist Web forum to compare the algorithm with traditional linkage and similarity-based methods. HIC significantly outperformed both comparison techniques in terms of precision, recall, and F-measure at both the forum and thread levels. The results demonstrate the effectiveness of HIC for identifying Web forum interaction.
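The general idea of combining a system feature with a linguistic feature, and of scoring the resulting links, can be illustrated with a minimal Python sketch. This is not the HIC algorithm itself, whose rules are not given in the abstract. The message fields (`author`, `body`, `quote`), the window size, the 0.8 similarity threshold, and the use of `difflib.SequenceMatcher` as a stand-in for the Lexical Match Algorithm are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def similar(a, b, threshold=0.8):
    """Fuzzy match that tolerates typos; the threshold is an assumed value."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def find_parent(index, messages, window=5):
    """Return the index of the message that messages[index] appears to reply to, or None."""
    msg = messages[index]
    # Sliding window: only the preceding `window` messages are candidates, nearest first.
    candidates = range(index - 1, max(index - window, 0) - 1, -1)

    # System feature: quoted text found in an earlier message body.
    if msg.get("quote"):
        for j in candidates:
            if msg["quote"].lower() in messages[j]["body"].lower():
                return j

    # Linguistic feature: direct address, i.e. a token resembling an earlier author's name.
    tokens = [tok.strip(",.:!?@") for tok in msg["body"].split()]
    for j in candidates:
        if any(similar(tok, messages[j]["author"]) for tok in tokens):
            return j
    return None


def precision_recall_f(predicted, gold):
    """Score predicted (reply, parent) links against a hand-labelled gold set."""
    tp = len(predicted & gold)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Hypothetical thread; message 2 misspells the screen name "Eagle" as "Egle".
messages = [
    {"author": "Eagle", "body": "The meeting is on Saturday.", "quote": None},
    {"author": "Wolf", "body": "I heard it was moved.", "quote": None},
    {"author": "Hawk", "body": "Egle, where did you read that?", "quote": None},
    {"author": "Eagle", "body": "On the main page.", "quote": "where did you read that"},
]

predicted = set()
for i in range(len(messages)):
    parent = find_parent(i, messages)
    if parent is not None:
        predicted.add((i, parent))

gold = {(1, 0), (2, 0), (3, 2)}  # assumed manual annotation
precision, recall, f_measure = precision_recall_f(predicted, gold)
```

In this toy thread the quotation rule links message 3 to message 2, and the fuzzy name match links message 2 to message 0 despite the misspelling. Message 1 carries neither cue and stays unlinked, which lowers recall but not precision.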


Linguistic Feature; Direct Address; Head Information; Slide Window Method; Interactional Coherence



This research was funded in part by the following grant: NSF Information and Data Management, “SGER: Multilingual Online Stylometric Authorship Identification: An Exploratory Study,” August 2006–August 2007.



Copyright information

© Springer Science+Business Media, LLC 2012

Authors and Affiliations

  1. Department of Management Information Systems, University of Arizona, Tucson, USA
