Abstract
This paper investigates how Wikibooks authors collaborate to create high-quality books. We combined Information Retrieval and statistical techniques to examine the complete multi-year lifecycle of over 50 high-quality Wikibooks. We found that: (1) the presence of redundant material is negatively correlated with collaboration mechanisms; (2) for most books, over 50% of the content is written by a small core of authors; and (3) use of collaborative tools (predicted pages and talk pages) is significantly correlated with patterns of redundancy. Non-redundant books are well-planned from the beginning and require fewer talk pages to reach high-quality status. Initially redundant books begin with high redundancy, which drops as soon as authors use coordination tools to restructure the content. Suddenly redundant books display sudden bursts of redundancy that must be resolved, requiring significantly more discussion to reach high-quality status. These findings suggest that providing core authors with effective tools for visualizing and removing redundant material may increase writing speed and improve the book’s ultimate quality.
Copyright information
© 2011 IFIP International Federation for Information Processing
About this paper
Cite this paper
Liccardi, I., Chapuis, O., Yeung, CM.A., Mackay, W. (2011). Redundancy and Collaboration in Wikibooks. In: Campos, P., Graham, N., Jorge, J., Nunes, N., Palanque, P., Winckler, M. (eds) Human-Computer Interaction – INTERACT 2011. INTERACT 2011. Lecture Notes in Computer Science, vol 6946. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23774-4_20
DOI: https://doi.org/10.1007/978-3-642-23774-4_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23773-7
Online ISBN: 978-3-642-23774-4
eBook Packages: Computer Science, Computer Science (R0)