Abstract
This paper presents research on predicting the future editing activity of Wikipedia editors. We quantify the importance of each characteristic used for prediction and clarify which characteristics affect it. Clarifying this can help the Wikimedia Foundation (WMF) understand editors' behavior. We evaluate the importance of a characteristic by the increase in prediction error, and we compute this importance using random forest (RF) regression. Our evaluation shows that the past number of edits and the editing period improve predictive accuracy. Furthermore, information about earlier edit actions clearly contains factors that determine future edit actions.
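The idea of scoring a characteristic by the increase in prediction error can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the increase is obtained by randomly permuting one characteristic at a time (the usual approach for random forests), it uses synthetic data, and the `predict` function is a hypothetical stand-in for a trained RF regressor. The column names (past edit count, editing period, noise) are illustrative only.

```python
import random

def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Score each column of X by the mean increase in MSE after shuffling it."""
    rng = random.Random(seed)
    baseline = mse(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            # Shuffle column j only, breaking its link to the target.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            permuted = mse(y, [predict(row) for row in X_perm])
            increases.append(permuted - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances

# Synthetic data (assumption): columns are past edit count, editing period
# in days, and a pure-noise column that has no effect on the target.
data_rng = random.Random(42)
X = [[data_rng.uniform(0, 100), data_rng.uniform(0, 365), data_rng.uniform(0, 1)]
     for _ in range(200)]
y = [0.8 * past + 0.1 * period for past, period, _noise in X]

def predict(row):
    """Hypothetical stand-in for a trained random forest regressor."""
    return 0.8 * row[0] + 0.1 * row[1]

scores = permutation_importance(predict, X, y)
for name, score in zip(["past_edits", "editing_period", "noise"], scores):
    print(f"{name}: {score:.2f}")
```

A characteristic the model relies on yields a large error increase when shuffled, while one the model ignores yields an increase of zero. In a real random forest the error would typically be measured on out-of-bag samples rather than on the training data used here.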
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Yoshida, Y., Ohwada, H. (2012). Identifying Important Factors for Future Contribution of Wikipedia Editors. In: Richards, D., Kang, B.H. (eds) Knowledge Management and Acquisition for Intelligent Systems. PKAW 2012. Lecture Notes in Computer Science, vol 7457. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32541-0_25
DOI: https://doi.org/10.1007/978-3-642-32541-0_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-32540-3
Online ISBN: 978-3-642-32541-0
eBook Packages: Computer Science, Computer Science (R0)