Wikipedia edit networks (tutorial)
The edit network associated with the history of Wikipedia pages is a network whose nodes are the page(s) and all contributing users and whose edges encode time-stamped, typed, and weighted interaction events (edit events) between users and pages and between users and users. Specifically, edit events encode the exact time when an edit has been done along with one or several of the following types of edit interaction:
- the amount of new text that a user adds to a page;
- the amount of text that a user deletes (along with the other user/s that has/have previously added this text);
- the amount of previously deleted text that a user restores (along with the users that previously deleted and the ones that originally added the text).
Together these edit events form a highly dynamic network revealing the emergent collaboration structure among contributing users. For instance, it can be derived
- who are the users that contributed most of the text;
- what are the implicit roles of users (e.g., contributors of new content, vanalism fighters, watchdogs);
- whether there are opinion groups, i.e., groups of users that mutually fight against each others edits.
This tutorial is a practically oriented "how-to"-guide giving an example based introduction to the computation, analysis, and visualization of Wikipedia edit networks. More background can be found in the papers cited in the references. To follow the steps outlined here (or to do a similar study) you should download WikiEvent - a small graphical java software with which the Wikipedia edit networks can be computed.
How to download the edit history?
Computing the edit network
The structure of edit network data
Analysis and visualization of edit networks
Statistical modeling of edit event networks
The discussion network
References
Published papers that propose and/or make use of Wikipedia edit networks include the following.
- Jürgen Lerner, Ulrik Brandes, Patrick Kenis, and Denise van Raaij: Modeling Open, Web-based Collaboration Networks: The Case of Wikipedia. In Markus Gamper, Linda Reschke, Michael Schönhuth (Eds.): Knoten und Kanten 2.0, pp 141-162. transcript-Verlag, 2012.
- Jürgen Lerner, Patrick Kenis, Denise van Raaij and Ulrik Brandes: Will they stay or will they go? How network properties of WebICs predict dropout rates of valuable Wikipedians. European Management Journal, 29(5):404-413, 2011.
- Ulrik Brandes, Patrick Kenis, Jürgen Lerner, and Denise van Raaij: Network Analysis of Collaboration Structure in Wikipedia. Proc. 18th Intl. World Wide Web Conference (WWW 2009).
More technical details about the computation of Wikipedia edit networks can be found in
- Ulrik Brandes, Patrick Kenis, Jürgen Lerner, and Denise van Raaij: Computing Wikipedia Edit Networks. Technical Report, 2009.