Event networks (tutorial)

Revision as of 12:04, 3 August 2012

Note: this tutorial documents visone functionality that will be in the next release (around September 2012).

The links in an event network encode time-stamped interaction among actors, for instance, users sending emails to other users. There is an important difference from networks of relational states, such as friendship networks. To illustrate the difference: when two actors are friends at some instant in time, then - if nothing happens in between - they are still friends in the very near future. In contrast, if someone sends an email to another person at some instant in time, then he/she does not necessarily send an email to the same person in the very next instant. Stated otherwise, relations like friendship have inertia (something has to happen to change them), while relational events mark time points of interaction.

This tutorial is a practically oriented, example-based "how-to" guide illustrating the import, transformation, visualization, and analysis of event networks with visone. More background on event networks can be found in

Carter T. Butts: A Relational event framework for social action. Sociological Methodology 38(1):155-200, 2008.
Ulrik Brandes, Jürgen Lerner, and Tom A. B. Snijders: Networks Evolving Step by Step: Statistical Analysis of Dyadic Event Data. Proc. 2009 Intl. Conf. Advances in Social Network Analysis and Mining (ASONAM 2009), pp.200-205. IEEE Computer Society, 2009.

and in other papers linked in the references.

Please address questions and comments about this tutorial to me (Jürgen Lerner).

Example data: networks of political conflict and cooperation

As an illustration, this tutorial uses networks of events among political actors that have been collected by the Penn State Event Data Project (formerly Kansas Event Data System). Specifically, we use data encoding events in or around the Persian Gulf region from 1979 to 1999. This data set is described in and linked from the page on Penn State Event Data. To follow the steps outlined in this tutorial you should download the file Gulf_events_preprocessed.zip.

Another specific application area for event networks - using different example data - is treated in the tutorial on Wikipedia edit networks.

Importing event networks

visone can import event lists from comma-separated-value (CSV) files. These files must contain a header in the first line (giving the column labels) followed by any number of lines, each of which encodes one event. For instance, some lines in the example file look like this.

 "WEIS.code";"Time";"Source";"Target";"Description";"Goldstein.weight";"Type"
 ...
 222;980213;"ISR";"WES";"NONMIL DESTR";-8.7;"conflict"
 223;920717;"SYR";"ISR";"MIL ENGAGEME";-10;"conflict"
 ...

To open such a file, click on open in the file menu, select files of type: event list files (.csv, .txt), navigate to the file that you want to open, and click on ok. In the import options dialog (see below) you have to specify the character that separates the different entries in each line - this is the semicolon (;) in our example file - and a character enclosing text (if any) - this is the double quotes (") in our example file.

To find out the right settings you can look at the file tab in the import options dialog showing you part of the input file.
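To inspect such a file outside visone, the same separator and quote settings can be tried with Python's standard csv module. This is only a sketch of the file format, not part of visone; the inline sample reproduces the header and the two data lines shown above, and the function name read_events is made up for this illustration.

```python
import csv
import io

# Header and two event lines copied from the example file excerpt above.
SAMPLE = '''"WEIS.code";"Time";"Source";"Target";"Description";"Goldstein.weight";"Type"
222;980213;"ISR";"WES";"NONMIL DESTR";-8.7;"conflict"
223;920717;"SYR";"ISR";"MIL ENGAGEME";-10;"conflict"
'''

def read_events(text):
    """Parse a semicolon-separated event list into dicts keyed by column label."""
    # delimiter and quotechar correspond to the two import options in the dialog.
    reader = csv.DictReader(io.StringIO(text), delimiter=";", quotechar='"')
    return list(reader)

events = read_events(SAMPLE)
for event in events:
    # All fields are read as strings; numeric conversion is a separate step.
    print(event["Time"], event["Source"], event["Target"], event["Goldstein.weight"])
```

If the wrong separator is chosen, each line ends up in a single column, which is easy to spot when looking at the parsed rows.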

visone can now read the various entries of the input file, and you have to specify in the dialog EventNetwork specification (shown below) how these should be mapped to the resulting network. Concretely, you have to specify how the various components of an event are encoded in the file (Event format tab); how to iterate over the network sequence (Event iterator tab); how the events are mapped to the network's link attributes (Event network tab); and, if desired, which statistics should be computed while constructing the event network (Eventnet statistics tab). Fill out the tabs in the order in which they are numbered in the dialog, since the choices available in later tabs depend on earlier settings. If you change something in one tab, you have to set the values in the later tabs again.

Event format

In the event format tab (see the image below) you first have to specify which columns of the input file hold the information about the five components of an event (these are source, target, time, type, and weight). In our example, you can set the values as in the image below. The meaning of the five components is explained in the following.

SOURCE The source actor is the one who initiates the event.
TARGET The target actor is the one who receives the event.
TIME The time denotes when the event happened. visone supports a wide range of time encodings - from numeric times to strings representing calendar date and time in more common or less common formats. Furthermore, a time unit can be specified that defines the precision of the time variable.
TYPE The event type is a categorical variable specifying what happened. In our example, there are different choices for event types. One possibility is the rather coarse distinction between cooperative (positive) and conflictive (negative) events. Another possibility is to distinguish between all of the more than 100 different WEIS event types. An intermediate possibility (and this is what we do in the following) is to use just the distinction between conflict and cooperation but to distinguish quantitatively between "strong" events and "weak" events by the event weight. For instance, the use of military force counts as more serious than a warning, even though both are conflictive events.
WEIGHT The event weight is a numeric variable quantifying the intensity of the event with respect to the event type (see the example above). For instance, military engagement has a weight of -10.0 while warnings have a weight of -3.0.

After these five components have been chosen, visone needs some information about the interpretation of time. The first choice is between numeric time (if the time fields are integer numbers) and calendar time (if the time fields can be turned into a date/time by a pattern specified below). We have calendar time in our example.

If time is given by calendar, a time format pattern has to be specified. visone proposes some known patterns - among others the pattern yyMMdd, which is appropriate for the KEDS event times. (This pattern implies that there are two digits for the year, followed by two digits for the month, followed by two digits for the day of the month; for instance, 940930 for September 30, 1994.) If date/time is formatted differently, you can enter a pattern other than the proposed ones in the text field (see the webpage on the Java class SimpleDateFormat for guidance). visone assists you in finding the right pattern by showing some date/time strings as they appear in the file and - whenever you select a date format pattern - the dialog shows you the current time formatted by the specified pattern.
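As a rough cross-check outside visone, the pattern yyMMdd can be mimicked in Python with the strptime directives %y%m%d. This is a sketch, not visone's own parser, and the function name parse_keds_time is made up here. Java's SimpleDateFormat and Python resolve two-digit years by different rules, so the two should only be expected to agree for dates like those in this data set (1979 to 1999).

```python
from datetime import datetime, date

def parse_keds_time(value):
    """Parse a time stamp such as '940930' (two-digit year, month, day of month)."""
    # %y = two-digit year, %m = month, %d = day; Python maps 69-99 to 1969-1999.
    return datetime.strptime(value, "%y%m%d").date()

print(parse_keds_time("940930"))  # 1994-09-30
```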

Finally, you have to specify a time unit. If time is numeric you have to enter an integer in the textfield. If time is given by calendar you can select a "natural" time unit from Millisecond to Year. An appropriate time unit makes the iteration over the event sequence (and potentially the decay of link attributes over time) more intuitive. When computing event network statistics, events that happen within the same time unit are treated as independent of each other. The time of the KEDS events is given by the day. Thus, appropriate time units are DAY or coarser.

Note that the only required information is the columns containing the source and target - for the other components you can take default values (by selecting <implied> instead of a column header). The default value for the event type is the string EVENT (taking this default type means that there is no variation in event types - all have the same type); the default weight is equal to 1.0; the default event time is the row number in the input file (so that only the order of events is taken into account).

When all settings in the event format tab are done, you can create the list of events by clicking on the Apply (create events) button. A message informs about the number of events and the number of time units from the first to the last event. (The events are sorted in ascending order by time after reading them - thus, it is not necessary that the events are ordered by time in the input file.)
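The defaults and the sorting step described above can be summarized in a small Python sketch. It is an illustration under assumptions, not visone's implementation: the class Event, the function make_events, the dictionary keys, and the choice to start row numbers at 1 are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Event:
    source: str
    target: str
    time: int             # numeric time; defaults to the row number
    type: str = "EVENT"   # default type: no variation in event types
    weight: float = 1.0   # default weight

def make_events(rows):
    """Build events from dicts; only 'source' and 'target' are required keys."""
    events = []
    for row_number, row in enumerate(rows, start=1):  # numbering from 1 is an assumption
        events.append(Event(
            source=row["source"],
            target=row["target"],
            time=row.get("time", row_number),
            type=row.get("type", "EVENT"),
            weight=row.get("weight", 1.0),
        ))
    # Sort ascending by time; sorted() is stable, so ties keep their file order.
    return sorted(events, key=lambda e: e.time)

# Rows given out of time order are sorted after reading.
rows = [
    {"source": "ISR", "target": "WES", "time": 980213, "type": "conflict", "weight": -8.7},
    {"source": "SYR", "target": "ISR", "time": 920717, "type": "conflict", "weight": -10.0},
]
events = make_events(rows)

# With only source and target given, all other components take default values.
minimal = make_events([{"source": "A", "target": "B"}, {"source": "B", "target": "C"}])
```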

Event iterator

In the event iterator tab (see below) you have to specify the start and end time of the time interval to be processed and the delay between network snapshots.

When the events have been created after filling out the event format tab (see the preceding section), visone suggests the time of the first event as start time and the time of the last event as end time. If you don't want to process the whole event sequence you can increase the start time and/or decrease the end time. After clicking on the upper Apply / get info button, visone informs you about the number of events and time units in the specified subsequence. You might just take all events by not changing the interval borders; this includes all events from April 15, 1979 to March 31, 1999 - as can be seen in the dialog.

Then you have to choose the time points when a network snapshot is to be created by specifying the delay between snapshots. You can see in the dialog that the event sequence spans more than 7,200 time units (i.e., days with the current settings) which is almost 20 years. The number of snapshots must be small (some 10 or 20 snapshots are ok), since they are all opened in a new tab in visone. When we want to create a snapshot once a year we specify create snapshots after every 365 time unit(s). (The number of snapshots is then 20.) visone always creates one snapshot at the end of the event sequence - even if the waiting time is less than the specified number.
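The snapshot count quoted above can be reproduced with a short Python sketch. It assumes that snapshots are counted in time units after the first event and that the extra final snapshot is added only when the end does not coincide with a regular one; the function name snapshot_offsets is made up, and visone's internal counting may differ in detail.

```python
from datetime import date

def snapshot_offsets(span, delay):
    """Offsets (in time units after the start) at which snapshots are created."""
    # One snapshot after every `delay` time units ...
    offsets = list(range(delay, span + 1, delay))
    # ... plus one at the end of the sequence, even if less than `delay` has passed.
    if not offsets or offsets[-1] != span:
        offsets.append(span)
    return offsets

# Time span of the example data with DAY as the time unit.
span_days = (date(1999, 3, 31) - date(1979, 4, 15)).days
offsets = snapshot_offsets(span_days, 365)
print(span_days, len(offsets))  # 7290 20
```

Nineteen snapshots fall on multiples of 365 days, and the twentieth is the final one at the end of the sequence.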

Event network

The tab to specify the event network is the most important one - here you define which link attributes of the event network summarize the past events, how events of various types add to these attributes, and how they change over time.

The first thing to do is to decide on the link attributes. Here you are free to choose any attribute name (preferably one that makes it easy to remember the intuition of the attribute). Furthermore, a halftime - defining how fast attributes decay over time - has to be specified. The halftime has the following effect: when a particular link attribute on a particular dyad (pair of actors) has a value of $x$ at time $t$, then (if no event on the same dyad happens in between) the value is $x/2$ at time $t+halftime$. Intuitively, link attributes with a positive halftime capture recent interaction; the shorter the halftime, the more strongly they emphasize the most recent interaction. A halftime equal to zero or negative indicates that the respective attribute does not decay over time; these attributes capture past interaction irrespective of the elapsed time.
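The halving rule can be illustrated with a small Python sketch. The text above only fixes the value at time $t+halftime$; the sketch assumes exponential decay in between, that is, a value of $x \cdot 2^{-\Delta t/halftime}$ after $\Delta t$ time units, which is consistent with the halving rule but is an assumption about visone's behavior. The function name decayed_value is made up.

```python
def decayed_value(x, elapsed, halftime):
    """Value of a link attribute `elapsed` time units after it was x,
    provided no event happened on the dyad in between."""
    if halftime <= 0:
        # Zero or negative halftime: the attribute does not decay.
        return x
    # Assumed exponential decay: the value halves once per halftime.
    return x * 0.5 ** (elapsed / halftime)

# With a halftime of 365 days, a value of 8 drops to 4 after one year
# and to 2 after two years.
print(decayed_value(8.0, 365, 365), decayed_value(8.0, 730, 365))
```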

In our concrete example we choose the following link attributes that all have a halftime of (approximately) one year.

An attribute cooperation sums up the weights of past cooperative events.
The link attribute conflict is similar and sums up past conflictive events. This attribute will also be non-negative; that is, a higher value means more past/recent conflict. (See later how this is achieved.)
Interaction sums up the strength of past events - irrespective of whether these are cooperative or conflictive.
Interaction (unweighted) sums up the number of past events - irrespective of whether these are cooperative or conflictive and irrespective of their weight.
Finally cooperation-conflict sums up the (positive) weights of cooperative events and the (negative) weights of conflictive events. This attribute is positive on dyads that have more cooperative events (or cooperative events with higher weights) and it is negative on dyads on which there are more conflictive events (or more serious conflictive events).

When the link attributes are added (e.g., click on the Add / update all button) you have to specify how the events contribute to them. Clicking on Create weight-function table builds a table that has one row for each link attribute and one column for each event type. In the cell indexed by an attribute $attr$ and an event type $t$ you specify the function mapping weights of events of type $t$ to increments of the link attribute $attr$ . In our example, selecting the function Identity in the cell indexed by attribute cooperation and event type cooperation means that whenever an event of type cooperation and weight $w$ happens then you add $w$ to the current value of the cooperation attribute. If we had chosen SquareRoot as the weight function in the same cell, then we would add ${\sqrt {w}}$ to the cooperation attribute whenever an event of type cooperation and weight $w$ happens. The weight-function identifier N/A means that events of that type do not cause any change of the respective attribute. For instance, events of type conflict do not change the attribute cooperation. Note that for the attribute conflict and the type conflict we choose the weight function MinusIdentity; thus, when a conflictive event with weight -10 happens we add the value 10 to the attribute conflict. The settings for all attributes and types can be seen in the above image.
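The weight-function table can be pictured as a lookup from (attribute, event type) to a function. The following Python sketch covers only the two attributes whose settings are spelled out in the text above; the names WEIGHT_FUNCTIONS and apply_event are made up, None stands in for N/A, and decay over time is left out to keep the example short. The cooperative event in the usage lines is a hypothetical one, not taken from the data.

```python
from collections import defaultdict

def identity(w):
    return w

def minus_identity(w):
    return -w

# One entry per cell of the table: (link attribute, event type) -> weight function.
WEIGHT_FUNCTIONS = {
    ("cooperation", "cooperation"): identity,
    ("cooperation", "conflict"): None,            # N/A: no change
    ("conflict", "cooperation"): None,            # N/A: no change
    ("conflict", "conflict"): minus_identity,
}

def apply_event(attributes, source, target, event_type, weight):
    """Add the event's contribution to every link attribute on the dyad."""
    for (attr, etype), func in WEIGHT_FUNCTIONS.items():
        if etype == event_type and func is not None:
            attributes[(source, target)][attr] += func(weight)

# attributes[(source, target)][attribute name] -> current value, starting at 0.0
attributes = defaultdict(lambda: defaultdict(float))
apply_event(attributes, "SYR", "ISR", "conflict", -10.0)
apply_event(attributes, "SYR", "ISR", "cooperation", 3.0)  # hypothetical event

print(dict(attributes[("SYR", "ISR")]))
```

The conflictive event with weight -10 raises the conflict attribute by 10 and leaves cooperation untouched, which keeps the conflict attribute non-negative as described earlier.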

Statistical modeling of the conditional event type or weight

Event statistics

References

Carter T. Butts: A Relational event framework for social action. Sociological Methodology 38(1):155-200, 2008.
Ulrik Brandes, Jürgen Lerner, and Tom A. B. Snijders: Networks Evolving Step by Step: Statistical Analysis of Dyadic Event Data. Proc. 2009 Intl. Conf. Advances in Social Network Analysis and Mining (ASONAM 2009), pp.200-205. IEEE Computer Society, 2009.
