Difference between revisions of "RSiena (tutorial)"

From visone manual
 
(17 intermediate revisions by the same user not shown)
Line 2: Line 2:
  
 
To follow the steps illustrated in this tutorial you should download the file [[Media:Classroom_graphmls.zip|Classroom_graphmls.zip]] and extract (unzip) its content (consisting of the network files <code>classroom_graph1.graphml</code> to <code>classroom_graph4.graphml</code>) to your hard disk. These files constitute the longitudinal network data explained on page [[Knecht_Classroom_(data)|Knecht Classroom (data)]].
 
 +
 +
You find all RSiena functionality on the '''modeling''' tab on the right side of the visone window by selecting ''siena'' in the '''modeling''' drop-down list. Note that the longitudinal network data to be analysed must first be specified as described in the following paragraphs.
  
 
==Defining longitudinal network data==
 
Line 18: Line 20:
 
[[File:Collection_manager_2.jpg|thumb|250px]]
 
 
The right table '''available networks''' lists all networks that can be added to the selected network collection (the selected collection is indicated by the blue background). These are basically all currently open networks. Select one of them by clicking on its name and press '''<- add''' to add it to the selected collection. The table '''networks in collection''' shows all networks contained so far in the currently selected collection. Note that the top-down order in this table determines the order of the networks in the network collection and therefore has to correspond to their temporal order in the longitudinal network. If the current order is not as you want it, you can rearrange it by removing networks from the collection (click on their name and press '''remove ->''') and adding them again, which positions them at the very end of the collection.
 +
 +
To follow this tutorial, add the networks <code>classroom_graph1.graphml</code> through <code>classroom_graph4.graphml</code> (in this order) to a network collection named ''classroom'' and set this new collection as active.
  
 
Visone now knows which networks belong to the collection and in which order, but it does not know which nodes in the different networks correspond to each other, i.e., represent the same actor at different moments in time. This information has to be given by specifying an [[Identifying_attribute|identifying attribute]]. Candidates for the identifying attribute are node attributes that are defined in all networks included in the collection. Furthermore, they have to assign a unique value to each node in a network, i.e., there must not be two nodes in the same network with the same value of the attribute. Across the networks in the collection, nodes with the same value of the identifying attribute are identified with each other.
Line 34: Line 38:
 
[[File:covariates.jpg|thumb|300px]]
 
  
You get to the siena functionalities by choosing on the \texttt{analyis} tab the \texttt{task} "siena". \\
+
As mentioned before, you find the RSiena functionalities on the '''modeling''' tab on the right side of the visone window where you select ''siena'' in the '''modeling''' drop-down list to get to the '''data specification''' tab.
Here, all actor and dyadic attributes that are available for the active network collection (\ref{sec:network_collection}) are listed. Specify for all attributes whether and as which type (e.g., constant or changing covariate) they should be regarded for analysis. The available types for each attribute depend on the sort of data. For instance, a node attribute (\ref{subsec:node_attributes}) that is not integer but decimal, cannot be used as a siena behavior attribute. Also, if a network collection contains only two observations, no node attribute can be used as a changing covariate (see \ref{rs-10} or \ref{ssb-10}).
+
The '''data specification''' tab offers you a list of all node attributes and dyad attributes that are defined in the first network of the active network collection and can be used as exogenous covariates in the model.
 +
 
 +
As you see in the graphic on the right, each attribute name is associated with a drop-down list that contains a choice of covariate types. This list contains exactly those covariate types that the corresponding attribute could represent, depending on the following conditions:
 +
- an attribute may represent a '''constant covariate''' if it is defined in the first network of the collection. In this case, the attribute values in the first network are assumed to be the values of the constant covariate in the siena model even if the attribute values in the other networks of the collection differ from the values in the first network.
 +
- an attribute may represent a '''changing covariate''' if it is defined in all but the last network of the collection. It can also be defined in the last network, but this is not mandatory because covariates of the last network have no influence on the modeling anyway. Hence, in a network collection that consists of only two networks no changing covariate can be defined.
 +
 
 +
For node attributes, an additional choice '''behavior''' might be available. If you select this option, the node attribute will not represent an exogenous covariate but will be treated (together with the network) as a dependent variable that is modeled itself.
 +
- a node attribute may represent a '''behavior variable''' if it is defined in all networks of the collection and it is of type ''integer'' (You can check and possibly change the ''type'' of an attribute in the attribute manager under configurations)
 +
 
 +
Specify for all attributes which type of covariate they should define or whether they should not be included in the model (by selecting ''ignore'').
 +
 
 +
To follow this tutorial set ''gender'' as constant individual covariate and ''primary'' as constant dyadic covariate.
  
 
==Specifying missing data or structurally fixed values==
 
  
 
==Model specification and estimation==
 
For specifying your model, press specify model on the siena tab. The following dialogue will open.
+
After the data specification (networks, behavior variables, covariates, missing data, structurally fixed values) is complete, the model can be specified. To do so, visone provides the '''model specification''' dialogue, which you open by pressing the '''specify model''' button at the bottom of the '''siena modeling''' tab.
  
[[File:modelspecification.jpg|250px]]
+
[[File:modelspecification.jpg|600px]]
  
On the left all effects are listed that are available for the active network collection
+
The left list in this dialogue contains all effects (if you do not know what ''effect'' means in this context, see the [http://www.stats.ox.ac.uk/~snijders/siena/SnijdersSteglichVdBunt2009.pdf tutorial] on stochastic actor-oriented models) that are available for the active network collection with the individual and dyadic covariates specified above. For instance, we find the covariate-related effect ''primary'', which takes into account the influence of having attended the same primary school on the existence of a network tie. This effect would not be in the list of available effects if we had not set ''primary'' as a dyadic covariate. Likewise, the actor-covariate-related effects ''gender ego'', ''gender alter'', and ''same gender'' can be added to the model only because we included ''gender'' as an individual covariate.
with the above specified actor and dyadic attributes.
 
The right table contains effects that are currently included in the model. Shift effects
 
from left to right and vice versa to specify your effect selection.
 
For the estimation process, you can choose whether standard values or current parameter
 
values shall be used as initial values by ticking the corresponding checkbox. You can
 
also set the number of subphases in phase 2 and the number of iterations in phase 3 (see
 
[1]) by shifting the corresponding sliders.
 
  
[[File:estimates.jpg|thumb|250px]]
+
An available effect can be added to the model by selecting it (i.e., clicking on its name in the left list) and pressing the '''>>''' button on the right side of the list. Immediately, the effect name disappears from the left list and appears in the right table, which contains all effects that are currently included in the model. If you want to exclude an already included effect from the model, select it in the right table and press '''<<'''.
  
Parameter estimation starts when button estimate is pressed. The estimation progress
+
By ticking checkbox '''use standard initial values''' it can be set whether standard values or current parameter values shall be used as initial values in the estimation process. Furthermore, the number of subphases in the parameter estimation phase (''phase 2'') and the number of iterations in the standard error estimation phase (''phase 3'') (see the [http://www.stats.ox.ac.uk/~snijders/siena/RSiena_Manual.pdf RSiena manual]) can be set by shifting the corresponding sliders.
can be monitored in the Rserve (see 6) window (console). When the estimation is
+
 
finished, the results are displayed in the right table of the model specification dialogue.
+
You start the parameter estimation by pressing button '''estimate'''. The estimation progress can be monitored in the ''Rserve'' window (console). When the estimation is finished, the results are displayed in the right table of the model specification dialogue.
For each included effect its estimate, associated standard error and t-statistic (that
+
 
indicates the convergence of the estimation process, see [1]) are given. The p-values
+
[[File:estimates.jpg|700px]]
assume the null hypothesis that respective parameter values are 0 and are calculated by
+
 
 +
For each included effect its estimated '''parameter value''', associated '''standard error''', and '''t-statistic''' (that
 +
indicates the convergence of the estimation process, see the [http://www.stats.ox.ac.uk/~snijders/siena/RSiena_Manual.pdf RSiena manual]) are displayed. The '''p-value''' assumes the null hypothesis that the respective parameter value is 0 and is computed by
 
the R command 2*pnorm(-abs(parameter estimates/standard errors)).
 
It is also possible to test or fix certain effects by ticking the corresponding checkboxes.
+
It is also possible to test or fix certain effects by ticking the corresponding checkboxes.
When an effect was tested, its p-value results from the score-type test as described in
+
When an effect was tested, its p-value results from the score-type test as described in the [http://www.stats.ox.ac.uk/~snijders/siena/RSiena_Manual.pdf manual].
[1].
+
 
When the user closes the model specification dialogue, he has the possibility to save the
+
When the '''model specification''' dialogue is closed, the possibility to save the
RSiena output file. This file contains the usual RSiena output for estimation results.
+
RSiena output file is offered. This file contains the standard RSiena-output for estimation results.
  
 
==Simulation==
 
 +
 +
It is also possible to simulate networks based on model predictions. Pressing the simulate button simulates a number of networks based on the current model specification and parameter estimates. The number of simulated networks equals the number of iterations in phase 3 as set by the user. For each pair of actors, the fraction of these simulations in which the two actors are linked is calculated. The resulting ''tie probabilities'' are saved as a dyad attribute named '''tie probabilities'''.
  
 
==Visualize simulated networks==

Latest revision as of 14:56, 18 July 2012

This tutorial illustrates how to analyze longitudinal network data by using RSiena from within visone. We assume that you have installed R on your computer and configured the R connection as explained in the installation tutorial. We also assume that you have a basic understanding of how to work with visone, as explained, for instance, in the tutorial on visualization and analysis, and basic knowledge of stochastic actor-oriented models (SAOMs).

To follow the steps illustrated in this tutorial you should download the file Classroom_graphmls.zip and extract (unzip) its content (consisting of the network files classroom_graph1.graphml to classroom_graph4.graphml) to your hard disk. These files constitute the longitudinal network data explained on page Knecht Classroom (data).

You find all RSiena functionality on the modeling tab on the right side of the visone window by selecting siena in the modeling drop-down list. Note that the longitudinal network data to be analysed must first be specified as described in the following paragraphs.

Defining longitudinal network data

Stochastic Actor Oriented Models (SAOMs) are designed for analysing longitudinal network data given as network panel data, i.e., a sequence of networks representing one network observed at several moments in time. Such network panel data are encoded in files classroom_graph1.graphml to classroom_graph4.graphml. To load them click on the menu file, open, navigate in the file browser to the directory where you've put the files classroom_graph1.graphml to classroom_graph4.graphml and select all of them before you click the ok button. (Selection of these files can be done in different ways, for instance, by keeping the Control-key pushed while successively selecting the files with a mouse left-click or by clicking on one of the files and then typing Control-a to select all files in the current directory.)

The four networks should be shown in four separate tabs in the network area. However, visone does not yet know that they belong together as a longitudinal network. This information must be given by combining them into a network collection, which is a collection or sequence of several networks that belong together, e.g., because they form a longitudinal network. Basic application scenarios related to network collections are explained in the tutorial on network collections and dynamic networks.

A network collection can be defined in the network collection manager. To open the network collection manager, press the collection manager button in visone's toolbar.

Collection manager 1.jpg

Press the create collection button to create a new collection. A new collection is named 'unknown network collection' by default. You can change the name by clicking in the editable field name and typing a new one.

Collection manager 2.jpg

The right table available networks lists all networks that can be added to the selected network collection (the selected collection is indicated by the blue background). These are basically all currently open networks. Select one of them by clicking on its name and press <- add to add it to the selected collection. The table networks in collection shows all networks contained so far in the currently selected collection. Note that the top-down order in this table determines the order of the networks in the network collection and therefore has to correspond to their temporal order in the longitudinal network. If the current order is not as you want it, you can rearrange it by removing networks from the collection (click on their name and press remove ->) and adding them again, which positions them at the very end of the collection.

To follow this tutorial, add the networks classroom_graph1.graphml through classroom_graph4.graphml (in this order) to a network collection named classroom and set this new collection as active.

Visone now knows which networks belong to the collection and in which order, but it does not know which nodes in the different networks correspond to each other, i.e., represent the same actor at different moments in time. This information has to be given by specifying an identifying attribute. Candidates for the identifying attribute are node attributes that are defined in all networks included in the collection. Furthermore, they have to assign a unique value to each node in a network, i.e., there must not be two nodes in the same network with the same value of the attribute. Across the networks in the collection, nodes with the same value of the identifying attribute are identified with each other.

Collection manager 3.jpg

The drop-down list identifying attribute offers you all available attributes for the current selection that meet the necessary conditions to serve as an identifying attribute. Note that in the current example you are only offered to choose id as the identifying attribute as all other attributes do not provide unique values for all nodes in a network.
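The two conditions for an identifying attribute can be sketched in a few lines of Python. This is only an illustration of the rule, not visone's implementation: the representation of a network as a list of per-node attribute dictionaries, the function name identifying_candidates, and the sample values are all assumptions made for this example.

```python
def identifying_candidates(networks):
    """Return the attribute names that could serve as identifying attribute.

    `networks` is a hypothetical stand-in for a network collection:
    a list of networks, each given as a list of node-attribute dicts.
    """
    # Condition 1: the attribute is defined on every node of every network.
    common = None
    for nodes in networks:
        for attrs in nodes:
            keys = set(attrs)
            common = keys if common is None else common & keys

    candidates = []
    for name in sorted(common or ()):
        # Condition 2: within each network, no two nodes share a value.
        if all(len({node[name] for node in nodes}) == len(nodes)
               for nodes in networks):
            candidates.append(name)
    return candidates


# Made-up sample data: "gender" repeats within a network, "id" does not.
wave1 = [{"id": 1, "gender": "f"}, {"id": 2, "gender": "m"}, {"id": 3, "gender": "f"}]
wave2 = [{"id": 1, "gender": "f"}, {"id": 2, "gender": "m"}, {"id": 3, "gender": "f"}]
print(identifying_candidates([wave1, wave2]))
```

With this sample data only id qualifies, which mirrors the situation in the classroom example.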

While you can create a network collection even if some nodes are not present at all time points, a network collection is marked as being siena compatible if all nodes are present at all times. If a network collection is not siena compatible it cannot be modeled with RSiena but you can nevertheless compute a dynamic layout.
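As a rough sketch of the compatibility condition, the following Python function checks that every network contains the same set of actors. It assumes that actors are compared by the value of their identifying attribute; the function name is_siena_compatible and the list-of-ids input format are hypothetical and do not reflect how visone performs the check internally.

```python
def is_siena_compatible(id_lists):
    """Return True if all networks contain exactly the same actor ids.

    `id_lists` holds, for each network in temporal order, the values of
    the identifying attribute of its nodes (an assumed input format).
    """
    id_sets = [set(ids) for ids in id_lists]
    # Every later network must have the same actors as the first one.
    return all(ids == id_sets[0] for ids in id_sets[1:])


print(is_siena_compatible([[1, 2, 3], [3, 2, 1], [1, 2, 3]]))  # same actors everywhere
print(is_siena_compatible([[1, 2, 3], [1, 2]]))                # actor 3 is missing once
```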

More than one network collection can be created. The analysis with RSiena, however, works on only one longitudinal network data set, namely the one represented by the active network collection. At any moment, only one collection can be active, which is indicated by the asterisk (*) in front of its name. You can switch the active collection with the set as active button in the collection manager.

Adding individual or dyadic covariates

Covariates.jpg

As mentioned before, you find the RSiena functionalities on the modeling tab on the right side of the visone window where you select siena in the modeling drop-down list to get to the data specification tab. The data specification tab offers you a list of all node attributes and dyad attributes that are defined in the first network of the active network collection and can be used as exogenous covariates in the model.

As you see in the graphic on the right, each attribute name is associated with a drop-down list that contains a choice of covariate types. This list contains exactly those covariate types that the corresponding attribute could represent, depending on the following conditions:
- an attribute may represent a constant covariate if it is defined in the first network of the collection. In this case, the attribute values in the first network are assumed to be the values of the constant covariate in the siena model even if the attribute values in the other networks of the collection differ from the values in the first network.
- an attribute may represent a changing covariate if it is defined in all but the last network of the collection. It can also be defined in the last network, but this is not mandatory because covariates of the last network have no influence on the modeling anyway. Hence, in a network collection that consists of only two networks no changing covariate can be defined.

For node attributes, an additional choice behavior might be available. If you select this option, the node attribute will not represent an exogenous covariate but will be treated (together with the network) as a dependent variable that is modeled itself.
- a node attribute may represent a behavior variable if it is defined in all networks of the collection and it is of type integer. (You can check and possibly change the type of an attribute in the attribute manager under configurations.)
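The conditions above can be summarised as a small decision function. The Python sketch below is an illustration under stated assumptions, not visone code: the function name available_types, its parameters, and the encoding of "defined in" as a set of 0-based network indices are invented for this example.

```python
def available_types(defined_in, n_networks, is_node_attribute=True, is_integer=False):
    """List the covariate types an attribute could take under the rules above.

    `defined_in` is the set of 0-based indices of the networks in which the
    attribute is defined (a hypothetical encoding used only in this sketch).
    """
    types = ["ignore"]
    # Constant covariate: values are taken from the first network.
    if 0 in defined_in:
        types.append("constant covariate")
    # Changing covariate: defined in all but the last network; impossible
    # when the collection has only two networks.
    all_but_last = all(i in defined_in for i in range(n_networks - 1))
    if n_networks >= 3 and all_but_last:
        types.append("changing covariate")
    # Behavior: integer node attribute defined in every network.
    in_all = all(i in defined_in for i in range(n_networks))
    if is_node_attribute and is_integer and in_all:
        types.append("behavior")
    return types


# An integer node attribute defined in all four networks of a collection.
print(available_types({0, 1, 2, 3}, 4, is_node_attribute=True, is_integer=True))
# A dyad attribute in a collection of only two networks.
print(available_types({0, 1}, 2, is_node_attribute=False))
```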

Specify for all attributes which type of covariate they should define or whether they should not be included in the model (by selecting ignore).

To follow this tutorial set gender as constant individual covariate and primary as constant dyadic covariate.

Specifying missing data or structurally fixed values

Model specification and estimation

After the data specification (networks, behavior variables, covariates, missing data, structurally fixed values) is complete, the model can be specified. To do so, visone provides the model specification dialogue, which you open by pressing the specify model button at the bottom of the siena modeling tab.

Modelspecification.jpg

The left list in this dialogue contains all effects (if you do not know what effect means in this context, see the tutorial on stochastic actor-oriented models) that are available for the active network collection with the individual and dyadic covariates specified above. For instance, we find the covariate-related effect primary, which takes into account the influence of having attended the same primary school on the existence of a network tie. This effect would not be in the list of available effects if we had not set primary as a dyadic covariate. Likewise, the actor-covariate-related effects gender ego, gender alter, and same gender can be added to the model only because we included gender as an individual covariate.

An available effect can be added to the model by selecting it (i.e., clicking on its name in the left list) and pressing the >> button on the right side of the list. Immediately, the effect name disappears from the left list and appears in the right table, which contains all effects that are currently included in the model. If you want to exclude an already included effect from the model, select it in the right table and press <<.

By ticking checkbox use standard initial values it can be set whether standard values or current parameter values shall be used as initial values in the estimation process. Furthermore, the number of subphases in the parameter estimation phase (phase 2) and the number of iterations in the standard error estimation phase (phase 3) (see the RSiena manual) can be set by shifting the corresponding sliders.

You start the parameter estimation by pressing button estimate. The estimation progress can be monitored in the Rserve window (console). When the estimation is finished, the results are displayed in the right table of the model specification dialogue.

Estimates.jpg

For each included effect its estimated parameter value, associated standard error, and t-statistic (which indicates the convergence of the estimation process, see the RSiena manual) are displayed. The p-value assumes the null hypothesis that the respective parameter value is 0 and is computed by the R command 2*pnorm(-abs(parameter estimates/standard errors)). It is also possible to test or fix certain effects by ticking the corresponding checkboxes. When an effect was tested, its p-value results from the score-type test as described in the manual.
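For readers who want to reproduce the p-value outside of R, the following Python sketch mirrors the R expression 2*pnorm(-abs(estimate/standard error)) using only the standard library. The function name two_sided_p_value and the numbers in the usage line are made up for illustration and are not output of the classroom example.

```python
import math


def two_sided_p_value(estimate, standard_error):
    """Python counterpart of R's 2*pnorm(-abs(estimate/standard_error))."""
    z = abs(estimate / standard_error)
    # For the standard normal CDF Phi, 2*Phi(-z) equals erfc(z / sqrt(2)).
    return math.erfc(z / math.sqrt(2.0))


# Hypothetical estimate and standard error giving a ratio of 1.96,
# so the resulting p-value is close to 0.05.
print(two_sided_p_value(0.98, 0.5))
```

An estimate of exactly 0 yields a p-value of 1, and the p-value shrinks as the ratio of estimate to standard error grows in absolute value.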

When the model specification dialogue is closed, the possibility to save the RSiena output file is offered. This file contains the standard RSiena output for estimation results.

Simulation

It is also possible to simulate networks based on model predictions. Pressing the simulate button simulates a number of networks based on the current model specification and parameter estimates. The number of simulated networks equals the number of iterations in phase 3 as set by the user. For each pair of actors, the fraction of these simulations in which the two actors are linked is calculated. The resulting tie probabilities are saved as a dyad attribute named tie probabilities.
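The averaging step can be sketched as follows. This Python example assumes that each simulated network is available as a collection of directed (sender, receiver) pairs; that input format, the function name tie_probabilities, and the toy data are assumptions for illustration and do not describe how visone or RSiena store simulation results.

```python
from collections import Counter


def tie_probabilities(simulated_networks):
    """Fraction of simulated networks in which each directed tie occurs.

    Each network is a collection of (sender, receiver) pairs. Pairs that
    are never linked are absent from the result (probability 0).
    """
    counts = Counter()
    for ties in simulated_networks:
        counts.update(set(ties))  # count each tie at most once per network
    n = len(simulated_networks)
    return {tie: count / n for tie, count in counts.items()}


# Four made-up simulated networks on the actors a, b and c.
sims = [
    {("a", "b"), ("b", "c")},
    {("a", "b")},
    {("a", "b"), ("c", "a")},
    {("b", "c")},
]
probs = tie_probabilities(sims)
print(probs[("a", "b")], probs[("b", "c")], probs[("c", "a")])
```

In this toy data the tie from a to b appears in three of four simulations, so its tie probability is 0.75.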

Visualize simulated networks