RSiena (tutorial)

From visone manual

This tutorial illustrates how to analyze longitudinal network data using RSiena from within visone. We assume that you have installed R on your computer and configured the R connection as explained in the installation tutorial. We also assume that you have a basic understanding of how to work with visone, as explained, for instance, in the tutorial on visualization and analysis, and basic knowledge of stochastic actor-oriented models (SAOMs).

To follow the steps illustrated in this tutorial you should download the file Classroom_graphmls.zip and extract (unzip) its content (consisting of the network files classroom_graph1.graphml to classroom_graph4.graphml) to your hard disk. These files constitute the longitudinal network data explained on page Knecht Classroom (data).

You find all RSiena functionality on the modeling tab on the right side of the visone window by selecting siena in the modeling drop-down list. Note that the longitudinal network data to be analysed must first be specified, as described in the following paragraphs.

Defining longitudinal network data

Stochastic actor-oriented models (SAOMs) are designed for analysing longitudinal network data given as network panel data, i.e., a sequence of networks representing one network observed at several moments in time. Such network panel data are encoded in the files classroom_graph1.graphml to classroom_graph4.graphml. To load them, click file, open in the menu, navigate in the file browser to the directory where you have put the files, and select all of them before you click the ok button. (The files can be selected in different ways, for instance by holding down the Control key while left-clicking each file in turn, or by clicking on one of the files and then pressing Control-a to select all files in the current directory.)

The four networks should be shown in four separate tabs in the network area. However, visone does not yet know that they belong together as a longitudinal network. This information must be given by combining them to a network collection which is a collection or sequence of several networks that belong together, e.g., by building a longitudinal network. Basic application scenarios related to network collections are explained in the tutorial on network collections and dynamic networks.

A network collection can be defined in the network collection manager. To open it, press the collection manager button (Collection manager.png) in visone's toolbar.

Collection manager 1.jpg

Press the create collection button to create a new collection. A new collection is named 'unknown network collection' by default. You can change the name by clicking in the editable name field and typing a new one.

Collection manager 2.jpg

The right table, available networks, lists all networks that can be added to the selected network collection (the selected collection is indicated by the blue background). These are basically all currently open networks. Select one of them by clicking on its name and press <- add to add it to the selected collection. The table networks in collection shows all networks contained so far in the currently selected collection. Note that the top-down order in this table determines the order of the networks in the network collection and hence has to correspond to their temporal order in the longitudinal network. If the current order is not what you want, you can rearrange it by removing networks from the collection (click on their name and press remove ->) and adding them again, which positions them at the very end of the collection.

To follow this tutorial, combine classroom_graph1.graphml to classroom_graph4.graphml (in this order) into a network collection named classroom and set this new collection as active.

Visone now knows which networks belong to the collection and in which order, but it does not know which nodes in the different networks correspond to each other, i.e., represent the same actor at different moments in time. This information has to be given by specifying an identifying attribute. Candidates for the identifying attribute are node attributes that are defined in all networks included in the collection. Furthermore, they have to assign a unique value to each node in a network, i.e., no two nodes in the same network may have the same value for the attribute. Across the networks in the collection, nodes with the same value of the identifying attribute are identified with each other.
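The two conditions on an identifying attribute can be sketched in a few lines of Python. This is only an illustration of the rule stated above, not visone's implementation: the function name and the representation of a network as a list of node dictionaries are assumptions made for this example.

```python
def identifying_attribute_candidates(networks):
    """Return the attribute names that could serve as identifying attribute.

    `networks` is a list of networks in collection order; each network is a
    list of node dictionaries mapping attribute names to values. This data
    layout is hypothetical and chosen only for the example.
    """
    candidates = None
    for nodes in networks:
        # Attributes that are defined on every node of this network.
        defined = set.intersection(*(set(node) for node in nodes)) if nodes else set()
        candidates = defined if candidates is None else candidates & defined
    result = []
    for name in sorted(candidates or ()):
        # The attribute must give every node of a network its own value.
        unique = all(
            len({node[name] for node in nodes}) == len(nodes) for nodes in networks
        )
        if unique:
            result.append(name)
    return result


# Two toy observation moments with three actors each (made-up values).
wave1 = [{"id": 1, "gender": "f"}, {"id": 2, "gender": "m"}, {"id": 3, "gender": "f"}]
wave2 = [{"id": 1, "gender": "f"}, {"id": 2, "gender": "m"}, {"id": 3, "gender": "f"}]
print(identifying_attribute_candidates([wave1, wave2]))  # ['id']
```

In this toy data gender is rejected because two nodes share the value "f", which mirrors why only id is offered in the classroom example.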

Collection manager 3.jpg

The drop-down list identifying attribute offers all attributes of the current selection that meet the necessary conditions to serve as an identifying attribute. Note that in the current example only id is offered, as none of the other attributes provides unique values for all nodes in a network.

While you can create a network collection even if some nodes are not present at all time points, a network collection is marked as being siena compatible if all nodes are present at all times. If a network collection is not siena compatible it cannot be modeled with RSiena but you can nevertheless compute a dynamic layout.
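The compatibility condition amounts to checking that every network contains the same set of actors. A minimal sketch, assuming the same hypothetical list-of-node-dictionaries layout and an identifying attribute called id (both are assumptions of this example, not part of visone):

```python
def is_siena_compatible(networks, id_attribute="id"):
    """Return True if all networks contain exactly the same actors.

    `networks` is a list of networks; each network is a list of node
    dictionaries (a hypothetical layout used only for this example).
    """
    node_sets = [{node[id_attribute] for node in nodes} for nodes in networks]
    # Every later network must have the same actor set as the first one.
    return all(actors == node_sets[0] for actors in node_sets[1:])


# Made-up example: actor 3 is missing at the second moment.
complete = [{"id": 1}, {"id": 2}, {"id": 3}]
incomplete = [{"id": 1}, {"id": 2}]
print(is_siena_compatible([complete, complete]))    # True
print(is_siena_compatible([complete, incomplete]))  # False
```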

More than one network collection can be created. The analysis with RSiena, however, works on only one longitudinal network data set, namely the one represented by the active network collection. At any moment only one collection can be active, which is indicated by the asterisk (*) in front of its name. You can switch the active collection with the set as active button in the collection manager.

Adding individual or dyadic covariates

Covariates.jpg

As mentioned before, you find the RSiena functionalities on the modeling tab on the right side of the visone window where you select siena in the modeling drop-down list to get to the data specification tab. The data specification tab offers you a list of all node attributes and dyad attributes that are defined in the first network of the active network collection and can be used as exogenous covariates in the model.

As you see in the graphic on the right, each attribute name is associated with a drop-down list that contains a choice of covariate types. This list contains exactly those covariate types that the corresponding attribute could represent, depending on the following conditions:
- An attribute may represent a constant covariate if it is defined in the first network of the collection. In this case, the attribute values in the first network are taken as the values of the constant covariate in the siena model, even if the attribute values in the other networks of the collection differ from those in the first network.
- An attribute may represent a changing covariate if it is defined in all but the last network of the collection. It can also be defined in the last network, but this is not mandatory, as covariates of the last network have no influence on the modeling anyway. Hence, in a network collection that consists of only two networks, no changing covariate can be defined.

For node attributes, an additional choice, behavior, might be available. If you select this option, the node attribute will not represent an exogenous covariate but will be treated (together with the network) as a dependent variable that is itself modeled.
- A node attribute may represent a behavior variable if it is defined in all networks of the collection and is of type integer. (You can check and, if necessary, change the type of an attribute in the attribute manager under configurations.)
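The conditions for the three roles can be summarized as a small decision function. The sketch below only restates the rules given above; the function name, its parameters, and the exact type labels are assumptions made for this example and need not match the labels in visone's drop-down list.

```python
def available_types(defined_in, is_integer, is_node_attribute):
    """List the roles an attribute could take in the model.

    defined_in        -- list of booleans, one per network in collection
                         order, telling whether the attribute is defined there
    is_integer        -- whether the attribute is of type integer
    is_node_attribute -- True for node attributes, False for dyad attributes
    """
    types = ["ignore"]  # an attribute can always be left out of the model
    n = len(defined_in)
    if n > 0 and defined_in[0]:
        types.append("constant covariate")
    # Needs all but the last network, and more than two networks in total.
    if n > 2 and all(defined_in[:-1]):
        types.append("changing covariate")
    # Behavior variables exist only for integer node attributes defined everywhere.
    if is_node_attribute and is_integer and n > 0 and all(defined_in):
        types.append("behavior")
    return types


# An integer node attribute defined in all four networks of a collection.
print(available_types([True, True, True, True], is_integer=True, is_node_attribute=True))
```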

For each attribute, specify which type of covariate it should define or whether it should not be included in the model (by selecting ignore).

To follow this tutorial set gender as constant individual covariate and primary as constant dyadic covariate.

Specifying missing data or structurally fixed values

Model specification and estimation

After the data specification (networks, behavior variables, covariates, missing data, structurally fixed values) is complete, the model can be specified. To do so, visone provides the model specification dialogue, which you open by pressing the specify model button at the bottom of the siena modeling tab.

Modelspecification.jpg

The left list in this dialogue contains all effects (if you do not know what effect means in this context, see the tutorial on stochastic actor-oriented models) that are available for the active network collection with the individual and dyadic covariates specified above. For instance, we find the covariate-related effect primary, which takes into account the influence of having been together in the same primary school on the existence of a network tie. This effect would not be in the list of available effects if we had not set primary as a dyadic covariate. Likewise, the actor-covariate-related effects gender ego, gender alter, and same gender can be added to the model only because we included gender as an individual covariate.

An available effect can be added to the model by selecting it (i.e., clicking on its name in the left list) and pressing the >> button on the right side of the list. The effect name immediately disappears from the left list and appears in the right table, which contains all effects currently included in the model. If you want to exclude an already included effect from the model, select it in the right table and press <<.

The checkbox use standard initial values determines whether standard values or the current parameter values are used as initial values in the estimation process. Furthermore, the number of subphases in the parameter estimation phase (phase 2) and the number of iterations in the standard error estimation phase (phase 3) (see the RSiena manual) can be set by moving the corresponding sliders.

You start the parameter estimation by pressing the estimate button. The estimation progress can be monitored in the Rserve (see 6) window (console). When the estimation is finished, the results are displayed in the right table of the model specification dialogue.

Estimates.jpg

For each included effect, its estimate, the associated standard error, and a t-statistic (which indicates the convergence of the estimation process, see [1]) are given. The p-values assume the null hypothesis that the respective parameter value is 0 and are calculated by the R command 2*pnorm(-abs(parameter estimates/standard errors)). It is also possible to test or fix certain effects by ticking the corresponding checkboxes. When an effect was tested, its p-value results from the score-type test as described in [1]. When you close the model specification dialogue, you can save the RSiena output file. This file contains the usual RSiena output for estimation results.
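The p-value formula is a two-sided test against the standard normal distribution. As a sketch, the same quantity can be computed without R by using the identity 2*Phi(-|z|) = erfc(|z|/sqrt(2)); the function name and the numbers in the example are made up for illustration and are not results from the classroom data.

```python
import math


def two_sided_p_value(estimate, standard_error):
    """Two-sided p-value for the null hypothesis that the parameter is 0.

    Equivalent to the R expression 2*pnorm(-abs(estimate/standard_error)).
    """
    z = abs(estimate / standard_error)
    # For the standard normal CDF Phi: 2 * Phi(-z) == erfc(z / sqrt(2)).
    return math.erfc(z / math.sqrt(2.0))


# Hypothetical estimate and standard error, chosen so that z = 1.96.
p = two_sided_p_value(0.98, 0.5)
print(round(p, 3))  # 0.05
```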

Simulation

It is also possible to simulate network evolution. Pressing the simulate button simulates a number of networks with the current parameter estimates. The number of simulations equals the number of iterations in phase 3 as set by the user. For each pair of actors, the fraction of simulations in which they are linked is calculated. The resulting tie probabilities are saved as a dyad attribute (5.2) named tie probabilities. When you close the model specification dialogue, you can save the RSiena output file. This file contains the usual RSiena output for estimation and simulation.
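The tie probabilities are simple averages over the simulated networks. A minimal sketch, assuming each simulated network is given as a 0/1 adjacency matrix over ordered pairs of actors (this representation and the function name are assumptions of the example, not visone's internals):

```python
def tie_probabilities(simulated_networks):
    """Average a list of 0/1 adjacency matrices entry by entry.

    Entry [i][j] of the result is the fraction of simulated networks
    in which actor i has a tie to actor j.
    """
    count = len(simulated_networks)
    n = len(simulated_networks[0])
    return [
        [sum(net[i][j] for net in simulated_networks) / count for j in range(n)]
        for i in range(n)
    ]


# Two made-up simulated networks on two actors.
sims = [
    [[0, 1], [0, 0]],
    [[0, 1], [1, 0]],
]
print(tie_probabilities(sims))  # [[0.0, 1.0], [0.5, 0.0]]
```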

Visualize simulated networks