Visualization and analysis (tutorial)
Revision as of 12:45, 27 January 2011
This trail shows you how analysis and visualization go hand in hand in visone. It introduces you to the most common usage scenario: importing data from one or several files, analyzing the network, visualizing the network together with the computed indicators, and exporting data and images for further processing or publication.
- 1 Introducing an exemplary dataset
- 2 Importing networks from adjacency matrix files
- 3 Merging parallel ties
- 4 Importing attributes from CSV tables
- 5 Mapping attributes to graphics: categorical attributes
- 6 Computing network analytic indicators: centrality
- 7 Mapping attributes to graphics: numerical attributes
- 8 Layout algorithms
Introducing an exemplary dataset
The data that we use in this trail has been collected in a long-term research project about acculturation networks. More information about the project can be found at . Among other data, the personal networks of now more than 1,000 immigrants have been collected within this project. Each of the respondents (called ego) provided answers to four types of questions:
- questions about ego, including country of origin, years of residence, age, gender, skin color, reasons for migrating, health, language skills...
- alters: a list of persons known to ego (for most networks the number has been fixed to 45)
- questions about alters, including country of origin, country of residence, age, skin color, type of relation to ego, ...
- alter-alter ties (undirected): pairs of alters that know each other (according to the respondent)
In this trail we analyze, as an example, one of these personal networks, obtained from interviewing a migrant from the Dominican Republic to the USA. The dataset here contains none of the variables characterizing ego, only the alter characteristics and the alter-alter ties.
More specifically, the ties are encoded in an adjacency matrix file Egonet_ties.csv and the alter characteristics in a file Egonet_attributes.csv; both kinds of information are also provided together (more conveniently and reliably) in a GraphML file Egonet.graphml.
- alter-alter ties (Egonet_ties.csv)
- alter characteristics (Egonet_attributes.csv)
- GraphML file (Egonet.graphml)
To follow the steps explained in this trail, you should download these three files and save them on your hard disk (right-click and select save link as).
Importing networks from adjacency matrix files
The usual way to get a network into visone is to read it from a local file via the menu file, open.
The usual file type to be read by visone is GraphML; GraphML files contain information about nodes and links, about attributes of nodes and links, and about graphical information such as layout, color, or shape. To read GraphML files you select .graphml in the file open dialog (shown below) and click on ok; this is simple, fast, and reliable.
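To make the description of GraphML more concrete, here is a minimal sketch of how the nodes, links, and node attributes in such a file could be read with Python's standard library. The embedded sample document is hypothetical and much simpler than a file saved by visone (it carries no layout or color data); only the GraphML namespace and the key/node/edge/data elements come from the GraphML format itself.

```python
import xml.etree.ElementTree as ET

# Namespace used by GraphML documents.
NS = {"g": "http://graphml.graphdrawing.org/xmlns"}

# Hypothetical two-node sample; not the content of Egonet.graphml.
SAMPLE = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <key id="d0" for="node" attr.name="Afrm" attr.type="string"/>
  <graph id="G" edgedefault="undirected">
    <node id="a1"><data key="d0">USA</data></node>
    <node id="a2"><data key="d0">USA</data></node>
    <edge source="a1" target="a2"/>
  </graph>
</graphml>
"""

def read_graphml(text):
    """Return (nodes, edges): nodes maps id -> attribute dict."""
    root = ET.fromstring(text)
    # Map key ids (d0, d1, ...) to readable attribute names;
    # fall back to the key id when no attr.name is declared.
    key_names = {
        k.get("id"): k.get("attr.name", k.get("id"))
        for k in root.findall("g:key", NS)
    }
    graph = root.find("g:graph", NS)
    nodes = {}
    for node in graph.findall("g:node", NS):
        nodes[node.get("id")] = {
            key_names[d.get("key")]: d.text
            for d in node.findall("g:data", NS)
        }
    edges = [
        (e.get("source"), e.get("target"))
        for e in graph.findall("g:edge", NS)
    ]
    return nodes, edges

nodes, edges = read_graphml(SAMPLE)
print(nodes)
print(edges)
```

Because attribute names and values travel inside the file, no import options have to be guessed, which is why this format is the more reliable choice.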
Here, for illustration, we take the harder route and assume that the data are not stored in a GraphML file but in comma-separated-value tables. This very primitive file type can be output from many programs, including statistical software, spreadsheet editors, or other network analysis software. Sometimes you have to deal with this file type.
To open a network from an adjacency matrix file you select the type .txt, .csv in the file open dialog and click on ok. To follow the steps outlined in this trail, select the file Egonet_ties.csv.
Clicking on ok does not immediately open the file. Indeed, in contrast to GraphML, CSV files don't have a self-explaining interpretation; rather the program that has to handle them needs some guidance. Therefore visone opens an import options dialog whose two tabs are shown below.
The file view tab shows you (part of) the adjacency matrix encoded in the file to be opened. From this view you can guess, for instance, that different cells in the matrix are delimited by semicolons (;), that row and column labels are present, and some more. For an exhaustive explanation of all options and their meaning see the page on the import options dialog. To continue with this trail, set all options as shown in the format tab above and click on ok. This opens a network looking like this.
The .csv file does not contain layout information. The position of the nodes has been determined by the layout algorithm that can be initiated with the quick layout button.
Apart from GraphML and CSV format, visone can also open files in UCINET's .dl format, in Pajek's .net format, and some more.
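The interpretation that the import options describe (semicolon-delimited cells, row and column labels, every non-zero cell read as a directed tie) can be sketched in Python as follows. This is only an illustration under those assumptions, not visone's actual importer; the sample matrix and the function name read_adjacency_matrix are hypothetical.

```python
import csv
import io

# Hypothetical sample in the shape described above: cells delimited by
# semicolons, first row and first column holding the node labels.
SAMPLE = """;a1;a2;a3
a1;0;1;0
a2;1;0;1
a3;0;1;0
"""

def read_adjacency_matrix(text, delimiter=";"):
    """Return (labels, directed_edges) from a labelled adjacency matrix."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    labels = rows[0][1:]              # column labels; the corner cell is empty
    edges = []
    for row in rows[1:]:
        source = row[0]               # row label
        for target, cell in zip(labels, row[1:]):
            # Any entry other than empty or 0 is read as a tie source -> target.
            if cell.strip() not in ("", "0"):
                edges.append((source, target))
    return labels, edges

labels, edges = read_adjacency_matrix(SAMPLE)
print(labels)
print(edges)
```

Note that a symmetric matrix yields each tie twice, once per direction, which is exactly the situation the next section deals with.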
Merging parallel ties
The network above contains two anti-parallel ties for every pair of connected actors. This is because adjacency matrices are always interpreted as encoding directed graphs. This interpretation is wrong in our example since the tie-generating question was "do actor A and actor B know each other?" which clearly generates an undirected relation. All pairs of anti-parallel directed links can be merged to one undirected link via the transformation tab.
To do so, choose links as the level on which the transformation should be applied, merge as the operation, and contrary directed in the drop-down menu to the right of merge. Clicking on transform! at the bottom of the tab executes the transformation, which turns the network into an undirected one with no parallel links.
Now that we have already invested some work in the network, we should save it by clicking on file, save. (The first time we do this we have to assign a name to the network.) Note that the network is saved in GraphML format; indeed only this format guarantees that no information gets lost.
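The effect of merging contrary directed links can be sketched in a few lines of Python. The function name and the edge representation (tuples of node ids) are hypothetical, and reporting one-way ties separately is an assumption made here for illustration; how visone treats such ties is not covered in this trail.

```python
def merge_contrary_directed(directed_edges):
    """Collapse each pair (u, v), (v, u) into one undirected edge.

    Returns (undirected, unmatched): undirected is a sorted list of
    node pairs, unmatched lists directed ties without a reverse tie.
    """
    edge_set = set(directed_edges)
    undirected = set()
    unmatched = []
    for u, v in directed_edges:
        if (v, u) in edge_set:
            # Store the pair in a canonical order so that (u, v) and
            # (v, u) end up as the same undirected edge.
            undirected.add(tuple(sorted((u, v))))
        else:
            unmatched.append((u, v))
    return sorted(undirected), unmatched

directed = [("a1", "a2"), ("a2", "a1"), ("a2", "a3"), ("a3", "a2")]
undirected, unmatched = merge_contrary_directed(directed)
print(undirected)
print(unmatched)
```

For a symmetric adjacency matrix, as in this dataset, every tie has a reverse tie, so the number of links is halved and nothing remains unmatched.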
Importing attributes from CSV tables
Currently, the nodes have only one attribute called id. The values of other attributes are provided in the file Egonet_attributes.csv. This file can be merged into our current network via the attribute manager which can be started by clicking on the icon in visone's toolbar.
In the attribute manager, choose the nodes radio-button in the top row, import & export on the left, and select the file Egonet_attributes.csv that you have previously downloaded to your computer.
Before clicking on apply it is very important to correctly set the value in the join by drop-down menu. This should point to the name of the attribute that identifies the nodes and tells visone which column in the imported CSV file holds these identifiers. Currently you can only select id but in general there might be several node attributes and the nodes could as well be identified by attributes having a different name than id. Clicking on apply opens an import options dialog; setting the options as shown to the left - in particular, setting the cell delimiter to semicolon (;) - and clicking on ok imports the attributes. You can see the result by checking the values radio button on the left hand side of the attribute manager. Then, you should see something similar to the image on the right.
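Why the join by setting matters becomes clearer when the import is written out as a key-based join. The following is a minimal sketch under stated assumptions: the column layout of the sample table, its values, and the function name join_attributes are hypothetical (only the attribute names id and Afrm and the semicolon delimiter appear in this trail), and visone's own import may behave differently for rows that find no partner.

```python
import csv
import io

# Hypothetical attribute table; the values are invented for illustration.
ATTRIBUTES = """id;Afrm;age
a1;Dominican Republic;34
a2;USA;29
a4;USA;51
"""

def join_attributes(nodes, text, join_by="id", delimiter=";"):
    """Copy CSV columns onto the nodes whose join_by value matches a row.

    Returns the identifiers of nodes for which no row was found.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    rows_by_key = {row[join_by]: row for row in reader}
    unmatched = []
    for node in nodes:
        row = rows_by_key.get(node[join_by])
        if row is None:
            unmatched.append(node[join_by])
            continue
        for name, value in row.items():
            if name != join_by:
                node[name] = value    # values stay strings in this sketch
    return unmatched

nodes = [{"id": "a1"}, {"id": "a2"}, {"id": "a3"}]
missing = join_attributes(nodes, ATTRIBUTES)
print(nodes)
print(missing)
```

If the join key pointed at the wrong attribute, no identifiers would match and every node would end up in the unmatched list, which is the failure that a wrong join by value would produce.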
Mapping attributes to graphics: categorical attributes
For many networks it is extremely insightful to show the attribute values in the network image. In visone, attributes can be mapped to graphical variables via the visualization tab.
In the visualization tab you can choose between plenty of options; these have to be set from top to bottom. For the category choose mapping since we want to map existing attributes to graphical variables. (The other two options are layout for applying a visualization algorithm and geometry for doing affine transformations, such as rotating or scaling.)
The type of the mapping refers to the type of the graphical variable that is used for encoding attribute values. This choice is restricted by the type of the attribute that is to be mapped. Numerical attributes can be mapped to graphical variables that allow the user to recognize that one actor has a larger value than another one; examples of such variables are size or position, whose usage is demonstrated further down in this trail. In this section, however, we want to encode a categorical variable, more specifically the country of origin of the actor. A good choice for a graphical variable to encode this information is color, which we select in the drop-down menu to the right of type. For property choose node color.
The drop-down menu to the right of node value should be set to the name of the attribute that is to be encoded; we choose Afrm (meaning country of origin of the actor). Finally, selecting color table for the method option presents you with a table of the different values of the Afrm attribute together with a predefined choice of colors that you can change as you wish. Clicking on the visualize! button at the bottom of the visualization tab applies the mapping, whose result is shown in the network area of the visone window.
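The idea behind a color table, one color per distinct attribute value, can be sketched as follows. The palette, the sample values, and the function names are hypothetical; the colors visone predefines are not reproduced here.

```python
# Hypothetical palette of hex colors; reused cyclically if there are
# more distinct values than colors.
PALETTE = ["#1f77b4", "#ff7f0e", "#2ca02c", "#d62728"]

def build_color_table(nodes, attribute, palette=PALETTE):
    """Assign one color to each distinct value, in order of first appearance."""
    table = {}
    for node in nodes:
        value = node[attribute]
        if value not in table:
            table[value] = palette[len(table) % len(palette)]
    return table

def apply_color_table(nodes, attribute, table):
    """Set each node's color from the table entry for its attribute value."""
    for node in nodes:
        node["color"] = table[node[attribute]]

# Invented sample values for the Afrm attribute.
nodes = [
    {"id": "a1", "Afrm": "Dominican Republic"},
    {"id": "a2", "Afrm": "USA"},
    {"id": "a3", "Afrm": "Dominican Republic"},
]
table = build_color_table(nodes, "Afrm")
apply_color_table(nodes, "Afrm", table)
print(table)
print([node["color"] for node in nodes])
```

A lookup table suits categorical values because it imposes no order on them; a numerical attribute would instead call for a graphical variable such as size, where larger values are visibly larger.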