Import options dialog: Difference between revisions

Revision as of 10:11, 18 May 2015

visone can import network data from comma-separated value (CSV) files. Since these do not come with an unequivocal specification of how to interpret them, some choices must be made. Therefore, whenever you open a CSV file via the file menu, visone shows you the import options dialog.

The import options dialog is able to handle four types of data formats: adjacency matrix, link list, adjacency list and node list. A brief introduction to those comma-separated value (CSV) files is given in the data input tutorial. Note that the import options dialog is designed to create nodes and links from a given data format. To add node and link attributes to a given network, use the attribute manager. A detailed description of adding attributes is also given in the data input tutorial.

To select the data format, use the topmost drop-down menu (data format). After selecting the data format, the import options dialog displays the appropriate options for interpreting the data. The following sections describe the data import for all four data formats in detail.

Adjacency matrix files

To open an adjacency matrix, use the file menu, click on open..., select files of type CSV files (.txt, .csv) in the file chooser, navigate to the file you want to open, and click on the ok button. Then the import options dialog opens (shown below).

The meaning of each option is explained below.

data format is used to distinguish between adjacency matrix files and other types of CSV files (here choose adjacency matrix).
network type can be one mode or two mode. In the adjacency matrix of a one mode network, the rows and columns are indexed by the same set of nodes; for a two mode network (for instance, a network connecting authors to the articles they have written), the rows and columns are indexed by different sets of nodes (authors and articles, respectively, in the example).
link attribute type can be decimal or text. The entries of the adjacency matrix (which are either numbers or character strings) are saved in a link attribute of the newly opened network; this option defines the type of this attribute (decimal for numerical attributes and text for categorical).
The check boxes row labels and header indicate whether the first column (respectively first row) lists the node identifiers (rather than entries of the adjacency matrix). If unchecked, the node identifiers will be the numbers from $0$ to $n-1$ (when there are $n$ nodes in the network).
The check box directed edges is used to choose between directed and undirected networks.
The file format can be MS Excel, OpenOffice (the default CSV output of the respective program), or user defined. If it is set to user defined, you have to specify the following options.
cell delimiter defines the character that separates one matrix cell from the next. In the examples above, the cell delimiter is the semicolon (;), but it can also be a comma, colon, TAB, or SPACE character.
textframe can be double quotes, quotes, or NONE. Textframes are necessary if the matrix-cell entries themselves contain the cell delimiter. (For instance, if the cell delimiter is SPACE and the row/column labels are "firstname lastname"; the quotes tell visone that the cell does not end after firstname.)
The merge empty cells checkbox tells visone whether repeated cell delimiters should be treated as one. This option is for instance necessary when reading the Newcomb Fraternity data (of which an excerpt is shown below)

  0  7 12 11 10  4 13 14 15 16  3  9  1  5  8  6  2
  8  0 16  1 11 12  2 14 10 13 15  6  7  9  5  3  4
 13 10  0  7  8 11  9 15  6  5  2  1 16 12  4 14  3
 ...

where the cell delimiter (the SPACE character) is sometimes repeated to enhance (human) readability.
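
The effect of merge empty cells can be reproduced outside visone. The following Python sketch is only an illustration and not visone's own parser; the function name split_cells is made up for this example. It splits the first row of the excerpt once with and once without merging repeated SPACE delimiters.

```python
import re

def split_cells(line, delimiter=" ", merge_empty=True):
    """Split one row into cells; optionally treat repeated delimiters as one."""
    if merge_empty:
        # One or more consecutive delimiters count as a single separator;
        # empty strings caused by leading/trailing delimiters are dropped.
        pattern = "(?:" + re.escape(delimiter) + ")+"
        return [cell for cell in re.split(pattern, line) if cell != ""]
    # Every single delimiter starts a new (possibly empty) cell.
    return line.split(delimiter)

row = "  0  7 12 11 10  4 13 14 15 16  3  9  1  5  8  6  2"
merged = split_cells(row, merge_empty=True)
unmerged = split_cells(row, merge_empty=False)

print(len(merged))          # 17 cells, one per column of the matrix
print("" in unmerged)       # True: repeated spaces produce empty cells
```

Without merging, the empty cells shift the remaining entries to the wrong columns, which is exactly what the preview area of the dialog lets you spot.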

The bottom part of the dialog shows the table in the way that it will be interpreted with the current setting. This part allows you to recognize whether the options are set correctly.

When you have set the options, click on the ok button to open the file.
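
To see how the options for adjacency matrices interact, the following Python sketch reads a small semicolon-delimited matrix and lists the links it describes. It is a sketch under stated assumptions, not visone's implementation: the sample data and the function name matrix_to_links are invented, only one mode directed networks with decimal link attributes are handled, and a cell value of 0 is assumed to mean "no link".

```python
import csv
import io

# Hypothetical file content: header row and row labels present,
# cell delimiter ';', textframe '"'.
SAMPLE = '''"";"Ann Lee";"Bob";"Cy"
"Ann Lee";0;1;0
"Bob";0;0;2.5
"Cy";1;0;0
'''

def matrix_to_links(text, delimiter=";", quotechar='"',
                    header=True, row_labels=True):
    """Return (source, target, value) triples for a one mode, directed matrix."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter,
                           quotechar=quotechar))
    if header:
        first, rows = rows[0], rows[1:]
        # With row labels, the top-left cell is not a node identifier.
        col_ids = first[1:] if row_labels else first
    else:
        width = len(rows[0]) - (1 if row_labels else 0)
        col_ids = [str(j) for j in range(width)]  # identifiers 0 .. n-1
    links = []
    for i, row in enumerate(rows):
        source = row[0] if row_labels else str(i)
        cells = row[1:] if row_labels else row
        for target, cell in zip(col_ids, cells):
            value = float(cell)   # link attribute type "decimal"
            if value != 0:        # assumption: 0 means "no link"
                links.append((source, target, value))
    return links

for link in matrix_to_links(SAMPLE):
    print(link)
```

The label "Ann Lee" survives as one identifier only because the textframe (double quotes) is taken into account; with SPACE as cell delimiter and no textframe it would be split into two cells.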

Link list files

To open a link list, use the file menu, click on open..., select files of type CSV files (.txt, .csv) in the file chooser, navigate to the file you want to open, and click on the ok button. Then the import options dialog opens (shown below).

Many options have the same meaning as when reading adjacency matrices (explained above). The most crucial difference is that in the first row of the preview area you select two specific columns, one containing the source of the link (indicated by the label source in the very first row) and one containing the target of the link (indicated by target). The other columns contain link attributes that you might choose to import (if set to enabled) or ignore (if set to disabled).

In the example above (see the tutorial on Wikipedia edit networks to learn more about this data), the column with the header ActiveUser contains the link source and the column labeled Target contains the link target. The other columns (WordCount, InteractionType, etc.) hold the values of various link attributes, which are newly created if they are not already in the network. Note that the type of these attributes can be set to text, integer, decimal, etc. in the second row of the preview area.
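
The column roles described above can be mimicked in a few lines of Python. This is a hedged sketch rather than a description of visone's internals: the two data rows are invented (only the column names come from the example), and the function name read_link_list and its parameters are hypothetical.

```python
import csv
import io

# Invented rows; only the column names are taken from the example.
SAMPLE = """ActiveUser,Target,WordCount,InteractionType
alice,bob,120,added
bob,carol,15,deleted
"""

def read_link_list(text, source_col, target_col, enabled, types):
    """Return one (source, target, attributes) triple per row of the link list."""
    links = []
    for row in csv.DictReader(io.StringIO(text)):
        # Only enabled columns become link attributes, converted to their type.
        attrs = {name: types[name](row[name]) for name in enabled}
        links.append((row[source_col], row[target_col], attrs))
    return links

links = read_link_list(
    SAMPLE,
    source_col="ActiveUser",         # column marked as source
    target_col="Target",             # column marked as target
    enabled=["WordCount"],           # InteractionType is left disabled
    types={"WordCount": int},        # attribute type "integer"
)
print(links)
```

Disabled columns are simply skipped, so they never reach the network as link attributes.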

If the links in a link list have associated time information (encoding when the interaction happened), or if the order in the file is meaningful and could be interpreted in the sense that interactions at the beginning of the file happened earlier, you might consider opening them as event list files (this is illustrated in the tutorial on event networks).

Adjacency list files

To open an adjacency list, use the file menu, click on open..., select files of type adjacency list files (.txt, .csv) in the file chooser, navigate to the file you want to open, and click on the ok button. Then the import options dialog opens (shown below).

The header checkbox defines whether the first row is a header giving the numbers of nodes and links in the file (rather than the adjacency list of the first node).
node labels indicates whether the first column lists the node identifiers (if unchecked, nodes are numbered consecutively and the $i$th row lists the neighbors of the $i$th node).
directed defines whether links are treated as directed or undirected.

The other options have the same meaning as when reading adjacency matrices (explained above).
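
As a rough illustration of the three options, the following Python sketch turns an adjacency list into a list of links. Several details are assumptions made for the example and are not confirmed by this page: cells are separated by whitespace, consecutive numbering starts at 0, and in the undirected case each unordered pair is kept only once. The function name read_adjacency_list and the sample lines are hypothetical.

```python
def read_adjacency_list(lines, header=False, node_labels=True, directed=True):
    """Return (source, target) pairs described by an adjacency list."""
    rows = [line.split() for line in lines]  # assumption: whitespace-delimited
    if header:
        rows = rows[1:]  # skip the row giving the numbers of nodes and links
    links = []
    for i, cells in enumerate(rows):
        if node_labels:
            source, neighbors = cells[0], cells[1:]
        else:
            # assumption: consecutive numbering starts at 0
            source, neighbors = str(i), cells
        for target in neighbors:
            links.append((source, target))
    if not directed:
        # Keep only the first occurrence of each unordered pair.
        seen, unique = set(), []
        for source, target in links:
            key = tuple(sorted((source, target)))
            if key not in seen:
                seen.add(key)
                unique.append((source, target))
        links = unique
    return links

sample = ["3 2", "a b c", "b a", "c"]  # invented data: 3 nodes, 2 links
print(read_adjacency_list(sample, header=True, directed=False))
print(read_adjacency_list(sample, header=True, directed=True))
```

Read as undirected, the sample yields the two links announced in its header; read as directed, the pair a, b appears once in each direction and three links result.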

Node list files

A node list is a list of all nodes with their attribute values and has the same format as an attribute table. If you select node list as data format, the import options dialog looks as shown below.

