GraphML-Time

From visone manual
Jump to navigation Jump to search

This page is a working draft for a new GraphML version that allows to store time-dependent graphs and networks in GraphML files.

Note that the specification on this page does not (yet) describe an official GraphML release and is likely to change frequently and without notice.

Goal

The new GraphML version should provide the possibility to store time-dependent graphs and networks in GraphML files. The time-dependence should not be restricted to a series of snapshots of graphs but should allow for evolution in continuous time.

Design decisions

All GraphML elements can be indexed by a lifetime which is a subset of the timeline. The timeline is a linearly ordered set of timepoints, i.e., any two timepoints are either equal or one is less than the other (the smaller point in time happens earlier). Intuitively, the timeline can be thought of as a straight line going from left (past) to right (future). There are two variants of the timeline representing numeric time and calendar time. For numeric time the timeline is identical to the set of decimals. For calendar time the timeline is identical to the timeline of the XML datatype dateTime. There MAY be an affine linear mapping from the numerical timeline to calendar timeline by specifying an offset and a time unit. Numeric time does not need to represent physical time or astronomic time; it may also just be a counter for observation time points, etc.

The interpretation of the lifetime indexing GraphML elements is that the element exists for all time points and it does not exist for all time points . For elements the lifetime means that the particular attribute takes the value specified by for all time points ; it may take another value or it may be unspecified for all time points .

Time information is stored in XML attributes. The names of all time-attributes start with the string time.

Time attributes always restrict the lifetime of elements, that is:

  • If no time attributes (explictitly or implicitly) are associated with an element, then this element has an unrestricted lifetime, i.e., its lifetime is equal to the whole timeline.
  • Lifetimes default to lower elements in the document tree; for instance, the lifetime of a <node> can only be a subset of the lifetime of its parent <graph>.
  • The lifetime of elements having a keyref can only be a subset of the lifetime of the referred elements. For instance, an <edge> cannot live longer than its endpoints (<node>s) and the lifetime of a element can only be a subset of the lifetime of its <key>.

Two elements are said to compete at time if they

  1. refer to the same <key>;
  2. are siblings (children of the same element);
  3. and their lifetimes include .

To resolve such conflicts we specify that the value of an attribute for a particular element at a time point is defined by that element that comes last in the document. The intuition for this convention is that the document is read from the start to the end and whenever a data element is read it defines a value for some element and attribute at the specified lifetime - thereby potentially overriding previous values that have been set for the same element/attribute within this lifetime. Nevertheless, GraphML documents SHOULD avoid to include competing elements.

Datatypes for time information

There are two fundamental data types for time information in GraphML-Time: data types for time points and data types for durations.. Since guessing the data types from lexical representation is in general not possible, GraphML documents SHOULD explicitly state which data types they use for time points and durations by using the attributes time.point.type and time.duration.type (see below).

Time point data types

The specification of a simple type for time points is given below. It is a union of simple data types specified in the XML Schema specification Part 2. The first seven entries in the list below are for numeric time, the others for calendar time. The recommended data type for numeric time is xs:decimal and the recommended type for calendar time is xs:dateTime; the others should only be used if neither of the former two is appropriate.

 <xs:simpleType name="time.point.simpleType" final="#all">
   <xs:union>
     <xs:simpleType>
        <xs:restriction base="xs:short"/>
     </xs:simpleType>
     <xs:simpleType>
        <xs:restriction base="xs:int"/>
     </xs:simpleType>
     <xs:simpleType>
        <xs:restriction base="xs:long"/>
     </xs:simpleType>
     <xs:simpleType>
        <xs:restriction base="xs:integer"/>
     </xs:simpleType>
     <xs:simpleType>
        <xs:restriction base="xs:decimal"/>
     </xs:simpleType>
     <xs:simpleType>
        <xs:restriction base="xs:float"/>
     </xs:simpleType>
     <xs:simpleType>
        <xs:restriction base="xs:double"/>
     </xs:simpleType>
     <xs:simpleType>
        <xs:restriction base="xs:dateTime"/>
     </xs:simpleType>
     <xs:simpleType>
        <xs:restriction base="xs:time"/> 
     </xs:simpleType>
     <xs:simpleType>
        <xs:restriction base="xs:date"/> 
     </xs:simpleType>
     <xs:simpleType>
        <xs:restriction base="xs:gYearMonth"/> 
     </xs:simpleType>
     <xs:simpleType>
        <xs:restriction base="xs:gYear"/>  
     </xs:simpleType>
     <xs:simpleType>
        <xs:restriction base="xs:gMonthDay"/> 
     </xs:simpleType>
     <xs:simpleType>
        <xs:restriction base="xs:gDay"/> 
     </xs:simpleType>
     <xs:simpleType>
        <xs:restriction base="xs:gMonth"/> 
     </xs:simpleType>
     <xs:simpleType>
        <xs:restriction base="xs:string"/> <-- user-defined representation of date and/or time-->
     </xs:simpleType>
   </xs:union>
 </xs:simpleType>

We also need the simple data type list of data types for time points; this is achieved as below:

 <xs:simpleType name="time.points.simpleType" final="#all">
   <xs:union>
     <xs:simpleType>
        <xs:list itemType="xs:integer"/>
     </xs:simpleType>
     <xs:simpleType>
        <xs:list itemType="xs:decimal"/>
     </xs:simpleType>
     ...
     <xs:simpleType>
        <xs:list itemType="xs:dateTime"/>
     </xs:simpleType>
     ..
     <xs:simpleType>
        <xs:list itemType="xs:string"/> 
     </xs:simpleType>
   </xs:union>
 </xs:simpleType>

Time duration data types

Data types for duration can be numeric, or of the XML data type xs:duration, or user-defined strings. Here the recommended data types are xs:decimal for numeric time and xs:duration for calendar time.

 <xs:simpleType name="time.duration.simpleType" final="#all">
   <xs:union>
     <xs:simpleType>
        <xs:restriction base="xs:decimal"/>
     </xs:simpleType>
     <xs:simpleType>
        <xs:restriction base="xs:integer"/>
     </xs:simpleType>
     <xs:simpleType>
        <xs:restriction base="xs:long"/>
     </xs:simpleType>
     <xs:simpleType>
        <xs:restriction base="xs:int"/>
     </xs:simpleType>
     <xs:simpleType>
        <xs:restriction base="xs:short"/>
     </xs:simpleType>
     <xs:simpleType>
        <xs:restriction base="xs:float"/>
     </xs:simpleType>
     <xs:simpleType>
        <xs:restriction base="xs:double"/>
     </xs:simpleType>
     <xs:simpleType>
        <xs:restriction base="xs:duration"/>
     </xs:simpleType>
     <xs:simpleType>
        <xs:restriction base="xs:string"/> 
     </xs:simpleType>
   </xs:union>
 </xs:simpleType>

Again we need a type for lists of durations.

 <xs:simpleType name="time.durations.simpleType" final="#all">
   <xs:union>
     <xs:simpleType>
        <xs:list itemType="xs:decimal"/>
     </xs:simpleType>
     <xs:simpleType>
        <xs:list itemType="xs:integer"/>
     </xs:simpleType>
     ...
     <xs:simpleType>
        <xs:list itemType="xs:duration"/>
     </xs:simpleType>
     <xs:simpleType>
        <xs:list itemType="xs:string"/> 
     </xs:simpleType>
   </xs:union>
 </xs:simpleType>

Time attributes and their interpretation

Lifetimes in GraphML-Time are required to be unions of time intervals where the intervals can have length zero (i.e., are identical to points), can be bounded on both sides, or unbounded to the left or to the right (unbounded on both sides is the default and does not need any time attributes). Without recurrence it is only possible to specify finite unions of intervals. With recurrences it is possible to specify some infinite unions of intervals or to specify some finite unions in a more compact way.

Time attributes can be attached to any GraphML element. Technically, this is achieved by extending the attribute group common.extra.attrib (which is defined in the file graphml-structure.xsd) in the following way:

 <xs:redefine schemaLocation="http://graphml.graphdrawing.org/xmlns/1.2rc/graphml-structure.xsd">
   <xs:attributeGroup name="common.extra.attrib">
     <xs:attributeGroup ref="common.extra.attrib"/>
     <xs:attributeGroup ref="time.attributes"/>
   </xs:attributeGroup>
 </xs:redefine>

The attribute group time.attributes contains all attributes defined in this section.


Attributes to clarify the representation of lifetimes

Attributes time.point.type and time.duration.type explicitly denote the data types for time points or durations.

 <xs:attribute name="time.point.type" type="time.point.type.simpleType" use="optional"/>
 <xs:attribute name="time.duration.type" type="time.duration.type.simpleType" use="optional"/>

The referred simple types are enumerations of specific strings which are the names of the data types. For instance,

 <xs:simpleType name="time.point.type.simpleType" final="#all">
   <xs:restriction base="xs:string">  
     <xs:enumeration value="decimal"/>
     <xs:enumeration value="integer"/>
     ...
     <xs:enumeration value="dateTime"/>
     ...
     <xs:enumeration value="string"/> 
   </xs:restriction>
 </xs:simpleType>

and similar for durations.

If time points or durations are represented by user-defined strings, the GraphML document should specify a pattern (such as "yyyy-MM-dd'T'HH:mm:ssXXX")

 <xs:attribute name="time.point.pattern" type="xs:string" use="optional"/>
 <xs:attribute name="time.duration.pattern" type="xs:string" use="optional"/>

These attributes default (as everything else) to lower elements in the document tree unless overwritten.

Lifetimes without recurrence

Intervals of finite positive lenght

An interval can be specified in two ways: by a start time and end time or by a start time and a duration (the interval length). Thus, the time attributes can include the following three attributes.

 <xs:attribute name="time.interval.start" type="time.point.simpleType" use="optional"/>
 <xs:attribute name="time.interval.end" type="time.point.simpleType" use="optional"/>
 <xs:attribute name="time.interval.length" type="time.duration.simpleType" use="optional"/>

If no other time attributes are given, these attributes specify as lifetime the interval from the start time to either the end time or to the start time plus the duration (in case of calendar time adding durations to time is specified by the XML Schema specification; in the case of numeric times it is obvious). By default the time point specified as the start time is included in the lifetime and the time point specified by the end time or by the start time plus the duration is excluded. This can be overwritten (see below).

Durations can be negative. In this case the interval is closed to the right and open to the left.

To specify the interval as the lifetime, must be later than . Otherwise (if is earlier than ), the two attributes specify two non-overlapping semi-unbounded intervals (see below).

Durations equal to zero and end times equal to start times are disrecommended since intervals of zero lenght should be specified by the time.point attribute (see below).

Use of time.interval.end and time.interval.length in the same element is disrecommended; parsers might choose which of these elements they choose.

Time points (intervals of lenght zero)

To spefify a time point as the lifetime, use the following attribute

 <xs:attribute name="time.point" type="time.point.simpleType" use="optional"/>

Use of time point together with a time interval (for instance, specified by start and end times) is possible; the lifetime is then the union of a point with an interval.

Semi-unbounded intervals

If a start time is not followed by an end time (later), the lifetime is the semi-unbounded interval starting at the start time (inclusive by default) and ending at plus infinity (exclusive).

If an end time is not preceeded by a start time (earlier), the lifetime is the semi-unbounded interval starting at minus infinity (exclusive) and ending at the end time (exclusive by default).

Including / excluding interval borders

Defaults for including or excluding interval borders can be overwritten by the attributes

 <xs:attribute name="time.left.inclusive" type="xs:boolean" use="optional"/>
 <xs:attribute name="time.right.inclusive" type="xs:boolean" use="optional"/>

Infinity or minus infinity is never included.

Unions of intervals or time points

It is possible to store lists of time points and intervals in the following attributes.

 <xs:attribute name="time.points" type="time.points.simpleType" use="optional"/>
 <xs:attribute name="time.intervals.start" type="time.points.simpleType" use="optional"/>
 <xs:attribute name="time.intervals.end" type="time.points.simpleType" use="optional"/>
 <xs:attribute name="time.intervals.length" type="time.durations.simpleType" use="optional"/>

The lifetime of the corresponding element is then the union of these points and/or intervals.

The same comments as when declaring just one interval or point apply. Additionally:

The lists time.intervals.start and time.intervals.end, respectively time.intervals.length, must have the same length and the list items are matched by order.

It is possible to declare additionally to a list of intervals one or two more by use of time.interval.start and/or time.interval.end. This is only recommended for declaring a union of one semi-unbounded interval with other (bounded) intervals.

It is recommended (but not required) that the intervals are non-overlapping and in ascending order.

Recurrent lifetimes

Recurrence allows to specify that lifetimes (defined by the attributes above) repeat indefinitly or a specified number of times or until an end date/time.

Recurrence for equidistant intervals

Uses (some of) the following attributes.

 <xs:attribute name="time.recur" type="xs:boolean" use="optional"/>
 <xs:attribute name="time.recur.left" type="xs:boolean" use="optional"/>
 <xs:attribute name="time.recur.right" type="xs:boolean" use="optional"/>
 <xs:attribute name="time.recur.interval" type="time.duration.simpleType" use="optional"/>
 <xs:attribute name="time.recur.count" type="xs:positiveInteger" use="optional"/>
 <xs:attribute name="time.recur.until" type="time.point.simpleType" use="optional"/>
 <xs:attribute name="time.recur.since" type="time.point.simpleType" use="optional"/>

The boolean attributes define whether there is recurrence at all: time.recur means recurrence to the left and to the right (past and future); time.recur.left means that recurrence starts in the past and extends to the lifetime that is explicitly defined; time.recur.right from the explicit lifetime into the future.

The duration after which the lifetime is repeated is given in time.recur.interval; this must be specified if there is recurrence at all. If neither count nor since nor until is given then the lifetime repeats indefinitly (to the left, to the right, or to both sides).

The number of repetitions can be bounded by time.recur.count. If the recurrence is to both sides and a count is given, then the lifetime is repeated count-many times starting from the explicit lifetime to the right. The attribute time.recur.since defines the start of the recurrence (inclusively). If since is specified: for left recurrence, the lifetime is repeated no further than to the explicit lifetime and not more often than count many times; for right recurrence, since has no effect; for recurrence to both sides, the lifetime is repeated no further than to the until time and not more often than count many times (indefinitelly if not restrictions are imposed). The attribute time.recur.until marks an end for the recurrence; meaning similar to since. Use of count, since, and until is an overspecification (but can also be treated by taking the lower number of repetitions).

Exception in a recurrence

Similar to recurrence - just defines when the lifetime is not repeated. Use attributes

 <xs:attribute name="time.exrecur.interval" type="time.duration.simpleType" use="optional"/>

etc.

Other

Time offset and time unit defines a mapping from numeric time to a calendar timeline.

 <xs:attribute name="time.offset" type="time.point.simpleType" use="optional"/>
 <xs:attribute name="time.unit" type="time.duration.simpleType" use="optional"/>

Time is explicit if the lifetime of every element can be determined only from the attributes of this element (without having to look at any other element). This may be declared in the GraphML document to help parsers.

 <xs:attribute name="time.explicit" type="xs:boolean" use="optional"/>

Panel-data example

The following example encodes a graph that could arise from a panel-design where a network has been observed at four distinct points in time.

 <?xml version="1.0" encoding="UTF-8"?>
 <graphml xmlns="http://graphml.graphdrawing.org/xmlns/graphml" 
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
          xsi:schemaLocation="http://graphml.graphdrawing.org/xmlns/graphml 
                              http://graphml.graphdrawing.org/xmlns/1.2/graphml.xsd"
          xmlns:xs="http://www.w3.org/2001/XMLSchema"   
          time.point.type="xs:int"
          time.duration.type="xs:int">
   <key attr.description="five-point scale on smoking" attr.name="smokes" attr.type="int" for="node" id="d0"/>
   <graph edgedefault="directed" id="G" time.interval.start="1" time.interval.length="4">
     <node id="n0" time.interval.start="1" time.interval.length="3">
       1
       3
       2
     </node>
     <node id="n1">
       4
       0
     </node>
     <edge id="e0" source="n0" target="n1" time.interval.start="2" time.interval.length="2"/>
   </graph>
 </graphml>

Event-data example

 <?xml version="1.0" encoding="UTF-8"?>
 <graphml xmlns="http://graphml.graphdrawing.org/xmlns/graphml" 
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
          xsi:schemaLocation="http://graphml.graphdrawing.org/xmlns/graphml 
                              http://graphml.graphdrawing.org/xmlns/1.2/graphml.xsd"
          xmlns:xs="http://www.w3.org/2001/XMLSchema"   
          time.point.type="xs:string"
          time.point.pattern="yyyy-MM-dd'T'HH:mm:ssXXX">
   <graph edgedefault="directed" id="G">
     <node id="n0" time.interval.start="2009-01-09T16:47:07+01:00" />
     <node id="n1" time.interval.start="2009-07-23T01:24:51+01:00" />
     <edge id="e0" source="n0" target="n1" time.point="2009-07-23T01:24:51+01:00"/>
     <edge id="e1" source="n1" target="n0" time.point="2009-07-23T01:47:23+01:00"/>
   </graph>
 </graphml>

Another example

 <?xml version="1.0" encoding="UTF-8"?>
 <graphml xmlns="http://graphml.graphdrawing.org/xmlns/graphml" 
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
          xsi:schemaLocation="http://graphml.graphdrawing.org/xmlns/graphml 
                              http://graphml.graphdrawing.org/xmlns/1.2/graphml.xsd"
          xmlns:xs="http://www.w3.org/2001/XMLSchema"   
          time.duration.type="xs:duration"
          time.point.type="xs:string"
          time.point.pattern="yyyy-MM-dd'T'HH:mm:ssXXX">
   <key attr.description="five-point scale on smoking" attr.name="smokes" attr.type="int" for="node" id="d0"/>
   <key attr.description="average frequency of contact per month" attr.name="tie weight" attr.type="double" for="edge" id="d1"/>
   <key attr.description="work in the same company" attr.name="co-worker" attr.type="boolean" for="edge" id="d2"/>
   <graph edgedefault="directed" id="G" >
     <node id="n0" time.interval.start="2009-01-09T16:47:07+01:00" time.interval.length="P2Y7M20DT12H53M22.34S">
       1
       3
     </node>
     <node id="n1">
       0
       4
     </node>
     <edge id="e0" source="n0" target="n1">
       3.1
       false
       true
     </edge>
   </graph>
 </graphml>