Difference between revisions of "CRA"

From visone manual

Revision as of 22:57, 25 November 2010

Centering Resonance Analysis (CRA) extracts a network from a text by analysing its centers, which, according to Centering Theory, carry the main content of the text. These centers are the Noun Phrases (NPs) of the text, that is, the nouns together with any modifiers belonging to them. The words within these centers become the nodes of the CRA text network, and the way they occur in the text determines the links between them.

Let us consider the following example sentence taken from the short story We Can Remember It for You Wholesale by Philip K. Dick:

Half an ancient silver fifty cent piece, several quotations from John Donne's sermons written incorrectly, each on a separate piece of transparent tissue-thin paper, ...

In the first step, this sentence is parsed to extract the NPs. For each of the words appearing in an NP, a node is created in the graph.

With the NPs marked in square brackets:

Half an [ancient silver] fifty [cent piece], several [quotations] from [John Donne]'s sermons written incorrectly, each on a [separate piece] of [transparent tissue-thin paper], ...

http://tqzamf.ath.cx/pkd/cra0.png
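
The node-creation step can be sketched in a few lines of Python. This is only an illustration: the NP grouping is written out by hand rather than produced by a parser, and the names `noun_phrases` and `nodes` are hypothetical, not part of visone.

```python
# Noun phrases of the example sentence, typed in by hand.
# Assumption: no parser is run here; the grouping is for illustration only.
noun_phrases = [
    ["ancient", "silver"],
    ["cent", "piece"],
    ["quotations"],
    ["John", "Donne"],
    ["separate", "piece"],
    ["transparent", "tissue-thin", "paper"],
]

# One node per word occurrence, identified by
# (index of the NP, position inside the NP, word).
nodes = [
    (np_index, position, word)
    for np_index, words in enumerate(noun_phrases)
    for position, word in enumerate(words)
]

for node in nodes:
    print(node)
```

At this stage a word that occurs in two NPs still has two separate nodes.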

Next, words that occur within the same NP are connected, regardless of their distance within the NP. For example, the NP transparent tissue-thin paper creates edges between transparent and paper, paper and tissue-thin, and tissue-thin and transparent. Had there been five words in the NP, each of them would have been connected to all four of the others.

http://tqzamf.ath.cx/pkd/cra2.png
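
Connecting every pair of words inside one NP amounts to enumerating all unordered pairs. A minimal sketch, assuming an NP is given as a plain list of words (the function name `within_np_edges` is hypothetical):

```python
from itertools import combinations


def within_np_edges(noun_phrase):
    """Return one edge for every unordered pair of words in the NP."""
    return list(combinations(noun_phrase, 2))


# The three-word NP from the example yields three edges.
edges = within_np_edges(["transparent", "tissue-thin", "paper"])

# A five-word NP (placeholder words) yields ten edges,
# so each word is linked to the four others.
five = within_np_edges(["a", "b", "c", "d", "e"])

print(edges)
print(len(five))
```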

Words that are not part of the same NP are still connected if one of them is at the end of an NP and the other is at the beginning of the following NP. For example, because piece is the last word of the NP separate piece and transparent is the first word of the NP following it, the two words are connected in the network.

http://tqzamf.ath.cx/pkd/cra3.png
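
The link between consecutive NPs can be sketched by pairing each NP with its successor. This assumes the NPs are supplied as word lists in text order; the function name `boundary_edges` is hypothetical.

```python
def boundary_edges(noun_phrases):
    """Link the last word of each NP to the first word of the next NP."""
    return [
        (left[-1], right[0])
        for left, right in zip(noun_phrases, noun_phrases[1:])
    ]


links = boundary_edges([
    ["separate", "piece"],
    ["transparent", "tissue-thin", "paper"],
])
print(links)
```

For the two NPs shown, the only link produced is the one between piece and transparent.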

Finally, duplicate nodes for the same word are merged. In the example, piece occurs twice in the sentence, so its two nodes become one.

http://tqzamf.ath.cx/pkd/cra4.png

http://tqzamf.ath.cx/pkd/cra5.png
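
Putting the steps together, merging falls out naturally when nodes are keyed by the word itself. The following is a rough sketch under several assumptions: the NPs are typed in by hand, words are compared by exact spelling, and edges are unweighted. The function name `cra_network` is hypothetical and this is not visone's implementation.

```python
from itertools import combinations


def cra_network(noun_phrases):
    """Build (nodes, edges) from a list of NPs given in text order."""
    edges = set()
    # Connect all words inside the same NP.
    for words in noun_phrases:
        for a, b in combinations(words, 2):
            if a != b:
                edges.add(frozenset((a, b)))
    # Connect the end of each NP to the start of the next one.
    for left, right in zip(noun_phrases, noun_phrases[1:]):
        if left[-1] != right[0]:
            edges.add(frozenset((left[-1], right[0])))
    # Using the word as the node key merges duplicate occurrences.
    nodes = {word for words in noun_phrases for word in words}
    return nodes, edges


noun_phrases = [
    ["ancient", "silver"],
    ["cent", "piece"],
    ["quotations"],
    ["John", "Donne"],
    ["separate", "piece"],
    ["transparent", "tissue-thin", "paper"],
]
nodes, edges = cra_network(noun_phrases)

# Neighbours of the merged node "piece".
neighbours = sorted(w for e in edges if "piece" in e for w in e if w != "piece")
print(len(nodes), len(edges), neighbours)
```

Because both occurrences of piece map to the same node, that node collects the neighbours of both NPs it appears in.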