CRA: Difference between revisions
Revision as of 14:03, 30 November 2010
Centering Resonance Analysis (CRA) extracts a network from a text by analysing its centers, which, according to Centering Theory, carry the main content of the text. Centering Theory identifies these centers with the noun phrases (NPs) of a text, that is, the nouns together with any modifiers belonging to them. The words within these centers therefore become the nodes of the CRA text network, and the way they occur in the text determines the links between them.
Let us consider the following example sentence taken from the short story We Can Remember It for You Wholesale by Philip K. Dick:
Half an ancient silver fifty cent piece, several quotations from John Donne's sermons written incorrectly, each on a separate piece of transparent tissue-thin paper, ...
In the first step, this sentence is parsed to extract the NPs. For each of the words appearing in an NP, a node is created in the graph.
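The node-creation step can be sketched in Python as follows. This is only an illustration under stated assumptions: the NPs are chunked by hand rather than by a parser, only the two NPs named in the text (separate piece, transparent tissue-thin paper) are used, and the function name `create_nodes` and the tuple layout are hypothetical.

```python
# Hand-annotated NPs (assumption: a real pipeline would obtain these from a parser).
noun_phrases = [
    ["separate", "piece"],
    ["transparent", "tissue-thin", "paper"],
]


def create_nodes(phrases):
    """Return one node per word occurrence as (phrase_index, word_index, word).

    Duplicate words get separate nodes at this stage; merging happens later.
    """
    return [
        (i, j, word)
        for i, phrase in enumerate(phrases)
        for j, word in enumerate(phrase)
    ]


nodes = create_nodes(noun_phrases)
for node in nodes:
    print(node)
```

Keeping one node per occurrence, rather than per distinct word, mirrors the description above, where duplicates are only merged in the final step.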
Half an ancient silver fifty cent piece, several quotations from John Donne's sermons written incorrectly, each on a separate piece of transparent tissue-thin paper, ...
[[File:CRA0.png]]
Next, words that occur within the same NP are connected, regardless of their distance within the NP. For example, the NP transparent tissue-thin paper creates edges between transparent and paper, paper and tissue-thin, and tissue-thin and transparent. Had there been five words in the NP, each of them would have been connected with all four of the others.
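Connecting every pair of words in an NP amounts to building a complete subgraph over its words. A minimal Python sketch, assuming an NP is given as a list of words and edges are undirected (the helper name `intra_np_edges` is hypothetical):

```python
from itertools import combinations


def intra_np_edges(phrase):
    """Return the undirected edges linking every pair of words in one NP."""
    return {frozenset(pair) for pair in combinations(phrase, 2)}


edges = intra_np_edges(["transparent", "tissue-thin", "paper"])
for edge in sorted(sorted(e) for e in edges):
    print(edge)

# An NP with five words yields 5 * 4 / 2 = 10 edges: each word is linked to the four others.
five_word_edges = intra_np_edges(["a", "b", "c", "d", "e"])
```

Using `frozenset` for an edge makes the pair order irrelevant, which matches the undirected links described above.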
Half an ancient silver fifty cent piece, several quotations from John Donne's sermons written incorrectly, each on a separate piece of transparent tissue-thin paper, ...
[[File:CRA1.png]]
Words that are not part of the same NP are still connected if one of them is at the end of an NP and the other is at the beginning of the following NP. That is, because piece is the last word of the NP separate piece and transparent is the first word of the NP following it, the two words are connected in the network.
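The linking of adjacent NPs can be sketched in Python by pairing each NP with its successor. The sketch assumes the NPs are given in text order as lists of words; the helper name `boundary_edges` is hypothetical.

```python
def boundary_edges(phrases):
    """Link the last word of each NP to the first word of the NP that follows it."""
    return [
        (current[-1], following[0])
        for current, following in zip(phrases, phrases[1:])
        if current and following  # skip empty phrases defensively
    ]


links = boundary_edges([
    ["separate", "piece"],
    ["transparent", "tissue-thin", "paper"],
])
print(links)
```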
Half an ancient silver fifty cent piece, several quotations from John Donne's sermons written incorrectly, each on a separate piece of transparent tissue-thin paper, ...
[[File:CRA2.png]]
Finally, duplicate nodes for the same word are merged. For example, because piece appears twice, two nodes were created for it.
[[File:CRA3.png]]
These two nodes are now merged into just one.
[[File:CRA4.png]]
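Putting the steps together, the merge can be sketched in Python by first building edges between word occurrences and then relabelling each occurrence with its word. This is a sketch under assumptions, not a reference implementation: the first NP below is a guessed chunking, the three NPs are treated as consecutive for brevity (in the actual sentence other NPs lie between them), and all function names are hypothetical.

```python
from itertools import combinations


def occurrence_edges(phrases):
    """Edges between word occurrences, each identified as (phrase_idx, word_idx)."""
    edges = []
    for i, phrase in enumerate(phrases):
        ids = [(i, j) for j in range(len(phrase))]
        edges.extend(combinations(ids, 2))  # links within one NP
        if i + 1 < len(phrases) and phrase and phrases[i + 1]:
            edges.append((ids[-1], (i + 1, 0)))  # last word to first word of next NP
    return edges


def merge_duplicates(phrases, edges):
    """Collapse all occurrences of the same word into a single node."""

    def word(occurrence):
        phrase_idx, word_idx = occurrence
        return phrases[phrase_idx][word_idx]

    merged = {frozenset((word(a), word(b))) for a, b in edges}
    return {edge for edge in merged if len(edge) == 2}  # drop self-loops


phrases = [
    ["ancient", "silver", "fifty", "cent", "piece"],  # assumed chunking
    ["separate", "piece"],
    ["transparent", "tissue-thin", "paper"],
]
raw_edges = occurrence_edges(phrases)
graph = merge_duplicates(phrases, raw_edges)
words = set().union(*graph)
print(len(raw_edges), "occurrence edges,", len(graph), "merged edges,", len(words), "words")
```

After merging, the single piece node carries the links of both of its occurrences, so it is connected to the words of the first NP as well as to separate and transparent.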