Parse Tree

From visone manual
Latest revision as of 12:49, 16 November 2010

A parse tree represents the syntactic structure of a sentence according to the grammar of the language in question.

For example, we could construct the following toy grammar for the English language using the subject-predicate-object structure.

Sentence ::= NP VP
           | NP VP NP
NP ::= Noun
     | DT Noun
VP ::= Verb

Here we use the conventional names NP for noun phrases, VP for verb phrases, and DT for determiners such as ''the'' or ''an''.

Using this grammar, we can describe sentences like ''John sleeps'' or ''the dog eats the cake''. In the following, the rules are numbered by counting the alternatives of the grammar from top to bottom. For ''John sleeps'', we use the first rule, which states that a Sentence can be an NP followed by a VP. By the third rule, an NP can be just a noun, such as ''John''; similarly, by the fifth rule, a VP can consist of just a verb, such as ''sleeps''. Because Sentence is split into NP and VP, which are then specialized into Noun and Verb respectively, it makes sense to draw this derivation as a tree in which each grammatical entity is connected to the entity it is derived from. The parse tree for the example sentence ''John sleeps'' is thus:

  Sentence
   /    \
 NP      VP
  |      |
Noun    Verb
  |      |
John   sleeps
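To make the tree structure concrete, here is a minimal Python sketch of the same tree. The nested-tuple representation and the helper names `leaves` and `height` are assumptions chosen for illustration; they are not part of visone or of any particular parser.

```python
# A parse tree as nested tuples: (label, child, child, ...).
# A leaf is a plain string holding the word itself.
john_sleeps = (
    "Sentence",
    ("NP", ("Noun", "John")),
    ("VP", ("Verb", "sleeps")),
)


def leaves(tree):
    """Return the words at the bottom of the tree, left to right."""
    if isinstance(tree, str):
        return [tree]
    words = []
    for child in tree[1:]:
        words.extend(leaves(child))
    return words


def height(tree):
    """Number of edges on the longest path from this node down to a word."""
    if isinstance(tree, str):
        return 0
    return 1 + max(height(child) for child in tree[1:])


print(" ".join(leaves(john_sleeps)))  # John sleeps
print(height(john_sleeps))            # 3
```

Reading the leaves from left to right gives back the original sentence, which is a quick sanity check for any hand-built tree.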

As a more interesting example, let us consider the sentence ''Sammy eats a mouse'' (Sammy being our university's resident cat). Here, the second rule is used, decomposing a Sentence into an NP (the subject), a VP (the predicate) and another NP (the object). Both the first NP and the VP again simply derive a noun (''Sammy'') and a verb (''eats''), using the third and fifth rule respectively. The second NP, however, uses the fourth rule to split further into a determiner (''a'') and a noun (''mouse''). The parse tree thus shows the second NP decomposing into DT and Noun:

     Sentence
   /   |      \
 NP    VP      NP
  |    |      /  \
Noun  Verb  DT   Noun
  |    |     |    |
Sammy eats   a   mouse
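The derivation above can also be found mechanically. The following is a small backtracking top-down parser for the toy grammar, written as a sketch under stated assumptions: the `LEXICON` table (which words count as Noun, Verb or DT) is hypothetical and not part of the grammar given above, and the function names are made up for this example.

```python
# Toy grammar from the text: each nonterminal maps to its alternatives.
GRAMMAR = {
    "Sentence": [["NP", "VP"], ["NP", "VP", "NP"]],
    "NP": [["Noun"], ["DT", "Noun"]],
    "VP": [["Verb"]],
}

# Hypothetical lexicon (an assumption, not part of the original grammar):
# the words that each word category may produce.
LEXICON = {
    "Noun": {"John", "Sammy", "dog", "cake", "mouse"},
    "Verb": {"sleeps", "eats"},
    "DT": {"the", "a", "an"},
}


def parse(symbol, words, start):
    """Yield (tree, next_position) for every way `symbol` can match
    a prefix of words[start:]."""
    if symbol in LEXICON:
        if start < len(words) and words[start] in LEXICON[symbol]:
            yield (symbol, words[start]), start + 1
        return
    for alternative in GRAMMAR[symbol]:
        yield from expand(alternative, words, start, (symbol,))


def expand(symbols, words, position, partial):
    """Match the remaining `symbols` of one alternative in order,
    extending the partial tree as each one succeeds."""
    if not symbols:
        yield partial, position
        return
    for subtree, next_position in parse(symbols[0], words, position):
        yield from expand(symbols[1:], words, next_position, partial + (subtree,))


def parse_sentence(text):
    """Return the first tree covering the whole sentence, or None."""
    words = text.split()
    for tree, end in parse("Sentence", words, 0):
        if end == len(words):
            return tree
    return None


print(parse_sentence("Sammy eats a mouse"))
```

For ''Sammy eats a mouse'', the first rule matches only ''Sammy eats'' and leaves two words unconsumed, so the parser backtracks and succeeds with the second rule. This simple strategy terminates here because the toy grammar has no left-recursive rules; it is not meant as a general-purpose parsing algorithm.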

Some systems, such as the well-known [http://nlp.stanford.edu/software/lex-parser.shtml Stanford Parser], instead print the tree sideways in a bracketed notation. The structure is preserved; only the presentation differs and is more compact. In this notation, the tree above would look approximately like this:

(Sentence
  (NP (Noun Sammy))
  (VP (Verb eats))
  (NP (DT a)
      (Noun mouse)))
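Producing the bracketed form from a tree is a short recursive walk. The sketch below assumes the tree is stored as nested tuples of the form `(label, child, ...)` with words as plain strings; both this representation and the name `to_brackets` are illustrative assumptions, and the output is printed on a single line rather than indented.

```python
def to_brackets(tree):
    """Render (label, child, ...) tuples as a one-line bracketed string."""
    if isinstance(tree, str):
        return tree  # a word is printed as-is
    label, *children = tree
    return "(" + " ".join([label] + [to_brackets(c) for c in children]) + ")"


sammy = (
    "Sentence",
    ("NP", ("Noun", "Sammy")),
    ("VP", ("Verb", "eats")),
    ("NP", ("DT", "a"), ("Noun", "mouse")),
)

print(to_brackets(sammy))
# (Sentence (NP (Noun Sammy)) (VP (Verb eats)) (NP (DT a) (Noun mouse)))
```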

Using a much more sophisticated grammar, the Stanford Parser can describe a wide range of English text with such a tree. The Stanford-provided example sentence ''My dog also likes eating bananas'' is represented by:

(ROOT
  (S
    (NP (PRP$ My) (NN dog))
    (ADVP (RB also))
    (VP (VBZ likes)
      (S
        (VP (VBG eating)
          (S
            (ADJP (NNS bananas))))))
    (. .)))
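Bracketed output like this is easy to read back into a tree structure. The following sketch is a minimal hand-written reader, not the Stanford Parser's own API; the names `read_brackets` and `words` and the nested-list representation are assumptions made for this example.

```python
import re


def read_brackets(text):
    """Parse a bracketed tree string into nested lists: [label, child, ...]."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    position = 0

    def read_node():
        nonlocal position
        token = tokens[position]
        position += 1
        if token != "(":
            return token  # a word (leaf)
        node = [tokens[position]]  # the label follows the opening bracket
        position += 1
        while tokens[position] != ")":
            node.append(read_node())
        position += 1  # skip the closing bracket
        return node

    return read_node()


def words(node):
    """Collect the words of the tree from left to right."""
    if isinstance(node, str):
        return [node]
    return [w for child in node[1:] for w in words(child)]


example = """
(ROOT
  (S
    (NP (PRP$ My) (NN dog))
    (ADVP (RB also))
    (VP (VBZ likes)
      (S
        (VP (VBG eating)
          (S
            (ADJP (NNS bananas))))))
    (. .)))
"""

tree = read_brackets(example)
print(tree[0])                 # ROOT
print(" ".join(words(tree)))   # My dog also likes eating bananas .
```

The reader assumes well-formed input with balanced brackets and does no error handling.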