Parse Tree

From visone manual
Latest revision as of 12:49, 16 November 2010

A parse tree represents the syntactic structure of a sentence with respect to the grammar of the language in question.

For example, we could construct the following toy grammar for the English language using the subject-predicate-object structure.

Sentence ::= NP VP
           | NP VP NP
NP ::= Noun
     | DT Noun
VP ::= Verb

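Here we use the conventional names NP for noun phrases, VP for verb phrases and DT for determiners such as "the" or "an". The symbols Noun, Verb and DT have no rules of their own; each stands for a single word of that category.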
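
To make the rule numbering used on this page concrete, the grammar can be written down as an ordered list. This is only an illustrative sketch in Python; the names RULES, rules_for and is_word_category are made up for this example and are not part of visone or of any parser.

```python
# The toy grammar as an ordered list of (left-hand side, right-hand side) rules.
# The order matches the "first rule" ... "fifth rule" numbering used in the text.
RULES = [
    ("Sentence", ("NP", "VP")),        # rule 1
    ("Sentence", ("NP", "VP", "NP")),  # rule 2
    ("NP", ("Noun",)),                 # rule 3
    ("NP", ("DT", "Noun")),            # rule 4
    ("VP", ("Verb",)),                 # rule 5
]


def rules_for(symbol):
    """Return the 1-based numbers of all rules that expand `symbol`."""
    return [i for i, (lhs, _) in enumerate(RULES, start=1) if lhs == symbol]


def is_word_category(symbol):
    """A symbol without a rule of its own (Noun, Verb, DT) is filled by a word."""
    return not rules_for(symbol)


if __name__ == "__main__":
    for number, (lhs, rhs) in enumerate(RULES, start=1):
        print(f"rule {number}: {lhs} ::= {' '.join(rhs)}")
```

Keeping the rules in a list rather than a dictionary preserves their order, which is what lets the text refer to them by number.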

Using this grammar, we can describe sentences like "John sleeps" or "the dog eats the cake". For "John sleeps", we use the first rule, which states that a Sentence can be an NP followed by a VP. By the third rule, an NP can be just a noun, such as "John"; similarly, by the fifth rule, a VP can consist of just a verb, such as "sleeps". Because Sentence is split into NP and VP, which are then further specialized into Noun and Verb respectively, it makes sense to draw this derivation as a tree in which each grammatical entity is connected to the entity it is derived from. The parse tree for the example sentence "John sleeps" is therefore:

  Sentence
   /    \
 NP      VP
  |      |
Noun    Verb
  |      |
John   sleeps

As a more interesting example, let us consider the sentence "Sammy eats a mouse" (Sammy being our university's resident cat). Here, the second rule is used, decomposing a Sentence into an NP (the subject), a VP (the predicate) and another NP (the object). As before, the first NP and the VP simply derive a noun ("Sammy") and a verb ("eats") using the third and fifth rules respectively. The second NP, however, uses the fourth rule to split further into a determiner ("a") and a noun ("mouse"). The parse tree thus shows the second NP decomposing into DT and Noun:

     Sentence
   /   |      \
 NP    VP      NP
  |    |      /  \
Noun  Verb  DT   Noun
  |    |     |    |
Sammy eats   a   mouse
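
The two derivations above can be reproduced mechanically. The following is a minimal sketch of a backtracking top-down parser for the toy grammar, written in Python. The word list LEXICON and all function names are assumptions made for this example; the grammar itself does not say which words are nouns, verbs or determiners.

```python
GRAMMAR = {
    "Sentence": [("NP", "VP"), ("NP", "VP", "NP")],  # rules 1 and 2
    "NP": [("Noun",), ("DT", "Noun")],               # rules 3 and 4
    "VP": [("Verb",)],                               # rule 5
}

# Hypothetical lexicon: maps each known word to its word category.
LEXICON = {
    "John": "Noun", "Sammy": "Noun", "dog": "Noun",
    "cake": "Noun", "mouse": "Noun",
    "sleeps": "Verb", "eats": "Verb",
    "a": "DT", "an": "DT", "the": "DT",
}


def parse(symbol, words, pos):
    """Yield (tree, next_pos) for every way `symbol` can match words[pos:]."""
    if symbol not in GRAMMAR:
        # Word category: it matches exactly one word of that category.
        if pos < len(words) and LEXICON.get(words[pos]) == symbol:
            yield (symbol, words[pos]), pos + 1
        return
    for rhs in GRAMMAR[symbol]:
        yield from parse_sequence(symbol, rhs, words, pos, [])


def parse_sequence(symbol, rhs, words, pos, children):
    """Match the remaining right-hand-side symbols one after another."""
    if not rhs:
        yield (symbol, *children), pos
        return
    for child, next_pos in parse(rhs[0], words, pos):
        yield from parse_sequence(symbol, rhs[1:], words, next_pos,
                                  children + [child])


def parse_sentence(text):
    """Return the first tree that covers the whole sentence, or None."""
    words = text.split()
    for tree, end in parse("Sentence", words, 0):
        if end == len(words):
            return tree
    return None


if __name__ == "__main__":
    print(parse_sentence("John sleeps"))
    print(parse_sentence("Sammy eats a mouse"))
```

Each tree is a nested tuple whose first element is the label and whose remaining elements are the children, mirroring the drawings. For "Sammy eats a mouse" the parser first tries rule 1, finds that two words are left over, and only then succeeds with rule 2.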

Some systems, such as the well-known Stanford Parser (http://nlp.stanford.edu/software/lex-parser.shtml), instead print the tree sideways, in a bracketed and indented form. The structure is preserved; only the presentation differs and is more compact. The tree above would look approximately like this:

(Sentence
  (NP (Noun Sammy))
  (VP (Verb eats))
  (NP (DT a)
      (Noun mouse)))
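
As a rough illustration of why the two presentations are equivalent, the sketch below prints a tree stored as nested Python tuples in bracketed form. The tuple layout and the function name to_brackets are assumptions for this example, and the output is a single line rather than the indented layout a real parser would print.

```python
def to_brackets(node):
    """Render a (label, child, ...) tuple tree in bracketed notation."""
    if isinstance(node, str):
        # A plain string is a word at the bottom of the tree.
        return node
    label, *children = node
    return "(" + " ".join([label] + [to_brackets(c) for c in children]) + ")"


# The tree for "Sammy eats a mouse", written out by hand.
SAMMY_TREE = (
    "Sentence",
    ("NP", ("Noun", "Sammy")),
    ("VP", ("Verb", "eats")),
    ("NP", ("DT", "a"), ("Noun", "mouse")),
)

if __name__ == "__main__":
    print(to_brackets(SAMMY_TREE))
```

Every opening parenthesis starts a node and is followed by its label, so the nesting of parentheses carries exactly the same information as the edges of the drawn tree.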

Using a much more sophisticated grammar, the Stanford Parser can produce such a tree for nearly arbitrary English text. The Stanford-provided example sentence "My dog also likes eating bananas" is represented by:

(ROOT
  (S
    (NP (PRP$ My) (NN dog))
    (ADVP (RB also))
    (VP (VBZ likes)
      (S
        (VP (VBG eating)
          (S
            (ADJP (NNS bananas))))))
    (. .)))
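
Bracketed output like this is easy to read back into a tree. The sketch below assumes well-formed input with balanced parentheses, and labels and words that contain no whitespace or parentheses; read_brackets and leaves are hypothetical helper names, not functions of the Stanford Parser.

```python
import re


def read_brackets(text):
    """Turn bracketed notation into nested lists of the form [label, child, ...]."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def read_node():
        nonlocal pos
        if tokens[pos] != "(":
            word = tokens[pos]  # a bare token is a word (a leaf)
            pos += 1
            return word
        pos += 1  # skip "("
        node = []
        while tokens[pos] != ")":
            node.append(read_node())
        pos += 1  # skip ")"
        return node

    tree = read_node()
    if pos != len(tokens):
        raise ValueError("unexpected text after the end of the tree")
    return tree


def leaves(node):
    """Collect the words of the sentence from left to right."""
    if isinstance(node, str):
        return [node]
    return [word for child in node[1:] for word in leaves(child)]


EXAMPLE = """
(ROOT
  (S
    (NP (PRP$ My) (NN dog))
    (ADVP (RB also))
    (VP (VBZ likes)
      (S
        (VP (VBG eating)
          (S
            (ADJP (NNS bananas))))))
    (. .)))
"""

if __name__ == "__main__":
    print(" ".join(leaves(read_brackets(EXAMPLE))))
```

Reading the leaves from left to right recovers the words of the original sentence, which is a quick check that no structure was lost in the bracketed form.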