Cognitive Models of Language and Beyond

Assignments Week 1: PCFGs


Hielke Prins, 6359973

Answers

1. Three possible interpretations:

(a) (b) (c) [three parse-tree diagrams, not reproduced here]

2. (a) I personally prefer the meaning 'List the sales of products in 2010' (i.e. give a list of the
products we carried in 2010 and their sales), which I think is the one described by the third tree
in Question 1.

(b) Rules, frequency and probability (amongst rules starting with the same category):

Rule              Frequency   Probability
S → V + NP        2           2/3
S → V + NP + PP   1           1/3
NP → NP + PP      5           5/9
NP → DT + N       3           3/9
NP → N            1           1/9
PP → P + NP       1           1/6
PP → P + N        5           5/6

V → List          3           1
DT → the          3           1
N → sales         3           1/3
N → products      3           1/3
N → 2010          3           1/3
P → in            3           1/2
P → of            3           1/2
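
The probabilities in the table are relative frequencies: each rule's count divided by the total count of all rules with the same left-hand side. A minimal Python sketch of that calculation follows, assuming the counts listed above. The names `rule_counts` and `estimate_probabilities` are illustrative and not part of the assignment.

```python
from collections import defaultdict
from fractions import Fraction

# Rule counts copied from the table: (left-hand side, right-hand side) -> frequency.
rule_counts = {
    ("S", "V NP"): 2,
    ("S", "V NP PP"): 1,
    ("NP", "NP PP"): 5,
    ("NP", "DT N"): 3,
    ("NP", "N"): 1,
    ("PP", "P NP"): 1,
    ("PP", "P N"): 5,
    ("V", "List"): 3,
    ("DT", "the"): 3,
    ("N", "sales"): 3,
    ("N", "products"): 3,
    ("N", "2010"): 3,
    ("P", "in"): 3,
    ("P", "of"): 3,
}


def estimate_probabilities(counts):
    """Divide each rule count by the total count for its left-hand side."""
    totals = defaultdict(int)
    for (lhs, _rhs), n in counts.items():
        totals[lhs] += n
    return {rule: Fraction(n, totals[rule[0]]) for rule, n in counts.items()}


rule_probs = estimate_probabilities(rule_counts)
for (lhs, rhs), p in rule_probs.items():
    print(f"{lhs} -> {rhs}: {p}")
```

`Fraction` reduces automatically, so 3/9 is printed as 1/3. The probabilities for each left-hand side sum to 1, which is a quick sanity check on the table.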

3. (a) Selecting S → V + NP + PP out of all rules starting with S: 1/3


Selecting V → List out of all rules starting with V: 1
Selecting NP → NP + PP out of all rules starting with NP: 5/9
Selecting NP → DT + N out of all rules starting with NP: 3/9
Selecting DT → the out of all rules starting with DT: 1
Selecting N → sales out of all rules starting with N: 1/3
Selecting PP → P + N out of all rules starting with PP: 5/6
Selecting P → of out of all rules starting with P: 1/2
Selecting N → products out of all rules starting with N: 1/3
Selecting PP → P + N out of all rules starting with PP: 5/6
Selecting P → in out of all rules starting with P: 1/2
Selecting N → 2010 out of all rules starting with N: 1/3

1/3 * 1 * 5/9 * 3/9 * 1 * 1/3 * 5/6 * 1/2 * 1/3 * 5/6 * 1/2 * 1/3 = 125/314928
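
The product for 3(a) can be verified with exact rational arithmetic. This is a small Python sketch that multiplies the twelve rule probabilities in the order they are listed above. The variable names are illustrative only.

```python
from fractions import Fraction
from math import prod

# Rule probabilities for the derivation in 3(a), in the order listed.
tree_a_steps = [
    ("S -> V NP PP", Fraction(1, 3)),
    ("V -> List", Fraction(1)),
    ("NP -> NP PP", Fraction(5, 9)),
    ("NP -> DT N", Fraction(3, 9)),
    ("DT -> the", Fraction(1)),
    ("N -> sales", Fraction(1, 3)),
    ("PP -> P N", Fraction(5, 6)),
    ("P -> of", Fraction(1, 2)),
    ("N -> products", Fraction(1, 3)),
    ("PP -> P N", Fraction(5, 6)),
    ("P -> in", Fraction(1, 2)),
    ("N -> 2010", Fraction(1, 3)),
]

# The probability of a tree is the product of the probabilities of its rules.
tree_a_probability = prod(p for _rule, p in tree_a_steps)
print(tree_a_probability)  # 125/314928
```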

(b) Selecting S → V + NP out of all rules starting with S: 2/3


Selecting V → List out of all rules starting with V: 1
Selecting NP → NP + PP out of all rules starting with NP: 5/9
Selecting NP → NP + PP out of all rules starting with NP: 5/9
Selecting NP → DT + N out of all rules starting with NP: 3/9
Selecting DT → the out of all rules starting with DT: 1
Selecting N → sales out of all rules starting with N: 1/3
Selecting PP → P + N out of all rules starting with PP: 5/6
Selecting P → of out of all rules starting with P: 1/2
Selecting N → products out of all rules starting with N: 1/3
Selecting PP → P + N out of all rules starting with PP: 5/6
Selecting P → in out of all rules starting with P: 1/2
Selecting N → 2010 out of all rules starting with N: 1/3

2/3 * 1 * 5/9 * 5/9 * 3/9 * 1 * 1/3 * 5/6 * 1/2 * 1/3 * 5/6 * 1/2 * 1/3 = 625/1417176

(Differences from the tree in 3a, originally marked in bold: S → V + NP replaces S → V + NP + PP,
and NP → NP + PP is applied a second time.)

(c) Selecting S → V + NP out of all rules starting with S: 2/3


Selecting V → List out of all rules starting with V: 1
Selecting NP → NP + PP out of all rules starting with NP: 5/9
Selecting NP → DT + N out of all rules starting with NP: 3/9
Selecting DT → the out of all rules starting with DT: 1
Selecting N → sales out of all rules starting with N: 1/3
Selecting PP → P + NP out of all rules starting with PP: 1/6
Selecting P → of out of all rules starting with P: 1/2
Selecting NP → NP + PP out of all rules starting with NP: 5/9
Selecting NP → N out of all rules starting with NP: 1/9
Selecting N → products out of all rules starting with N: 1/3
Selecting PP → P + N out of all rules starting with PP: 5/6
Selecting P → in out of all rules starting with P: 1/2
Selecting N → 2010 out of all rules starting with N: 1/3

2/3 * 1 * 5/9 * 3/9 * 1 * 1/3 * 1/6 * 1/2 * 5/9 * 1/9 * 1/3 * 5/6 * 1/2 * 1/3 = 125/12754584

Multiplying the numerator and denominator of the first tree's probability by 5 gives 625/1574640,
the same numerator as the second tree's 625/1417176, so only the denominators need to be
compared: the smaller denominator means the higher probability. The PCFG will therefore pick the
second tree (1b), with the first tree (1a) close behind. My personal preference, the third tree
(1c), in fact ends up being by far the least probable one given the small treebank depicted in 1a-c.
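
The ranking can also be checked mechanically. The following Python sketch assumes the three trees have the structures given by the rule lists in 3(a)-(c) and takes its probabilities from the table in 2(b), including 1/6 for PP → P + NP. The nested-tuple encoding and the names `PROBS`, `tree_probability` and `trees` are illustrative choices, not part of the assignment.

```python
from fractions import Fraction

# Rule probabilities from the table in 2(b): (left-hand side, right-hand side).
PROBS = {
    ("S", ("V", "NP")): Fraction(2, 3),
    ("S", ("V", "NP", "PP")): Fraction(1, 3),
    ("NP", ("NP", "PP")): Fraction(5, 9),
    ("NP", ("DT", "N")): Fraction(3, 9),
    ("NP", ("N",)): Fraction(1, 9),
    ("PP", ("P", "NP")): Fraction(1, 6),
    ("PP", ("P", "N")): Fraction(5, 6),
    ("V", ("List",)): Fraction(1),
    ("DT", ("the",)): Fraction(1),
    ("N", ("sales",)): Fraction(1, 3),
    ("N", ("products",)): Fraction(1, 3),
    ("N", ("2010",)): Fraction(1, 3),
    ("P", ("in",)): Fraction(1, 2),
    ("P", ("of",)): Fraction(1, 2),
}


def tree_probability(tree):
    """Multiply the probabilities of all rules used in a nested-tuple tree."""
    label, *children = tree
    # A child is either a subtree (tuple, labelled by its first item) or a word (str).
    rhs = tuple(c[0] if isinstance(c, tuple) else c for c in children)
    p = PROBS[(label, rhs)]
    for child in children:
        if isinstance(child, tuple):
            p *= tree_probability(child)
    return p


# Constituents shared by the three trees.
the_sales = ("NP", ("DT", "the"), ("N", "sales"))
of_products = ("PP", ("P", "of"), ("N", "products"))
in_2010 = ("PP", ("P", "in"), ("N", "2010"))

trees = {
    "a": ("S", ("V", "List"), ("NP", the_sales, of_products), in_2010),
    "b": ("S", ("V", "List"), ("NP", ("NP", the_sales, of_products), in_2010)),
    "c": ("S", ("V", "List"),
          ("NP", the_sales,
           ("PP", ("P", "of"), ("NP", ("NP", ("N", "products")), in_2010)))),
}

probabilities = {name: tree_probability(t) for name, t in trees.items()}
# Print the trees from most to least probable.
for name, p in sorted(probabilities.items(), key=lambda kv: kv[1], reverse=True):
    print(name, p, float(p))
```

Under these assumptions the trees come out in the order (b), (a), (c), matching the conclusion above.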

4. In cognitive terms, a probabilistic context-free grammar (PCFG) assumes that our brains are
adapted to the way we actually (seem to) use language. Given this assumption, the obvious way
to increase the adequacy of any PCFG model is to increase the size and representativeness of the
treebank. Adding new trees to the bank will shift the rule frequencies towards numbers that
reflect actual use of these rules.
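
To illustrate how a larger treebank shifts the estimates, the Python sketch below adds one extra tree to the syntactic rule counts from 2(b) and recomputes the relative frequencies. The extra tree is purely hypothetical (it reuses the structure of tree 1c) and is not data from the assignment. The names `treebank`, `extra_tree` and `relative_frequencies` are likewise illustrative.

```python
from collections import Counter
from fractions import Fraction

# Counts of the syntactic rules in the three-tree bank (table in 2(b)).
treebank = Counter({
    ("S", "V NP"): 2, ("S", "V NP PP"): 1,
    ("NP", "NP PP"): 5, ("NP", "DT N"): 3, ("NP", "N"): 1,
    ("PP", "P NP"): 1, ("PP", "P N"): 5,
})

# HYPOTHETICAL fourth tree with the same structure as tree (c);
# an assumption made only to show the effect of adding data.
extra_tree = Counter({
    ("S", "V NP"): 1,
    ("NP", "NP PP"): 2, ("NP", "DT N"): 1, ("NP", "N"): 1,
    ("PP", "P NP"): 1, ("PP", "P N"): 1,
})


def relative_frequencies(counts):
    """Rule count divided by the total count for the same left-hand side."""
    totals = Counter()
    for (lhs, _rhs), n in counts.items():
        totals[lhs] += n
    return {rule: Fraction(n, totals[rule[0]]) for rule, n in counts.items()}


before = relative_frequencies(treebank)
after = relative_frequencies(treebank + extra_tree)

for rule in treebank:
    print(rule, before[rule], "->", after[rule])
```

With this single hypothetical addition the estimate for PP → P + NP rises from 1/6 to 1/4, which shows how sensitive a three-tree bank is to each new observation.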

Increasing the number of available syntactic or lexical categories might likewise increase the
capacity of PCFGs to reflect daily use. The limitations of PCFG models, however, might soon
become a burden. Realistic grammars will have to collect enormous amounts of usage data and
may still have trouble handling exceptions and rare constructions, especially when these
depend on a mix of semantic and grammatical constraints. An interesting improvement might
therefore be to include mechanisms that extract rules and lexical items from patterns in the context.
