We use for semantic representation language First Order Logic - as presented in the previous lecture. This language provides: the syntax for semantic expressions and an intepreter that can verify whether a semantic formula holds with respect to a specific model (satisfy(Formula, Model)).
We now want to define an algorithm that maps syntactic parse trees to semantic formulas in a compositional manner.
Different implementations of this interface can perform different services. The file tell-ask.scm includes a simple implementation called literal-kb. The literal-kb implementation simply stores the formulae that are asserted through tell, and has no inferencing power - only the facts that are asserted can be retrieved through the ask operation.
The implementation relies on unification: if one queries a formula with free variables, they are unified with the facts found in the knowledge base. In the following example, note the usage of the free variables $x (the $ sign is used to distinguish variables from constants):
> (define lk (make-literal-kb ())) ... > (tell lk '(a 1)) > (tell lk '(a 2)) > (ask-pattern lk '(a $x) '$x) 1 > (ask-patterns lk '(a $x) '$x) (1 2) > (tell lk '(b 1 2)) > (tell lk '(b 2 3)) > (ask-pattern lk '(b $x $y) '($x $y)) (1 2) > (ask-patterns lk '(b $x $y) '($x $y)) ((1 2) (2 3))The unification algorithm is implemented in the unify.scm Scheme file.
To account for the unification mechanism, the knowledge-base interface is extended with 2 methods:
Formulae that can be asserted in a horn-kb are a subset of the first order logic formulae. They are limited to the Horn form: they are definite clauses which is a conjunction of positive terms except for one term (called the head of the clause).
When restricted to such formulae, the resolution algorithm can determine whether a clause is satisfiable given a set of existing clauses. This algorithm is implemented in the function back-chain-each.
With the inferencing possibility, a horn-kb can retrieve data even if they have not been asserted explicitly, but can be infered from asserted facts. For example, even though no fact (q 1) was asserted, the fact (q 1) can be retrieved, because it can be infered from (p 1) and (=> (p 1) (q 1)):
> (define hk (init-horn-kb)) > (tell hk '(forall (x) (=> (p x) (q x)))) > (tell hk '(p 1)) > (ask-pattern hk '(q $x) '$x) 1The resulting system is functionally similar to a Prolog interpreter (just much less efficient).
Definite Clause Grammars (DCG) are an encoding of Context Free Grammars on top of a horn clause interpreter. They were invented together with the Prolog programming language.
The first observation of the method is that a CFG rule such as:
S --> NP VPcan be interpreted as follows:
NP(l1) and VP(l2) and append(l1, l2, l3) => S(l3)That is: in order to prove that a list of words l3 can be parsed as an S non-terminal, one must find two lists l1 and l2 such that l1 and l2 appended together yield l3, and l1 can be parsed as an NP, and l2 can be parsed as a VP.
To ease the parsing, we introduce a special encoding of lists that has 2 properties:
(append (l1 l2) (l2 l3) (l1 l3))This can be read simply as (l1 - l2) + (l2 - l3) = (l1 - l3).
We use this encoding to translate a CFG into a DCG as follows:
Parse with this DCG: can the list [he sleeps] be parsed as an S? (ask k1 '(s (cons he (cons sleeps $x)) $x)) #t
(np s1 s2 sem) (vp s1 s2 sem) (s s1 s2 sem)The rules are extended to manipulate these semantic interpretation. The semantic interpretation is manipulated through unification between the various constituents on the right hand side of rules and the left hand side of the rules. Note: the exact meaning of these unification constraints must be understood well. It is exploited as a restricted form of lambda calculus computation which will be clarified in the next section.
DCG with semantic interpretation ;; NP --> "he" | someone (tell k1 '(np (cons he $r) $r someone)) VP --> "sleeps" | sleep(X) (tell k1 '(vp (cons sleeps $r) $r (sleep $s))) S --> NP(np-sem) VP(vp-sem) | apply(vp-sem np-sem) (tell k1 '(=> (and (np $e1 $e2 $sub) (vp $e2 $e3 ($pred $sub))) (s $e1 $e3 ($pred $sub)))) [(cons he (cons sleeps $x)) $x ] is the difference-list encoding of [he sleeps] (ask-pattern k1 '(s (cons he (cons sleeps $x)) $x $sem) '$sem) (sleep someone) Use the DCG to enumerate sentences (generate) (ask-pattern k1 '(s $e $x (sleep someone)) '$e) (cons he (cons sleeps $r.7993))This grammar is available in the dcg-horn.scm Scheme file. To load all the required Scheme code at once in SISC, load the load.scm Scheme file.
(parse '(Every man that runs loves a woman))At the syntactic level, only very simple relative clauses are supported: those where the relative pronoun is bound to the subject of the relative clause. Thus this grammar cannot parse this sentence:
Every man that Alice loves __ runs.When we introduce quantifiers, such as "a" or "every", the semantic representation of the sentences includes logical quantifiers such as forall and exists. When a sentence includes several quantifiers, they generally can be scoped in various manners. Consider the classical examples:
Every man loves a woman. Every US citizen has a president.The default interpretation for the 2 sentences has different quantifier scopes:
(forall (?m) (=> (man ?m) (exists (?w) (and (woman ?w) (loves ?m ?w))))) (exists (?p) (and (president ?p) (forall (?c) (=> (citizen ?c US) (president-of ?c ?p)))))In this grammar, we will generate a semantic interpretation that leaves the quantifier scope undefined. The scoping will be computed later by using additional information by a post-processor.
The structure of the semantic interpretation is therefore NOT the traditional FOL syntax. Instead, we will have new "special forms" for quantifier placeholders. This is the under-specified semantic representation we will computed for the 2 sentences above:
(and (all ?m (man ?m)) (ex ?w (woman ?w)) (loves ?m ?w)) (and (all ?c (citizen ?c US)) (ex ?p (president ?p)) (president-of ?c ?p))Note that at this level, the 2 semantic expressions have the same structure. A post-processing operation will generate the expected FOL from these structures. Note also that all and ex are NOT the same as forall and exists.
(rule Head --> Goal1 ...)This notation is then translated into the horn clause encoding of DCGs by adding systematically 2 parameters for the difference list covered by each constituent. The code performing this transformation is provided in this dcg-rule Scheme file.
The man that Mary loves __ likes Annie.We indicate in this sentence the position of the gap with the __ notation. The coverage of relative clauses as (that VP) that we used in the previous section is clearly insufficient to handle such constructs. The last grammar in dcgs.scm presents a grammar that uses the Gap construct in NP, VP and S constructs.