(defstruct (rule (:type list)) lhs -> rhs) (defstruct (parse) "A parse tree and a remainder." tree rem) ;; Trees are of the form: (lhs . rhs) (defun new-tree (cat rhs) (cons cat rhs)) (defun tree-lhs (tree) (first tree)) (defun tree-rhs (tree) (rest tree))The code computes all possible parses that are acceptable by the grammar. The following examples illustrate its performance:
(use *grammar3*) ;;;; -> 15 (parser '(the table)) ;;;; -> ((NP (ART THE) (NOUN TABLE))) (parser '(the ball hit the table)) ;;;; -> ((SENTENCE (NP (ART THE) (NOUN BALL)) ;;;; (VP (VERB HIT) ;;;; (NP (ART THE) (NOUN TABLE))))) (parser '(the noun took the verb)) ;;;; -> ((SENTENCE (NP (ART THE) (NOUN NOUN)) ;;;; (VP (VERB TOOK) ;;;; (NP (ART THE) (NOUN VERB))))) (use *grammar4*) ;;;; -> 54 (parser '(The man hit the table with the ball)) ;;;; -> ((S (NP (D THE) (N MAN)) ;;;; (VP (VP (V HIT) (NP (D THE) (N TABLE))) ;;;; (PP (P WITH) (NP (D THE) (N BALL))))) ;;;; (S (NP (D THE) (N MAN)) ;;;; (VP (V HIT) ;;;; (NP (NP (D THE) (N TABLE)) ;;;; (PP (P WITH) (NP (D THE) (N BALL))))))) (parser '(the orange saw)) ;;;; -> ((S (NP (D THE) (N ORANGE)) (VP (V SAW))) ;;;; (NP (D THE) (A+ (A ORANGE)) (N SAW)))This version can also recognize unknown words, by hypothesizing that they belong to an open-class pre-terminal syntactic category (N, V, A):
(parser '(John liked Mary)) ;;;; -> ((S (NP (NAME JOHN)) ;;;; (VP (V LIKED) (NP (NAME MARY))))) (parser '(Dana liked Dale)) ;;;; -> ((S (NP (NAME DANA)) ;;;; (VP (V LIKED) (NP (NAME DALE))))) (parser '(the rab zaggled the woogly quax)) ;;;; -> ((S (NP (D THE) (N RAB)) ;;;; (VP (V ZAGGLED) (NP (D THE) (A+ (A WOOGLY)) (N QUAX))))) (parser '(the slithy toves gymbled)) ;;;; -> ((S (NP (D THE) (N SLITHY)) (VP (V TOVES) (NP (NAME GYMBLED)))) ;;;; (S (NP (D THE) (A+ (A SLITHY)) (N TOVES)) (VP (V GYMBLED))) ;;;; (NP (D THE) (A+ (A SLITHY) (A+ (A TOVES))) (N GYMBLED))) (parser '(the slithy toves gymbled on the wabe)) ;;;; -> ((S (NP (D THE) (N SLITHY)) ;;;; (VP (VP (V TOVES) (NP (NAME GYMBLED))) ;;;; (PP (P ON) (NP (D THE) (N WABE))))) ;;;; (S (NP (D THE) (N SLITHY)) ;;;; (VP (V TOVES) (NP (NP (NAME GYMBLED)) ;;;; (PP (P ON) (NP (D THE) (N WABE)))))) ;;;; (S (NP (D THE) (A+ (A SLITHY)) (N TOVES)) ;;;; (VP (VP (V GYMBLED)) (PP (P ON) (NP (D THE) (N WABE))))) ;;;; (NP (NP (D THE) (A+ (A SLITHY) (A+ (A TOVES))) (N GYMBLED)) ;;;; (PP (P ON) (NP (D THE) (N WABE)))))If the recognizer was using morphological clues (e.g., a word ending in "ed" is most likely a verb), it would do a better job.
The only change to the parser to make it accept unknown words is to the following function:
(defparameter *open-categories* '(N V A Name)
"Categories to consider for unknown words")
(defun lexical-rules (word)
"Return a list of rules with word on the right hand side."
(or (find-all word *grammar* :key #'rule-rhs :test #'equal)
(mapcar #'(lambda (cat) `(,cat -> ,word)) *open-categories*)))
General purpose memoization in Common Lisp.
(memoize 'lexical-rules) (memoize 'rules-starting-with) (memoize 'parse :test #'eq) (defun parser (words) "Return all complete parses of a list of words." (clear-memoize 'parse) ;*** Reinitialize the tables (mapcar #'parse-tree (complete-parses (parse words))))
This is illustrated in the following simple grammar to handle commands given to a CD-player machine of the form "Play songs 1 to 5 without 3".
(defparameter *grammar5*
'((NP -> (NP CONJ NP) infix-funcall)
(NP -> (N) list)
(NP -> (N P N) infix-funcall)
(N -> (DIGIT) identity)
(P -> to integers)
(CONJ -> and union)
(CONJ -> without set-difference)
(N -> 1 1) (N -> 2 2) (N -> 3 3) (N -> 4 4) (N -> 5 5)
(N -> 6 6) (N -> 7 7) (N -> 8 8) (N -> 9 9) (N -> 0 0)))
Here the semantics of an NP constituent "1 to 5" is a list of
integers. The semantic interpretation for complex NPs is obtained by
applying the function attached to conjunctions to the semantic
interpretations of the NPs they conjoin.
(use *grammar5*) ;;;; -> 17 (meanings '(1 to 5 without 3)) ;;;; -> ((1 2 4 5)) (meanings '(1 to 4 and 7 to 9)) ;;;; -> ((1 2 3 4 7 8 9)) (meanings '(1 to 6 without 3 and 4)) ;;;; -> ((1 2 4 5 6) (1 2 5 6)) (use *grammar6*) ;;;; -> 18 (meanings '(1 to 6 without 3 and 4)) ;;;; -> ((1 2 5 6)) (meanings '(1 and 3 to 7 and 9 without 5 and 6)) ;;;; -> ((1 3 4 7 9)) (meanings '(1 and 3 to 7 and 9 without 5 and 2)) ;;;; -> ((1 3 4 6 7 9 2)) (meanings '(1 9 8 to 2 0 1)) ;;;; -> ((198 199 200 201)) (meanings '(1 2 3)) ;;;; -> (123 (123))