As mentioned previously, the generation of a sentence includes two subprocesses: unification and linearization. Unification produces a complex description of a sentence, made of several constituents. Each constituent is described by an FD, and can recursively contain other subconstituents.
Linearization takes such a complex, unordered description and outputs a linear, ordered string of words. This operation is constrained by directives placed within the FD. These ordering constraints appear as the value of the special attribute pattern.
For example, in a sentence containing the constituents prot, goal and verb, the following pattern can be used:
(PATTERN (PROT VERB GOAL))
The constituents correspond to features of the FD describing the sentence. That is, this FD must contain pairs with the attributes prot, verb and goal. For example:
((cat S)
 (PROT (...))
 (GOAL (...))
 (VERB (...))
 (PATTERN (PROT VERB GOAL)))
If a constituent mentioned in the pattern is not present in the FD, nothing happens: the linearization of an empty (or nonexistent) constituent is the empty string.
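The behavior described above can be illustrated with a minimal Python sketch. It is not the actual linearizer: the nested-dict encoding of an FD, the use of plain strings as words, and the function name linearize are assumptions made for illustration only.

```python
# Hypothetical model: an FD is a nested dict, a word is a plain string,
# and the "pattern" feature is a list of constituent names.
def linearize(fd):
    """Return the words of fd in the order given by its pattern."""
    if isinstance(fd, str):
        return fd                      # a leaf is already a word
    words = []
    for name in fd.get("pattern", []):
        sub = fd.get(name)
        if sub is None:
            continue                   # absent constituent: empty string
        text = linearize(sub)
        if text:
            words.append(text)
    return " ".join(words)

sentence = {
    "cat": "s",
    "prot": {"cat": "np", "head": "John", "pattern": ["head"]},
    "verb": "likes",
    "goal": {"cat": "np", "head": "Mary", "pattern": ["head"]},
    # "adverb" is named in the pattern but is not present in the FD
    "pattern": ["prot", "verb", "goal", "adverb"],
}
print(linearize(sentence))  # John likes Mary
```

The missing adverb contributes nothing to the output, and each subconstituent is linearized recursively according to its own pattern.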
The pattern directives are generally added by the grammar, since the input to the unifier should be a semantic representation and therefore does not contain any constraint on word ordering.
NOTE: Patterns can contain full paths to specify constituents. For example, the following is a legal pattern:
(PATTERN ({prot n} {verb v} goal))
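As a rough illustration of what a path in a pattern designates, the following Python sketch follows a path through a nested FD. The dict encoding, the tuple notation for paths, and the name resolve are hypothetical. The sketch also assumes that a path is followed starting from the FD that holds the pattern, which this section does not state.

```python
# Hypothetical model: an FD is a nested dict; a path such as {prot n}
# is written as the tuple ("prot", "n"); a plain name is a string.
def resolve(fd, entry):
    """Return the sub-FD designated by a pattern entry, or None if absent."""
    path = entry if isinstance(entry, tuple) else (entry,)
    node = fd
    for attr in path:
        if not isinstance(node, dict) or attr not in node:
            return None
        node = node[attr]
    return node

fd = {"prot": {"n": "dog"}, "verb": {"v": "barks"}}
pattern = [("prot", "n"), ("verb", "v"), "goal"]
words = [resolve(fd, entry) for entry in pattern]
print(" ".join(w for w in words if w))  # dog barks
```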
The following symbols have a special meaning for the pattern unifier: dots and pound (standing respectively for the notations `...' and `#').
A pattern (c1 ... c2) (noted in the program (c1 dots c2)) indicates that the constituent c1 must precede the constituent c2, but they need not be adjacent. Zero, one or many other constituents can come in between. The pattern (c1 ... c2) still requires the sentence to start with constituent c1 and to end with c2. The pattern (... c1 ... c2 ...) only forces c1 to come before c2.
The pound (#) symbol is used to represent 0 or 1 constituent. For example, if you want to allow a sentence to start with an optional adverbial, you can specify it with the pattern (# prot ... verb ...). This directive will be compatible with both (prot verb goal) and (adverb prot verb goal) for example.
As a consequence of the use of the two symbols pound and dots, the constraints described by pattern directives are PARTIAL orderings.
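The meaning of dots and pound can be made concrete with a small matcher that checks whether a fully ordered list of constituents satisfies a pattern. This is only a sketch of the semantics described above, not the pattern unifier itself; the function name matches and the list encoding are assumptions.

```python
DOTS, POUND = "dots", "pound"

def matches(pattern, order):
    """True if the constituent sequence `order` satisfies `pattern`."""
    if not pattern:
        return not order               # both must be exhausted together
    head, rest = pattern[0], pattern[1:]
    if head == DOTS:
        # dots stands for zero, one or many constituents
        return any(matches(rest, order[i:]) for i in range(len(order) + 1))
    if head == POUND:
        # pound stands for zero or one constituent
        return matches(rest, order) or (bool(order) and matches(rest, order[1:]))
    # a named constituent must appear at exactly this position
    return bool(order) and order[0] == head and matches(rest, order[1:])

p = [POUND, "prot", DOTS, "verb", DOTS]
print(matches(p, ["prot", "verb", "goal"]))            # True
print(matches(p, ["adverb", "prot", "verb", "goal"]))  # True
print(matches(["c1", DOTS, "c2"], ["x", "c1", "c2"]))  # False
```

The last call fails because a pattern without leading dots requires the sentence to start with c1.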
NOTE: because of the presence of dots and pound, the unification of patterns is a non-deterministic operation. It can produce several results for a given input, and there is no way to predict the order in which these possible solutions will be tried. Caution should be exercised when specifying patterns: they should be specific enough to allow only acceptable word orderings (do not use too many dots), but not so specific that they rule out constituents that are not yet supported (for example, a sentence can start with an Adverbial, not necessarily an NP).
The following example illustrates the fact that pattern unification is non-deterministic in general:
Pattern Unification:
p1: (pattern (dots a dots b dots))
p2: (pattern (dots c dots d dots))
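To see why several results are possible, the following Python sketch enumerates the total orderings of a, b, c and d that are compatible with both p1 and p2. It assumes that the four names denote distinct constituents and that no other constituent is involved. It does not reproduce the unifier's algorithm, the form of its results (which may themselves still contain dots), or the order in which it tries them.

```python
from itertools import permutations

def respects(order, pattern):
    """True if the named constituents of `pattern` keep their relative order.

    Only valid for patterns of the form (dots x dots y dots ...).
    """
    names = [c for c in pattern if c != "dots"]
    positions = [order.index(c) for c in names]
    return positions == sorted(positions)

p1 = ["dots", "a", "dots", "b", "dots"]
p2 = ["dots", "c", "dots", "d", "dots"]

solutions = [o for o in permutations("abcd")
             if respects(o, p1) and respects(o, p2)]
for o in solutions:
    print(" ".join(o))
print(len(solutions))  # 6
```

Each of the six orderings keeps a before b and c before d, so all of them are acceptable outcomes of combining the two partial orderings.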
Patterns are eventually interpreted by the linearization component to produce a string out of an FD.
An appendix describes some advanced uses of pattern unification.