next up previous contents
Next: FDs as Graphs Up: Precise Characterization of FDs Previous: Precise Characterization of FDs

           
Generalities: Features, Syntax, Paths and Equations

An FD is a list of pairs, called features. The attribute of a feature needs to be a symbol. The value of a feature can be either a leaf or recursively an FD. Here is an example:  

          
(1) ((cat np)
     (det ((cat article)
	   (definite yes)))
     (n   ((cat noun)
	   (number plural))))
          
          

A ``leaf'' is a primitive fd. It can be either a symbol, a number, a string, a character or an array.

A given attribute in an FD must have at most ONE value. Therefore, the FD ((size 1) (size 2)) is illegal. In fact FDs can be viewed as a conjunction of constraints on the description of an object: for an object to be described by ((size 1) (size 2)) it would need to have its property size to have both the values 1 and 2. Conversely, if the attribute size does not appear in the FD, that means its value is not constrained and it can be anything. The FD nil (empty list of pairs) thus represents all the objects in the world. The pair (att nil) expresses the constraint that the value of att can be anything. It is therefore useless, and the FD ((att1 nil) (att2 val2)) is exactly equivalent to the FD ((att2 val2)).        

Any position in an FD can be unambiguously referred to by the ``path'' leading from the top-level of the FD to the value considered. For example, FD (1) can be described by the set of expressions:

          
{cat} = np
{det cat} = article
{det definite} = yes
{n cat} = noun
{n number} = plural
          
          

Paths are represented as simple lists of atoms between curly braces (for example, {det definite}). This notation is not ambiguous because at each level there is at most one feature with a given attribute.  

A path can be ``absolute'' or ``relative.'' An absolute path gives the way from the top-level of the FD down to a value. A relative path starts with the symbol ``^'' (up-arrow). It refers to the FD embedding the current feature. You can have several ``^'' in a row to go up several levels.     For example:

          
((cat s)
 (prot ((cat np)
	(number sing)))
 (verb ((cat vp)
	(number {^ ^ prot number}))))
                     ^
_____________________|
this is refering to the absolute path {prot number}
          
          

The notation {^4 x} is equivalent to {^ ^ ^ ^ x}. It is convenient when dealing with deeply embedded constituents.  

Relative paths are not simply a syntactic convenience, but they extend the expressibility of the formalism, by making grammars ``relocatable''. For example, the grammar for NPs can be unified with a subconstituent of the input FD at different levels ({agent} and {affected} for example). In each case, a feature like (determiner ((number {^ ^ number}))) points to the number of the appropriate constituent. Without relative paths such a general constraint could not be expressed.

The value of a pair can be a path. In that case, it means that the values of the pair pointed to by the path and the value of the current pair must always be the same. The two features are then said to be unified. In the previous example, the features at the paths {verb number} and {prot number} are unified. This means that they are absolutely equivalent, they are two names for the same object (structure sharing). This is equivalent to the systemic operation of ``conflation''.      

In general, an expression of the form x = y, where either x or y is a path or a leaf is called an equation.   An fd can be viewed as a flat list of equations.

In FUF, it is possible to have paths on the left of a pair. It is therefore possible to represent an fd as a list of equations as follows:

          
(({cat} np)
 ({det cat} article)
 ({det definite} yes)
 ({n cat}  noun)
 ({n number}  plural))
          
          

This notation allows to freely mix the ``fds as equations'' view with the ``fds as structure'' one.23

The only case where a given attribute can appear in several pairs is when it is followed by paths in all but one pairs. That is:

          
((a ((a1 v1)))
 (a {b})
 (a {c}))
          
          

is a valid FD. It is equivalent for example to:

          
((b ((a1 v1)))
 (a {b})
 (c {b}))

or to:

((b ((a1 v1))) ({a} {b}) (c {a}))

The function normalize-fd   is convenient to put an FD into its canonical form. For example:

          

(setf fd1 '((a ((a1 v1))) (b ((b1 w1))) (a ((a2 v2))) (b ((b2 ((w2 2))))) (b ((b2 ((w3 3)))))))

LISP> (normalize fd1)

((a ((a1 v1) (a2 v2))) (b ((b1 w1) (b2 ((w2 2) (w3 3))))))

All unification functions assume that the input fd is given in canonical form. normalize-fd is particularly useful when the inputs are produced incrementally by a program. Note that normalize-fd will fail and return *fail* if the input FD is not consistent (for example ((a 1) (a 2))).  


next up previous contents
Next: FDs as Graphs Up: Precise Characterization of FDs Previous: Precise Characterization of FDs
Michael Elhadad - elhadad@cs.bgu.ac.il