Using SURGE

Natural Language Generation (201-1-2971-1)
Notes Week 5 - Spring 1996 - Michael Elhadad

Motivation
Structure of the grammar
The transitivity system
Composite processes
The mood system
Adjuncts and Disjuncts

Motivation

Consider the following problem: you are given two strings, "S1: John eats" and "S2: John sleeps" and you are asked to build a correct English sentence that combines the two events into one sentence expressing that S2 occurred after S1 in the past (something like: "S3: After eating, John slept").
Such transformations are very difficult to achieve when working directly on the strings. One needs to know the syntactic structure of the small sentences in order to combine them appropriately. The input to SURGE provides a good level of representation for such manipulations. Instead of starting with the strings S1 and S2, we will start with the Functional Descriptions (FDs) I1 and I2 describing S1 and S2 respectively.
The second motivation for using a syntactic realization module like SURGE is to provide an interface between a lexical chooser and the grammar within a complete generation system. The responsibility of the realization module is to abstract away from the complexity of the syntax and to present a simple and compositional interface to the lexical chooser.
Our main goal in this tutorial is to learn how to write inputs for sentences in SURGE. Inputs to SURGE look like the following example:

(def-test give1
  "John does not often give it to Mary."
  ((cat clause)
   (adverb ((lex "often")))
   (polarity negative)
   (process ((type composite)
             (relation-type possessive)
             (lex "give")))
   (participants ((agent ((cat proper) (lex "John")))
                  (affected ((cat proper) (lex "Mary")))
                  (possessor {^ affected})
                  (possessed ((cat pronoun)))))))

The theory of grammar implemented in SURGE provides a definition for terms like "clause", "process", "participants", "agent", "possessor", etc. The following notes give the highlights of this theory.

Structure of the grammar

SURGE supports the following parts of speech:

Clause (cat clause)
Nominal group (cat np)
Verb group (cat verb-group)
Adjectival group (cat ap)
Prepositional phrase (cat pp)
Adverbs (cat adv)

At the top level, the grammar is an alternation of subgrammars (one for each part of speech). Each subgrammar is in turn subdivided into a set of "systems", which are the main decision points of the subgrammar. Every constituent in a SURGE input must have a well-specified cat feature, or one that can be inferred from its position within a higher-level constituent. For example, in the following FD:

(def-test t1
  "This car is expensive."
  ((cat clause)
   (process ((type ascriptive)))
   (participants ((carrier ((lex "car")
                            (cat common)
                            (distance near)))
                  (attribute ((lex "expensive")))))))

Here the cat of the attribute participant is not specified: by default, attributes are adjectival phrases, so the category can be inferred from the constituent's position.
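As a rough illustration of this kind of inference, here is a minimal Python sketch that fills in a missing cat from the role a constituent plays. It is only an analogy: in SURGE the default comes out of unification with the grammar, not from a lookup table. The function `infer_cats` and the table `DEFAULT_CAT_BY_ROLE` are hypothetical names, and the only default taken from these notes is that attributes are adjectival phrases (ap).

```python
# Hypothetical table of default categories, keyed by the role a constituent fills.
# Only the "attribute" entry comes from the notes; SURGE encodes its defaults
# inside the grammar itself.
DEFAULT_CAT_BY_ROLE = {"attribute": "ap"}


def infer_cats(fd, role=None):
    """Return a copy of fd where constituents lacking a cat get their role's default."""
    if not isinstance(fd, dict):
        return fd  # atomic value, nothing to infer
    out = {key: infer_cats(value, key) for key, value in fd.items()}
    if "cat" not in out and role in DEFAULT_CAT_BY_ROLE:
        out["cat"] = DEFAULT_CAT_BY_ROLE[role]
    return out


# The FD t1 from above, transcribed as nested dicts.
t1 = {
    "cat": "clause",
    "process": {"type": "ascriptive"},
    "participants": {
        "carrier": {"lex": "car", "cat": "common", "distance": "near"},
        "attribute": {"lex": "expensive"},
    },
}

completed = infer_cats(t1)
print(completed["participants"]["attribute"])
```

The carrier keeps the cat it was given explicitly, and the original input is left unchanged.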
The grammar for the clause category is the most complex. It consists of four main systems:

Transitivity system: determines the type of the main process and its participants.
Mood system: determines whether the clause is finite (declarative, interrogative or relative) or non-finite (imperative, infinitive, participial).
Voice system: active, passive, causative etc.
Circumstantials: determines the structure of modifiers to the predicate and to the clause as a whole.

According to systemic theory, a clause can be viewed as realizing several layers of meaning into a single linguistic constituent. The most important way to classify these layers of meaning is by referring to the three meta-functions that language satisfies:

Ideational: language as a representation of the world.
Inter-personal: language as a social event (exchange among speaker and hearer).
Textual: language as an element of an extended discourse, information as text in a linear context.

Each function of the clause belongs to one of these three meta-functions. For example, the transitivity system belongs to the ideational meta-function, mood to the inter-personal, and voice to the textual. This explains why each function can be studied independently of the others, as each system is largely orthogonal to the others. Eventually, though, all the decisions taken on the clause must be combined into one coherent linguistic structure. This is the point where unification plays a crucial role: it allows the grammar-writer to combine decisions from several orthogonal systems in a natural way, through the values shared by a set of attributes. One way to view the decision process going on inside the grammar is that each system posts constraints on the values of a set of attributes, and the unification mechanism finds a combined set of values that satisfies all these constraints at once.
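The idea of systems posting constraints that unification then combines can be sketched in a few lines of Python. This is a deliberately simplified unifier over nested dictionaries with atomic leaves; it leaves out paths, alternations and everything else a real FUF grammar uses. The attribute names and values posted by the three "systems" below are invented for illustration and are not SURGE's actual features.

```python
from functools import reduce


class UnificationFailure(Exception):
    """Raised when two descriptions give incompatible values to one attribute."""


def unify(fd1, fd2):
    """Unify two feature structures given as nested dicts with atomic leaves."""
    if isinstance(fd1, dict) and isinstance(fd2, dict):
        result = dict(fd1)
        for attr, value in fd2.items():
            # Shared attributes must unify recursively; new ones are simply added.
            result[attr] = unify(result[attr], value) if attr in result else value
        return result
    if fd1 == fd2:
        return fd1
    raise UnificationFailure(f"{fd1!r} conflicts with {fd2!r}")


# Constraints posted by three orthogonal systems (hypothetical feature names).
transitivity = {"process": {"type": "material"}, "subject": {"semantic-role": "agent"}}
mood = {"mood": "declarative", "subject": {"position": "before-verb"}}
voice = {"voice": "active", "subject": {"semantic-role": "agent"}}

# The combined description satisfies all three sets of constraints at once.
clause = reduce(unify, [transitivity, mood, voice])
print(clause["subject"])

# A constraint that contradicts an earlier decision makes unification fail.
try:
    unify(clause, {"mood": "imperative"})
except UnificationFailure as failure:
    print("failed:", failure)
```

Each system only mentions the attributes it cares about. The shared subject attribute is where their decisions meet, and a conflict on any shared attribute rejects the whole combination.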

The transitivity system

The transitivity system determines what participants contribute to the meaning of the clause, when the clause is viewed as a description of an event or relation in the world. At its heart, the clause is the description of a process - a generic term that can refer to either an event or a relation (it is unrelated to the aspectual distinction between process, event and state). Participants surface as linguistic constituents that satisfy the following linguistic criteria:

They can surface as subject in one syntactic alternation of the clause. For example:

John gives Mary a book.
Mary is given a book by John.
A book is given to Mary by John.

This indicates that John, Mary, and a book are participants in these clauses.

They cannot be moved around in the clause without affecting the other constituents. For example:

John eats a pie.
*John a pie eats.
?A pie John eats. (This would need a comma after "pie".)

For non-participants, by contrast, moving is easier:

John eats a pie on the sofa.
On the sofa John eats a pie.

They cannot be omitted from the clause. For example:

John uses a car to travel.
*John uses to travel.
John uses a car.

NOTE: Each one of these criteria taken alone is not sufficient to characterize participants, but taken together they have proven quite reliable.
Semantically, participants correspond to the nuclear roles of the process. In knowledge representation terms, a process is a relation among terms. The participants are the terms that fill the basic arity of the relation. Additional terms can then be added compositionally to modify the meaning of the process or of the predicate (these correspond to sentence and predicate adjuncts, as explained below).

Composite processes

The mood system

Adjuncts and Disjuncts

Last modified April 17th, 1996 Michael Elhadad