Semantic Model

We are interested in modeling the behavior of an intelligent device. The device interacts with its users in natural language and performs actions in response to this interaction.

As an example, consider an "intelligent MP3 music player". It reacts to speech commands of the form "play tracks 1 to 5".

Semantic interpretation consists of translating natural language into a useful semantic representation that can serve as the basis of intelligent behavior. The requirements on the semantic representation are:

It should be unambiguous.
It should be possible to define a single "normal" form for any content; that is, it should be possible to reduce any expression to an equivalent expression in normal form. In other words, it should be possible to eliminate variability.
It should lead to easy computation.
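
The normal-form requirement can be sketched on the track-list domain used later in these notes. This is only an illustration under our own assumptions: the names insert-sorted and normalize are hypothetical, and we assume that two track lists mean the same thing when they contain the same numbers, regardless of order or repetition.

```scheme
;; Insert n into an ascending list that has no duplicates,
;; keeping the list ascending and duplicate-free.
(define (insert-sorted n lst)
  (cond ((null? lst) (list n))
        ((= n (car lst)) lst)
        ((< n (car lst)) (cons n lst))
        (else (cons (car lst) (insert-sorted n (cdr lst))))))

;; Hypothetical normal form of a track list: ascending, no duplicates.
(define (normalize tracks)
  (if (null? tracks)
      '()
      (insert-sorted (car tracks) (normalize (cdr tracks)))))

(normalize '(3 1 2))       ; => (1 2 3)
(normalize '(1 3 4 6 7 9 2)) ; => (1 2 3 4 6 7 9)
```

Once every expression is reduced to this form, testing whether two commands denote the same set of tracks is a plain equal? comparison.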

One of the key computations required on a semantic representation is the incremental construction of a semantic model. The model encodes all the information known about a situation up to a given point. This information is the basis for the behavior of the device.

A model is logically consistent if it does not contain logical contradictions. For example, one cannot state in the same model that the music device currently contains exactly 5 songs and that it contains exactly 10 songs.

As a consequence, one of the requirements on the semantic representation is that it should be capable of detecting such contradictions. The device should refuse to update its model if the new information it processes contradicts existing information.

This requires the ability to perform inference on the basis of the knowledge currently encoded in the model. Semantic interpretation is, therefore, the operation of translating a natural language utterance into data that can be merged into an existing semantic model. It will fail if the utterance cannot be understood, or if the understanding of the utterance contradicts existing data in the model.
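
A minimal sketch of such a merge operation, assuming a deliberately simple model: an association list in which each attribute has a single value. The names empty-model, merge-fact and merge-facts are hypothetical. The sketch detects only direct contradictions on the same attribute; it performs no inference.

```scheme
;; A model is an association list of (attribute . value) pairs.
(define empty-model '())

;; Return the updated model, or #f if the new fact contradicts the model.
(define (merge-fact model attribute value)
  (let ((known (assq attribute model)))
    (cond ((not known) (cons (cons attribute value) model)) ; new information
          ((equal? (cdr known) value) model)                 ; already known
          (else #f))))                                       ; contradiction

;; Merge a list of (attribute . value) facts; fail on the first contradiction.
(define (merge-facts model facts)
  (cond ((not model) #f)
        ((null? facts) model)
        (else (merge-facts (merge-fact model (caar facts) (cdar facts))
                           (cdr facts)))))

(define m1 (merge-fact empty-model 'song-count 5))
(define m2 (merge-fact m1 'current-track 3))
(define m3 (merge-fact m2 'song-count 10)) ; contradicts song-count = 5

m2 ; => ((current-track . 3) (song-count . 5))
m3 ; => #f
```

Because merge-fact returns a new list instead of mutating the old one, a rejected update leaves the previous model untouched.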

Model and Task

The model can be more or less detailed. A detailed semantic model could describe the exact physical behavior of the device -- down to the level of the flow of electric current in its computing circuits. A more abstract model could describe the behavior in terms of meaningful actions and state transitions in the device -- as far as they influence its behavior as perceived by human users. The level of abstraction selected for a model depends on the task we expect the device to fulfil.

Implicit Data and Inference

One of the characteristics of natural language, beyond the fact that it is ambiguous, is that it leaves much information implicit: information is conveyed without being stated explicitly in words.

There are several forms of implicitness in natural language; in general, implicitness contributes to the efficiency of natural language in human communication. Grice proposed the following classification of the content of an utterance:

Contents:
- said (explicit)
- implicated (implicit)
  - Conventional
  - Non-conventional
    - Conversational
      - Generalized
      - Particularized
    - Non-conversational

See Grice, H.P., "Further Notes on Logic and Conversation", in Cole, P. (ed.), Syntax and Semantics, Vol. 9: Pragmatics, Academic Press, NY, 1978, pp. 113-127 (available online), and the Wikipedia entry on Paul Grice, for a presentation of this analysis of implicit content in natural language.

Grice introduced the Cooperative Principle and a set of conversational maxims, which help speakers produce implicit content and hearers interpret it, so as to make communication efficient.

Example: Consider a sophisticated music device that can play and record. A command to the device might be: Play track 3 and record it as "Bob's Best No.1".

What information is not stated explicitly in this command? The temporal aspect is left unspecified ("play as soon as possible"). If the device is currently playing another track when this command is processed, what should be done? Interrupt the current track and start the new track, or wait for the previous track to complete? When should the recording start? We expect the recording to start exactly together with the playing of track 3 -- but this is not stated explicitly. Should track 3 be played in such a way that we can listen to it (real-time playout), or as fast as possible to enable recording (which can be done faster than real time)?

One way to look at implicit content is to apply interpretative strategies to understand why a specific content was uttered at a given time and place. The result of this computation (an inference process) is a set of possible continuations of the utterance.

Example: The dog is barking.
Possible continuations:

The dog is barking / Someone is on the road
The dog is barking / You should go check what is wrong
The dog is barking / I told you this dog is upset these days

From a computational perspective, the issues are: What knowledge is involved in activating these possible continuations? What processes are applied to separate relevant continuations from irrelevant ones?

Knowledge Representation

A Knowledge Representation (KR) system provides a formalism and a computation system to implement services to construct and manipulate such models. The requirements on a KR system are:

Decidability - querying the KR system should be decidable (an algorithm exists that answers any query in finite time).
Tractability - the complexity of the decision should be low so that querying is feasible with limited resources.
Uncertainty - it should be possible to express uncertainty and variable degrees of belief in facts.
Monotonicity - once facts are asserted into a model, they remain true, unless they are explicitly retracted.
Consistency - when facts that are asserted contradict existing facts or one of their possible inferences, the assertion should fail.
Expressiveness - it should be possible to express a wide range of facts and relations among them.
Completeness - all possible inferences of the facts asserted in the model should be reachable when queried.
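
Some of these requirements can be illustrated with a toy propositional system. This is a sketch under our own assumptions, not a real KR system: facts are symbols or (not symbol), rules are lists of the form (premises conclusion), and all names (rules, closure, holds?, assert-fact) are hypothetical. Queries are decidable here because the set of derivable facts is finite. The sketch does not address uncertainty.

```scheme
;; Hypothetical rules: (premises conclusion).
(define rules
  '(((playing) powered-on)
    ((powered-on) (not standby))))

(define (rule-premises rule) (car rule))
(define (rule-conclusion rule) (cadr rule))

(define (negate fact)
  (if (and (pair? fact) (eq? (car fact) 'not))
      (cadr fact)
      (list 'not fact)))

(define (all-in? items set)
  (or (null? items)
      (and (member (car items) set)
           (all-in? (cdr items) set))))

;; One pass over the rules: add each conclusion whose premises all hold.
(define (step facts rules)
  (cond ((null? rules) facts)
        ((and (all-in? (rule-premises (car rules)) facts)
              (not (member (rule-conclusion (car rules)) facts)))
         (step (cons (rule-conclusion (car rules)) facts) (cdr rules)))
        (else (step facts (cdr rules)))))

;; Repeat until no new fact is derived.
(define (closure facts rules)
  (let ((next (step facts rules)))
    (if (= (length next) (length facts))
        facts
        (closure next rules))))

;; A fact list is inconsistent if it contains a fact and its negation.
(define (consistent? facts)
  (cond ((null? facts) #t)
        ((member (negate (car facts)) (cdr facts)) #f)
        (else (consistent? (cdr facts)))))

;; Query: does the fact follow from the asserted facts and the rules?
(define (holds? facts fact rules)
  (if (member fact (closure facts rules)) #t #f))

;; Assertion fails (#f) when the new fact, or one of its inferences,
;; contradicts what is already known.
(define (assert-fact facts fact rules)
  (let ((candidate (cons fact facts)))
    (if (consistent? (closure candidate rules)) candidate #f)))

(define f1 (assert-fact '() 'playing rules))
(holds? f1 'powered-on rules)   ; => #t
(assert-fact f1 'standby rules) ; => #f
```

The last call fails even though standby was never mentioned explicitly: the contradiction is with (not standby), which is inferred from playing.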

We will focus on logic-based knowledge representation in the following, using First-Order Logic (FOL) as a basis.

Compositional Semantic Analysis

We start with a basic computational method for the semantic analysis of isolated sentences, which highlights the importance of compositionality.

Parsing with Semantics

The important concept is to use compositional semantic analysis: the semantic interpretation of a node in the parse tree is obtained as a function of the semantic interpretation of its daughter constituents.
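
The principle can be sketched independently of any parser. In the following illustration, which rests on our own assumptions, a parse tree is either a word or a list whose first element is the procedure that combines the meanings of the daughters. The names lexicon, word-meaning, interpret and infix are hypothetical, and a small arithmetic vocabulary stands in for a real lexicon.

```scheme
;; Hypothetical lexicon: the meaning of each word.
(define lexicon
  (list (cons 'one 1) (cons 'two 2) (cons 'three 3)
        (cons 'plus +) (cons 'times *)))

;; Look up a word; the sketch assumes every word is in the lexicon.
(define (word-meaning word) (cdr (assq word lexicon)))

;; A parse tree is either a word (a symbol) or a list
;; (combiner daughter ...). The meaning of a node is the combiner
;; applied to the meanings of its daughters.
(define (interpret tree)
  (if (symbol? tree)
      (word-meaning tree)
      (apply (car tree) (map interpret (cdr tree)))))

;; Combiner for "left operator right" constituents.
(define (infix left operator right) (operator left right))

;; Parse tree for "one plus two times three", with the usual precedence.
(define tree
  (list infix 'one 'plus (list infix 'two 'times 'three)))

(interpret tree) ; => 7
```

The interpreter never inspects the sentence as a whole: each node is interpreted from its daughters only, which is what makes the analysis compositional.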

The following toy semantic interpreter implements this idea: Norvig's Scheme implementation of a bottom-up parser with compositional semantics.

This interpreter includes the same bottom-up parser discussed in the lecture on syntactic parsing. The addition consists of computing a semantic interpretation based on the syntactic parse tree of a sentence.

This is illustrated by the following simple grammar, which handles commands given to a CD player of the form "Play songs 1 to 5 without 3".


(define *grammar6*
  '((NP -> (NP CONJ NP) infix-funcall)
    (NP -> (N)          list)
    (NP -> (N P N)      infix-funcall)
    (N ->  (DIGIT)      identity)
    (N ->  (N DIGIT)    10*N+D)
    (P ->  to           integers)
    (CONJ -> and        union*)
    (CONJ -> without    set-diff)
    (DIGIT -> 1 1) (DIGIT -> 2 2) (DIGIT -> 3 3)
    (DIGIT -> 4 4) (DIGIT -> 5 5) (DIGIT -> 6 6)
    (DIGIT -> 7 7) (DIGIT -> 8 8) (DIGIT -> 9 9)
    (DIGIT -> 0 0)))

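;; Apply the function found in infix position to the two arguments around it.
(define (infix-funcall arg1 function arg2)
    (function arg1 arg2))

;; Union of two disjoint sets; yields '() when the sets overlap.
(define (union* x y)
    (if (null? (intersection x y))
        (append x y)
        '()))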

(define (intersection s1 s2)
    (cond ((null? s1) '())
          ((null? s2) '())
          ((memv (car s1) s2) (cons (car s1) (intersection (cdr s1) s2)))
          (else (intersection (cdr s1) s2))))

(define (set-difference s1 s2)
    (cond  ((null? s1) '())
           ((memv (car s1) s2) (set-difference (cdr s1) s2))
           (else (cons (car s1) (set-difference (cdr s1) s2)))))

(define (subset? s1 s2)
    (cond ((null? s1) #t)
          (else (and (memv (car s1) s2)
                     (subset? (cdr s1) s2)))))

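;; Set difference, defined only when y is a subset of x; yields '() otherwise.
(define (set-diff x y) (if (subset? y x) (set-difference x y) '()))

;; Semantics of (N -> N DIGIT): append one decimal digit to a number.
(define (10*N+D N D) (+ (* 10 N) D))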

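(define (identity x) x)

;; Semantics of "to": the list of integers from start to end, inclusive.
;; NOTE: this definition is not part of the original listing. It is an
;; assumed reconstruction, added so that every function named in the
;; grammar has a definition.
(define (integers start end)
    (if (> start end)
        '()
        (cons start (integers (+ start 1) end))))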

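Here the semantics of an NP constituent such as "1 to 5" is a list of integers. The semantic interpretation of a complex NP is obtained by applying the function attached to the conjunction to the semantic interpretations of the NPs it conjoins. The session below first loads *grammar5*, an earlier version of the grammar that is not listed here, and then *grammar6*.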


(use *grammar5*)
;;;; 17

(meanings '(1 to 5 without 3))
;;;; ((1 2 4 5))

(meanings '(1 to 4 and 7 to 9))
;;;; ((1 2 3 4 7 8 9))

(meanings '(1 to 6 without 3 and 4))
;;;; ((1 2 4 5 6) (1 2 5 6))

(use *grammar6*)
;;;; 18

(meanings '(1 to 6 without 3 and 4))
;;;; ((1 2 5 6))

(meanings '(1 and 3 to 7 and 9 without 5 and 6))
;;;; ((1 3 4 7 9))

(meanings '(1 and 3 to 7 and 9 without 5 and 2))
;;;; ((1 3 4 6 7 9 2))

(meanings '(1 9 8 to 2 0 1))
;;;; ((198 199 200 201))

(meanings '(1 2 3))
;;;; (123 (123))
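
The single result for "1 to 6 without 3 and 4" under *grammar6* can be traced by hand. The sketch below computes the two possible bracketings directly, without the parser. It restates the helpers compactly so that it runs on its own. Two things are assumptions on our part: the definition of integers used here, and the idea that the parser discards readings whose meaning is the empty list, which the transcript suggests but which is not visible in the listing. The names keep, overlap?, reading-a and reading-b are hypothetical.

```scheme
;; Assumed semantics of "to": integers from lo to hi, inclusive.
(define (integers lo hi)
  (if (> lo hi)
      '()
      (cons lo (integers (+ lo 1) hi))))

;; Elements of lst that satisfy pred.
(define (keep pred lst)
  (cond ((null? lst) '())
        ((pred (car lst)) (cons (car lst) (keep pred (cdr lst))))
        (else (keep pred (cdr lst)))))

(define (overlap? x y) (pair? (keep (lambda (e) (memv e y)) x)))
(define (subset? x y) (null? (keep (lambda (e) (not (memv e y))) x)))

;; Same behavior as the functions attached to "and" and "without".
(define (union* x y) (if (overlap? x y) '() (append x y)))
(define (set-diff x y)
  (if (subset? y x)
      (keep (lambda (e) (not (memv e y))) x)
      '()))

;; Reading A: ((1 to 6) without 3) and 4
(define reading-a
  (union* (set-diff (integers 1 6) (list 3)) (list 4)))

;; Reading B: (1 to 6) without (3 and 4)
(define reading-b
  (set-diff (integers 1 6) (union* (list 3) (list 4))))

reading-a ; => ()          track 4 is already in (1 2 4 5 6)
reading-b ; => (1 2 5 6)
```

Reading A asks for the union of two overlapping sets, so union* yields the empty list, and only reading B remains as a meaning.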

This example illustrates, on a simple domain (lists of numbers), the benefits of an unambiguous and normalizable semantic representation, as well as the process of compositional semantic analysis.

We will next turn to a more general semantic representation, based on first-order logic, that fulfils many of the desirable requirements of a semantic representation.

Last modified Jan 06th, 2013