Michael Elhadad - NLP Fall 2013

Question Answering with Semantic Interpretation

We will review how to develop a question-answering system based on semantic analysis of statements and questions:
  1. Domain and Knowledge Acquisition
  2. DCG Parsing with Semantic Interpretation
  3. Putting Things Together: Question/Answering System

Domain and Knowledge Acquisition

Consider a system that manages facts about geographic relations. The semantic predicates we are interested in covering are:
  1. Types of individuals: country, region, city, continent
  2. Properties of individuals: population
  3. Relations between individuals: located-in, capital-of, has-border-with
Examples of basic facts we store in the model are:
country(israel)
country(jordan)
region(negev)
city(beersheba)

population(beersheba, 200000)
population(israel, 7000000)
capital-of(negev, beersheba)
capital-of(israel, jerusalem)
has-border-with(israel, jordan)
located-in(israel, negev)
located-in(negev, beersheba)
located-in(negev, yeruham)
In addition, we can store rules in the model, such as:
located-in(?country, ?region) and located-in(?region, ?city) => located-in(?country, ?city)
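To make the effect of such a rule concrete, here is a minimal sketch in Python (not the Scheme/Prolog-style engine used in the course). It stores a few of the facts above as tuples and applies the transitivity rule by forward chaining. The tuple encoding and the function name located_in_closure are assumptions made for illustration; the argument order follows the facts above, with the containing place first.

```python
# Facts are stored as tuples: (predicate, arg1[, arg2]).
FACTS = {
    ("country", "israel"),
    ("country", "jordan"),
    ("region", "negev"),
    ("city", "beersheba"),
    ("population", "beersheba", 200000),
    ("capital-of", "negev", "beersheba"),
    ("has-border-with", "israel", "jordan"),
    ("located-in", "israel", "negev"),
    ("located-in", "negev", "beersheba"),
    ("located-in", "negev", "yeruham"),
}


def located_in_closure(facts):
    """Forward-chain the transitivity rule until no new fact is derived."""
    derived = {f for f in facts if f[0] == "located-in"}
    changed = True
    while changed:
        changed = False
        for (_, a, b) in list(derived):
            for (_, c, d) in list(derived):
                # located-in(a, b) and located-in(b, d) => located-in(a, d)
                if b == c and ("located-in", a, d) not in derived:
                    derived.add(("located-in", a, d))
                    changed = True
    return derived


closure = located_in_closure(FACTS)
print(("located-in", "israel", "beersheba") in closure)
```

Forward chaining always terminates here because the set of individuals is finite, at the cost of materializing every derived fact up front.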
Such facts can be extracted from many data sources on the Web. For example, the CIA World Factbook contains up-to-date and reliable information on many countries. This information has previously been encoded as Prolog facts ("CIA world factbook data as Prolog facts" by Ronen Feldman and Amir Zilberstein). Examples of semantic information to extract from the CIA pages include:
  • population
  • area
  • border countries (this is a list)
  • coastline
  • capital
  • GDP
  • GDP per capita
Additional information can be found in resources such as the TIPSTER Gazetteer data set, which contains about 240,000 basic world geography facts. For example:
    Beersheba (CITY 3) Southern (PROVINCE) Israel (COUNTRY)
    Beersheba Springs (CITY) Tennessee (PROVINCE 1) United States (COUNTRY)
The format of this data set is explained in a separate document. The dataset itself is a 1.5MB file (13MB uncompressed). The TIPSTER Gazetteer data has served as the basis for the geonames Web service.
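As an illustration of how such records could be turned into facts, the following Python sketch parses lines shaped like the two examples above. It assumes, from those examples only, that each record lists places from the most specific to the most general and that each name is followed by a type in parentheses. The meaning of the optional number after the type is not interpreted, and the function names are hypothetical.

```python
import re

# One "Name (TYPE [n])" group; the optional number after the type is kept uninterpreted.
ENTRY = re.compile(r"\s*([^()]+?)\s*\(([A-Z]+)(?:\s+(\d+))?\)")


def parse_gazetteer_line(line):
    """Return a list of (name, type, number_or_None), in the order found on the line."""
    return [
        (name, kind, int(num) if num else None)
        for name, kind, num in ENTRY.findall(line)
    ]


def to_facts(line):
    """Turn one record into type facts and located-in facts (containing place first)."""
    places = [
        (name.lower().replace(" ", "_"), kind.lower())
        for name, kind, _ in parse_gazetteer_line(line)
    ]
    facts = [(kind, name) for name, kind in places]
    # Assumption: each place is located in the place that follows it on the line.
    for (inner, _), (outer, _) in zip(places, places[1:]):
        facts.append(("located-in", outer, inner))
    return facts


for fact in to_facts("Beersheba (CITY 3) Southern (PROVINCE) Israel (COUNTRY)"):
    print(fact)
```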

Given these data sources, we can define the vocabulary of our semantic model:

  1. Individuals (it is convenient to list individuals by type)
  2. Predicates of arity 1 and 2
We can then define inference rules over these properties to capture generalizations such as the located-in example given above. Because the first-order inference engine we use is Prolog-like, we must be careful to avoid "circular" rules such as:
has-border-with(?x, ?y) => has-border-with(?y, ?x)

To prove has-border-with(a, b), such a rule asks the engine to prove has-border-with(b, a), which triggers the same rule again, so the inference loops in the Horn knowledge base.
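The looping behavior can be simulated in a few lines of Python. This is a sketch under assumptions, not the behavior of horn.scm itself: naive_has_border_with mimics depth-first backward chaining with the circular rule and uses a depth cap (a hypothetical safeguard) to make the loop visible, while has_border_with shows one common workaround, which is to store each border once and check both argument orders at query time.

```python
BORDERS = {("israel", "jordan")}


def has_border_with(x, y):
    """Symmetric lookup: check both argument orders, with no recursive rule."""
    return (x, y) in BORDERS or (y, x) in BORDERS


def naive_has_border_with(x, y, depth=0, max_depth=50):
    """Backward chaining with the circular rule; the depth cap exposes the loop."""
    if (x, y) in BORDERS:
        return True
    if depth >= max_depth:
        raise RecursionError("the circular has-border-with rule keeps re-firing")
    # To prove has-border-with(x, y), the rule asks us to prove has-border-with(y, x).
    return naive_has_border_with(y, x, depth + 1, max_depth)


print(has_border_with("jordan", "israel"))
print(has_border_with("jordan", "negev"))
try:
    naive_has_border_with("jordan", "negev")
except RecursionError as err:
    print("looping:", err)
```

The naive version still answers queries that are supported by a stored fact. It is the queries with no supporting fact that never terminate without the cap.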

When properly formatted, our knowledge base can be converted into an instance of the Horn-kb knowledge base provided by horn.scm.


DCG Parsing with Semantic Interpretation

Given the domain as defined above, we must then prepare a DCG grammar to parse assertions and questions in the geographical domain. If we combine syntactic parsing and semantic analysis (as discussed so far), we must tune the coverage of our grammar carefully. As in the grammars discussed so far, it must cover:
  1. Transitive and intransitive verbs
  2. Relative clauses with unbounded dependencies (only "that" as a relative marker)
  3. Quantifiers: a, every, the
In addition, we must support:
  1. There-is constructs: There are 124 cities in Israel
  2. Y/N questions: Is Israel located in Europe?
  3. Wh-questions: Where is Israel located?, What is the population of Israel?
  4. Wh-determiner questions: Which cities are located in Israel?
  5. How-many questions: How many regions are there in Israel?
  6. Prepositional phrases: Jerusalem is the capital of Israel
  7. Adjectives: Tel Aviv is a large city in Israel
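To illustrate how parsing and semantic interpretation can be combined, here is a small recursive-descent sketch in Python that plays the role of a DCG with semantic attachments. It covers only a tiny fragment (proper nouns, the quantifiers "a" and "every", and the verb phrase "is located in"), and the lexicon, function names, and output notation are assumptions made for this example. Each noun phrase denotes a function that takes a scope and returns a formula, which is how the quantifiers wrap the rest of the sentence.

```python
PROPER_NOUNS = {"israel", "negev", "beersheba"}
NOUNS = {"city", "region", "country"}


def parse_np(tokens):
    """NP -> ProperNoun | Det Noun. Returns (semantics, remaining tokens).

    The semantics is a function from a scope (entity -> formula) to a formula.
    """
    if not tokens:
        raise ValueError("expected a noun phrase")
    head, rest = tokens[0], tokens[1:]
    if head in PROPER_NOUNS:
        return (lambda scope: scope(head)), rest
    if head in ("a", "every") and rest and rest[0] in NOUNS:
        noun, var = rest[0], "?" + rest[0][0]
        if head == "every":
            return (lambda scope: f"forall {var} ({noun}({var}) => {scope(var)})"), rest[1:]
        return (lambda scope: f"exists {var} ({noun}({var}) and {scope(var)})"), rest[1:]
    raise ValueError(f"cannot parse noun phrase at: {tokens}")


def parse_vp(tokens):
    """VP -> 'is' 'located' 'in' NP. Returns (entity -> formula, remaining tokens)."""
    if tokens[:3] != ["is", "located", "in"]:
        raise ValueError(f"cannot parse verb phrase at: {tokens}")
    obj, rest = parse_np(tokens[3:])
    # Containing place first, matching located-in(negev, beersheba) in the fact base.
    return (lambda subj: obj(lambda o: f"located-in({o}, {subj})")), rest


def parse_sentence(text):
    """S -> NP VP. Returns the logical form as a string."""
    tokens = text.lower().rstrip(".?").split()
    subj, rest = parse_np(tokens)
    vp, rest = parse_vp(rest)
    if rest:
        raise ValueError(f"unparsed tokens: {rest}")
    return subj(vp)


print(parse_sentence("Beersheba is located in Negev"))
print(parse_sentence("Every city is located in a region"))
```

Variables are named after the first letter of the noun, so a sentence that quantifies twice over the same noun would need fresh variable names. The sketch leaves that out.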

Putting Things Together: Question/Answering System

Given the domain knowledge and a corresponding semantic analyzer, we can implement a question/answering system in which a user can ask questions about the knowledge base and assert new facts when desired:
    Q> How many cities are there in Israel?
    125
    Q> What are the cities in Israel?
    ...
    Q> The population of Beersheba is 252000
    OK
    Q> How many people live in Beersheba?
    252000
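The interaction above can be mimicked with a very small Python sketch. As a loudly stated simplification, it uses regular-expression patterns as a stand-in for the DCG parser and semantic analyzer, and two dictionaries as a stand-in for the Horn knowledge base. The class name GeoQA, the patterns, and the toy data (three cities, so counts differ from the session above) are all assumptions made for illustration.

```python
import re


class GeoQA:
    """Toy question/answering loop: pattern matching stands in for semantic parsing."""

    def __init__(self):
        # Toy data only; a real system would query the knowledge base.
        self.population = {"beersheba": 200000}
        self.cities_in = {"israel": {"beersheba", "jerusalem", "yeruham"}}

    def ask(self, text):
        s = text.strip().rstrip("?.").lower()
        # Assertion: update the stored population.
        m = re.fullmatch(r"the population of (\w+) is (\d+)", s)
        if m:
            self.population[m.group(1)] = int(m.group(2))
            return "OK"
        # Population questions.
        m = re.fullmatch(r"(?:how many people live in|what is the population of) (\w+)", s)
        if m:
            return str(self.population.get(m.group(1), "unknown"))
        # How-many question over cities.
        m = re.fullmatch(r"how many cities are there in (\w+)", s)
        if m:
            return str(len(self.cities_in.get(m.group(1), ())))
        # Wh-question listing cities.
        m = re.fullmatch(r"what are the cities in (\w+)", s)
        if m:
            return ", ".join(sorted(self.cities_in.get(m.group(1), ())))
        return "I do not understand"


qa = GeoQA()
for line in ["The population of Beersheba is 252000",
             "How many people live in Beersheba?"]:
    print("Q>", line)
    print(qa.ask(line))
```

Replacing the patterns with a parser that produces logical forms, and the dictionaries with queries against the knowledge base, gives the architecture described in this section.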
    



    Last modified Jan 10th, 2013