Advanced Methods in Programming Languages (201-2-4411-01)
Notes Week 9 - Michael Elhadad
previous class main page next class

Typed Languages

EOPL Chap4: "Some operations are appropriate on some types of values, and others are not. An attempt to apply an operation to inappropriate data is called a type error."

We now build analyzers to determine whether programs can produce type errors at runtime. The analyzers are built in a manner similar to the interpreters built previously.

To define a typed language, 2 steps must be applied:

  1. Define a set of types, and decision procedures to determine that an expressed value is of type t.
  2. Define an analysis procedure to assign a type to each expression in a program and verify that the program is "well-typed" - that is, that there are no expressions which would lead to a type error when evaluated. The condition is that given an expression e, the analyzer assigns type t to e if when evaluated, e returns an expressed value of type t.
There are several design decisions associated to the introduction of types in a language:
  1. What are the types
  2. Can a value have more than one type
  3. Can the type of a value be determined at runtime
  4. How is data abstraction supported - that is, the possibility to create new types which are distinct from existing types
  5. Do operations check the type validity of their arguments at runtime - or can type checking verify all types at compile (analysis) time

Type Expressions

A static type checker analyzes a program, assigns types to all expressions, and verifies that all expression can only be evaluated without causing type errors. The checker returns the type of the program or raises an error.

We distinguish between type expressions (syntactic representations of types) and types. This distinction is similar to the distinction between procedure (a syntactic expression) and closures (the object built at runtime to represent the value of a procedure).

<type-exp> ::= int  / int-type-exp()
<type-exp> ::= bool / bool-type-exp()
<type-exp> ::= ({<type-exp>}* -> <type-exp>) / proc-type-exp (arg-texps result-texp)

Mapping Expressed Values to Types

Expressed values in our language are:
  1. integer
  2. boolean
  3. closures (Procval)
The mapping of expressed values to types is the following:
  1. If v is an integer, then v is of type int
  2. If v is a boolean, then v is of type bool
  3. If v is a ProcVal that expects n arguments, and when provided arguments of types t1...tn, it returns an expression of type t, then v is of type (t1 * ... * tn -> t)

Mapping Expressions to Types

A type error can occur in the following conditions:
  1. Apply an integer or a boolean as a procedure to any arguments
  2. Apply a procedure to the wrong number of arguments
  3. Apply a primitive to the wrong types of arguments
  4. Use a non-boolean as the test in an if-expression
Other errors are not covered here because they could not be detected at analysis time.

To support type checking, we introduce a type environment (tenv) mapping each variable to a type.

The procedure type-of-expression computes the type of an expression. It is defined inductively on the syntax of the expressions.

The induction rules are:

(type-of-expression "n" tenv) = int
(type-of-expression "id" tenv) = (apply-env tenv id)
For applications (either of primitives or of defined procedures), the type computation is conditional:
IF   (type-of-expression "rator" tenv) = (t1 * t2 * ... tn -> t)
AND  (type-of-expression "rand1" tenv) = t1
...
AND  (type-of-expression "randn" tenv) = tn
THEN (type-of-expression "(rator rand1 ... randn)" tenv) = t
For an if-expression the rule is:
IF   (type-of-expression "test-exp" tenv) = bool
AND  (type-of-expression "then-exp" tenv) = t
AND  (type-of-expression "else-exp" tenv) = t
THEN (type-of-expression "(if test-exp then-exp else-exp)" tenv) = t
Attempting to type check procedure expressions, we understand that our language must be extended to support type checking: Consider an expression (proc (x1...xn) body). The type of this procedure should be (t1 * ... * tn -> t) where t is the type of the body as computed in the extended type environment where x1...xn are bound to the types t1...tn. The problem is that we do not know what the types t1...tn should be when reaching the proc expression in a top-down traversal of the expression.

There exist 2 methods to solve this problem:

  1. Type checking: we require the programmer to provide the missing type information. The proc expression is then modified to be proc (t1 x1, ..., tn xn) body. The type checker must then determine that the actual parameters of an application are properly typed given the type of the procedure.
  2. Type inference: the type system attempts to guess the type of the parameters based on the usage made of the parameters in the body of the procedure in primitive functions.

A Type Checker

The type checker is implemented in this file. The structure of the type checker is extremely similar to that of an interpreter - but instead of working on the value of expressions, we work on partial information about the type of each sub-expression in the typed expression. This is an example of a general method called abstract interpretation.
Last modified May 18, 2003 Michael Elhadad