Advanced Methods in Programming Languages (201-2-4411-01)
Notes Week 9 - Michael Elhadad
Typed Languages
EOPL Chap4: "Some operations are appropriate on some types of values, and others are not.
An attempt to apply an operation to inappropriate data is called a type error."
We now build analyzers to determine whether programs can produce type errors at runtime.
The analyzers are built in a manner similar to the interpreters built previously.
To define a typed language, 2 steps must be applied:
- Define a set of types, and decision procedures to determine that an expressed value is of type t.
- Define an analysis procedure to assign a type to each expression in a program and verify that
the program is "well-typed" - that is, that there are no expressions which would lead to a type
error when evaluated. The condition is that given an expression e, the analyzer assigns type t to e if
when evaluated, e returns an expressed value of type t.
-
There are several design decisions associated to the introduction of types in a language:
- What are the types
- Can a value have more than one type
- Can the type of a value be determined at runtime
- How is data abstraction supported - that is,
the possibility to create new types which are distinct from existing types
- Do operations check the type validity of their arguments at runtime - or can
type checking verify all types at compile (analysis) time
Type Expressions
A static type checker analyzes a program, assigns types to all expressions, and verifies
that all expression can only be evaluated without causing type errors. The checker returns the
type of the program or raises an error.
We distinguish between type expressions (syntactic representations of types) and types.
This distinction is similar to the distinction between procedure (a syntactic expression)
and closures (the object built at runtime to represent the value of a procedure).
<type-exp> ::= int / int-type-exp()
<type-exp> ::= bool / bool-type-exp()
<type-exp> ::= ({<type-exp>}* -> <type-exp>) / proc-type-exp (arg-texps result-texp)
Mapping Expressed Values to Types
Expressed values in our language are:
- integer
- boolean
- closures (Procval)
The mapping of expressed values to types is the following:
- If v is an integer, then v is of type int
- If v is a boolean, then v is of type bool
- If v is a ProcVal that expects n arguments, and when provided arguments of types t1...tn,
it returns an expression of type t, then v is of type (t1 * ... * tn -> t)
Mapping Expressions to Types
A type error can occur in the following conditions:
- Apply an integer or a boolean as a procedure to any arguments
- Apply a procedure to the wrong number of arguments
- Apply a primitive to the wrong types of arguments
- Use a non-boolean as the test in an if-expression
Other errors are not covered here because they could not be detected at analysis time.
To support type checking, we introduce a type environment (tenv) mapping each
variable to a type.
The procedure type-of-expression computes the type of an expression. It is defined
inductively on the syntax of the expressions.
The induction rules are:
(type-of-expression "n" tenv) = int
(type-of-expression "id" tenv) = (apply-env tenv id)
For applications (either of primitives or of defined procedures), the type
computation is conditional:
IF (type-of-expression "rator" tenv) = (t1 * t2 * ... tn -> t)
AND (type-of-expression "rand1" tenv) = t1
...
AND (type-of-expression "randn" tenv) = tn
THEN (type-of-expression "(rator rand1 ... randn)" tenv) = t
For an if-expression the rule is:
IF (type-of-expression "test-exp" tenv) = bool
AND (type-of-expression "then-exp" tenv) = t
AND (type-of-expression "else-exp" tenv) = t
THEN (type-of-expression "(if test-exp then-exp else-exp)" tenv) = t
Attempting to type check procedure expressions, we understand that our language
must be extended to support type checking: Consider an expression
(proc (x1...xn) body). The type of this procedure should be
(t1 * ... * tn -> t) where t is the type of the body as computed in the
extended type environment where x1...xn are bound to the types t1...tn.
The problem is that we do not know what the types t1...tn should be when
reaching the proc expression in a top-down traversal of the expression.
There exist 2 methods to solve this problem:
- Type checking: we require the programmer to provide the missing type information.
The proc expression is then modified to be proc (t1 x1, ..., tn xn) body.
The type checker must then determine that the actual parameters of an application
are properly typed given the type of the procedure.
- Type inference: the type system attempts to guess the type of the parameters
based on the usage made of the parameters in the body of the procedure in
primitive functions.
A Type Checker
The type checker is implemented in
this file.
The structure of the type checker is extremely similar to that of an interpreter - but instead
of working on the value of expressions, we work on partial information about the type of each
sub-expression in the typed expression. This is an example of a general method called
abstract interpretation.
Last modified May 18, 2003
Michael Elhadad