Types and Type Systems

March 1, 2024
All notes are expanded based on

Why Data Types?

Data types play a key role in:

$Data abstraction$ in the design of programs
$Type checking$ in the analysis of programs
$Compile-time code generation$ in the translation and execution of programs
Data layout (how many words; which are data and which are pointers) dictated by type

Terminology

Type: A $type t$ defines a set of possible data values

E.g. $short$ in C is $\{x \mid 2^{15} - 1 \geq x \geq -2^{15}\}$
A value in this set is said to have type $t$

Type system: rules of a language assigning types to expressions

Types as Specifications

Types describe properties
Different type systems describe different properties, e.g.

Data is read-write versus read-only
Operation has authority to access data
Data came from right source
Operation may or cannot raise an exception

Common type systems focus on types that describe data with the same layout and access methods

Sound Type System

If an expression is assigned type $t$, and it evaluates to a value $v$, then $v$ is in the set of values defined by $t$
SML, OCaml, Scheme, and Ada have sound type systems
Most implementations of C and C++ do not

Strongly Typed Language

When no application of an operator to arguments can lead to a run-time type error, the language is $strongly typed$

E.g. 1 + 2.3;;

Depends on definition of type error
C++ claimed to be strongly typed, but

Union types allow creating a value at one type and using it at another
Type coercions may cause unexpected (undesirable) effects
No array bounds check (in fact, no runtime checks at all)

SML and OCaml are strongly typed but must still perform dynamic array bounds checks, run-time type case analysis, and other checks
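The two points above (the `1 + 2.3` example and the remaining dynamic checks) can be sketched in OCaml as follows. This is a minimal illustration, not taken from the original notes; `safe_get` is a hypothetical helper name introduced only for this example.

```ocaml
(* OCaml keeps int and float arithmetic apart, so the line below is
   rejected at compile time and is therefore left commented out:

     let bad = 1 + 2.3

   (+) has type int -> int -> int, but 2.3 is a float. *)

(* An explicit conversion plus the float operator (+.) type-checks. *)
let sum : float = float_of_int 1 +. 2.3

(* Array bounds are not part of the type, so this check happens at run time.
   safe_get is a hypothetical helper, not a standard library function. *)
let safe_get (a : int array) (i : int) : int option =
  try Some a.(i) with Invalid_argument _ -> None

let () =
  let a = [| 10; 20; 30 |] in
  assert (sum > 3.29 && sum < 3.31);
  assert (safe_get a 1 = Some 20);
  assert (safe_get a 5 = None)
```

The ill-typed addition never reaches execution, while the out-of-range index is only caught when `a.(i)` actually runs.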

Static vs Dynamic Types

$Static type :$ type assigned to an expression at compile time
$Dynamic type :$ type assigned to a storage location at run time
$Statically typed language :$ static type assigned to every expression at compile time
$Dynamically typed language :$ type of an expression determined at run time

Type Checking

When is op(arg1, …, argn) allowed?
$Type checking$ assures that operations are applied to the right number of arguments of the right types

Right type may mean same type as was specified, or may mean that there is a predefined implicit coercion that will be applied

Used to resolve overloaded operations
Type checking may be done $statically$ at compile time or $dynamically$ at run time
Dynamically typed (aka untyped) languages (e.g. LISP, Prolog) do only dynamic type checking
Statically typed languages can do most type checking statically

Dynamic Type Checking

Performed at run-time before each operation is applied
Types of variables and operations left unspecified until run-time

Same variable may be used at different types

Data object must contain type information
Errors aren’t detected until the violating application is executed (maybe years after the code was written)
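One way to picture dynamic type checking is to model it inside OCaml: each value carries a tag, and every operation inspects the tags before it runs. The sketch below is an assumption-laden model, not how any particular dynamically typed language is implemented; the names `value`, `Type_error`, `add`, and `cell` are hypothetical.

```ocaml
(* Every run-time value carries a tag naming its dynamic type. *)
type value =
  | Int of int
  | Float of float
  | Str of string

exception Type_error of string

(* The check runs each time the operation is applied, not before. *)
let add (a : value) (b : value) : value =
  match a, b with
  | Int x, Int y -> Int (x + y)
  | Float x, Float y -> Float (x +. y)
  | _ -> raise (Type_error "add: operands must both be Int or both be Float")

(* The same variable may hold values of different dynamic types. *)
let cell : value ref = ref (Int 1)

let () =
  assert (add !cell (Int 2) = Int 3);
  cell := Str "one";
  (* The ill-typed application is only detected when it is executed. *)
  let failed =
    try ignore (add !cell (Int 2)); false
    with Type_error _ -> true
  in
  assert failed
```

Nothing flags the second call to `add` until control reaches it, which mirrors the late error detection described above.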

Static Type Checking

Performed after parsing, before code generation
Type of every variable and signature of every operator must be known at compile time
Can eliminate need to store type information in data object if no dynamic type checking is needed
Catches many programming errors at the earliest point
Can’t check types that depend on dynamically computed values

E.g. array bounds

Typically places restrictions on languages

Garbage collection
References instead of pointers
All variables initialized when created
Variable only used at one type

Union types allow for work-arounds, but effectively introduce dynamic type checks
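As a rough illustration of the last point, an OCaml variant can stand in for a union type. The static checker accepts the code, yet the `match` is a tag test performed at run time. The type `number` and the functions `to_float` and `total` are hypothetical names for this sketch.

```ocaml
(* A variant plays the role of a union: a number is an int or a float. *)
type number =
  | I of int
  | F of float

(* The match is a run-time tag test; the static checker only guarantees
   that every case is handled. *)
let to_float (n : number) : float =
  match n with
  | I i -> float_of_int i
  | F x -> x

(* A list is statically typed as number list yet mixes both shapes. *)
let total (ns : number list) : float =
  List.fold_left (fun acc n -> acc +. to_float n) 0.0 ns

let () = assert (total [ I 1; F 2.5; I 3 ] = 6.5)
```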

Type Declarations

$Type declarations :$ explicit assignment of types to variables (signatures to functions) in the code of a program

Must be checked in a strongly typed language
Often not necessary for strong typing or even static typing (depends on the type system)

Type Inference

$Type inference :$ A program analysis to assign a type to an expression from the program context of the expression

Fully static type inference first introduced by Robin Milner in ML
Haskell, OCaml, SML all use type inference

Records are a problem for type inference
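A small OCaml sketch of declarations, inference, and the record difficulty. The record types `celsius` and `kelvin` and all function names are made up for illustration. The behavior shown for `read` relies on OCaml resolving an otherwise unconstrained field name to the most recently defined record type.

```ocaml
(* No annotations: the use of (+) forces x and y to be int, so the
   compiler infers  add : int -> int -> int. *)
let add x y = x + y

(* The same function with an explicit declaration, which the compiler checks. *)
let add_declared (x : int) (y : int) : int = x + y

(* Two record types sharing a field name. *)
type celsius = { degrees : int }
type kelvin = { degrees : float }

(* With nothing else to go on, OCaml resolves r.degrees to the most recently
   defined record type, so this is inferred as  kelvin -> float. *)
let read r = r.degrees

(* An annotation is needed to select the earlier record type. *)
let read_celsius (r : celsius) : int = r.degrees

let () =
  assert (add 1 2 = add_declared 1 2);
  assert (read { degrees = 273.15 } = 273.15);
  assert (read_celsius ({ degrees = 20 } : celsius) = 20)
```

The field access `r.degrees` alone does not determine the type of `r`, which is one reason records complicate inference and a declaration becomes useful.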

Format of Type Judgments

A $type judgment$ has the form

Γ ⊢ exp : τ

$Γ$ is a typing environment

Supplies the types of variables (and function names when function names are not variables)
$Γ$ is a set of the form $\{x : σ, \ldots\}$
For any $x$ there is at most one $σ$ such that $(x : σ \in Γ)$

$exp$ is a program expression
$τ$ is a type to be assigned to $exp$
$⊢$ is pronounced “turnstile”, or “entails” (or satisfies or, informally, shows)

Axioms - Constants (Monomorphic)

$\frac{}{Γ ⊢ n : int}$ (assuming $n$ is an integer constant)

$\frac{}{Γ ⊢ true : bool} \quad \frac{}{Γ ⊢ false : bool}$

These rules are true with any typing environment
$Γ, n$ are meta-variables

Axioms - Variables (Monomorphic Rule)

Notation: Let $Γ(x) = σ$ if $x : σ \in Γ$. Note: if such a $σ$ exists, it is unique

Variable axiom:

$\frac{}{Γ ⊢ x : σ}$ if $Γ(x) = σ$
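The environment and the variable axiom can be sketched in OCaml. Representing $Γ$ as an association list is an assumption made for this example, and `ty`, `env`, and `lookup` are hypothetical names.

```ocaml
(* Types of the small monomorphic language. *)
type ty = TInt | TBool

(* A typing environment as an association list of (variable, type) pairs.
   This representation is an assumption; lookup returns the first binding
   for x, so each variable has at most one visible type. *)
type env = (string * ty) list

(* lookup gamma x returns Some sigma when x : sigma is in gamma,
   and None when x is unbound. *)
let lookup (gamma : env) (x : string) : ty option = List.assoc_opt x gamma

let () =
  let gamma = [ ("x", TInt); ("b", TBool) ] in
  assert (lookup gamma "x" = Some TInt);
  assert (lookup gamma "z" = None)
```

The variable axiom applies exactly when `lookup` returns `Some σ`; an unbound variable has no derivation.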

Simple Rules - Arithmetic (Mono)

Primitive binary operators $(\oplus \in \{+, -, *, \ldots\})$:

$\frac{Γ ⊢ e_{1} : τ_{1} \quad Γ ⊢ e_{2} : τ_{2} \quad (\oplus) : τ_{1} \to τ_{2} \to τ_{3}}{Γ ⊢ e_{1} \oplus e_{2} : τ_{3}}$

Special case: Relations $(\sim \in \{<, >, =, <=, >=\})$:

$\frac{Γ ⊢ e_{1} : τ \quad Γ ⊢ e_{2} : τ \quad (\sim) : τ \to τ \to bool}{Γ ⊢ e_{1} \sim e_{2} : bool}$

For the moment, think of $τ$ as $int$

Example: $\{x : int\} ⊢ x + 2 = 3 : bool$

What do we need to show first?

What do we need for the left and right sides?

How to finish?


Complete Proof (type derivation)

$Bin \frac{Bin \frac{Var \frac{}{\{x : int\} ⊢ x : int} \quad Const \frac{}{\{x : int\} ⊢ 2 : int}}{\{x : int\} ⊢ x + 2 : int} \quad Const \frac{}{\{x : int\} ⊢ 3 : int}}{\{x : int\} ⊢ x + 2 = 3 : bool}$

Simple Rules - Booleans

Connectives

$\frac{Γ ⊢ e_{1} : bool \quad Γ ⊢ e_{2} : bool}{Γ ⊢ e_{1} \,\&\&\, e_{2} : bool}$

$\frac{Γ ⊢ e_{1} : bool \quad Γ ⊢ e_{2} : bool}{Γ ⊢ e_{1} \,||\, e_{2} : bool}$

Type Variables in Rules

If-then-else rule:

$\frac{Γ ⊢ e_{1} : bool \quad Γ ⊢ e_{2} : τ \quad Γ ⊢ e_{3} : τ}{Γ ⊢ (if\ e_{1}\ then\ e_{2}\ else\ e_{3}) : τ}$

$τ$ is a type variable (meta-variable)
Can take any type at all
All instances in a rule application must get same type
Then branch, else branch and if-then-else must all have same type
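The rules so far (constants, variables, arithmetic, relations, connectives, and if-then-else) can be read as a recursive function over expressions. The following OCaml sketch is one possible reading, not a definitive implementation. It assumes arithmetic and relational operators are fixed at $int$, following the earlier remark to think of $τ$ as $int$, and it collapses each operator family into one constructor. The names `exp`, `infer`, and the constructors are hypothetical.

```ocaml
type ty = TInt | TBool

type exp =
  | Num of int
  | Bool of bool
  | Var of string
  | Arith of exp * exp   (* any arithmetic operator, at int -> int -> int *)
  | Rel of exp * exp     (* any relation, at int -> int -> bool *)
  | And of exp * exp
  | Or of exp * exp
  | If of exp * exp * exp

type env = (string * ty) list

(* infer gamma e returns Some t when the judgment gamma |- e : t
   is derivable with the rules above, and None otherwise. *)
let rec infer (gamma : env) (e : exp) : ty option =
  match e with
  | Num _ -> Some TInt
  | Bool _ -> Some TBool
  | Var x -> List.assoc_opt x gamma
  | Arith (e1, e2) ->
    (match infer gamma e1, infer gamma e2 with
     | Some TInt, Some TInt -> Some TInt
     | _ -> None)
  | Rel (e1, e2) ->
    (match infer gamma e1, infer gamma e2 with
     | Some TInt, Some TInt -> Some TBool
     | _ -> None)
  | And (e1, e2) | Or (e1, e2) ->
    (match infer gamma e1, infer gamma e2 with
     | Some TBool, Some TBool -> Some TBool
     | _ -> None)
  | If (e1, e2, e3) ->
    (* Both branches must receive the same type, which is also the
       type of the whole conditional. *)
    (match infer gamma e1, infer gamma e2, infer gamma e3 with
     | Some TBool, Some t2, Some t3 when t2 = t3 -> Some t2
     | _ -> None)

let () =
  let gamma = [ ("x", TInt) ] in
  (* {x : int} |- x + 2 = 3 : bool *)
  assert (infer gamma (Rel (Arith (Var "x", Num 2), Num 3)) = Some TBool);
  (* Branches of different types have no derivation. *)
  assert (infer gamma (If (Bool true, Num 1, Bool false)) = None)
```

Each `match` arm corresponds to one rule: the recursive calls are the premises, and the returned type is the type in the conclusion.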

Example derivation: if-then-else:

$Γ = \{x : int,\ int\_of\_float : float \to int,\ y : float\}$