Types and Type Systems
Why Data Types?
- Data types play a key role in:
- \(\textcolor{blue}{\text{Data abstraction}}\) in the design of programs
- \(\textcolor{blue}{\text{Type checking}}\) in the analysis of programs
- \(\textcolor{blue}{\text{Compile-time code generation}}\) in the translation and execution of programs
- Data layout (how many words; which are data and which are pointers) dictated by type
Terminology
- Type: A \(\textcolor{blue}{\text{type}} \:\: \textcolor{red}{\text{t}}\) defines a set of ppssible data values
- E.g. \(\textcolor{red}{\text{short}}\) in C is \(\{x \mid 2^{15} - 1 \geq x \geq -2^{15}\}\)
- A value in this set is said to have type \(\textcolor{blue}{t}\)
- Type system: rules of a language assigning types to expressions
Types as Specifications
- Types describe properties
- Different type systems describe different properties, e.g.
- Data is read write-versus read-only
- Operation has authority to access data
- Data came from right source
- Operation might or could not raise an exception
- Common type systems focus on types describing same data layout and access methods
Sound Type System
- If an expression is assigned type \(\textcolor{red}{\text{t}}\), and it evaluates toa value \(\textcolor{red}{\text{v}}\), then \(\textcolor{red}{\text{v}}\) is in the set of values defined by \(\textcolor{red}{\text{t}}\)
- SML, OCAML, Scheme and Ada have sound type systems
- Most implementations of C and C++ do not
Strongly Typed Language
- When no appliaction of an operator to arguments can lead to a run-time type error, langauge is \(\textcolor{red}{\text{stronly typed}}\)
- E.g. 1 + 2.3;;
- Depends on definition of type error
- C++ claimed to be strongly typed, but
- Union types allow creating a value at one type and using it at another
- Type coercions may cause unexpected (underirable) effects
- No array bounds check (in fact, no runtime checks at all)
- SML, OCAML strongly type but still must do dynamic array bounds checks, runtime type case analysis, and other checks
Static vs Dynamic Types
- \(\textcolor{red}{\text{Static type}}:\) type assigned to an expression at compile time
- \(\textcolor{red}{\text{Dynamic type}}:\) type assigned to a storage location at run time
- \(\textcolor{red}{\text{Statically typed language}}:\) static type assigned to every expression at compile time
- \(\textcolor{red}{\text{Dynamically typed language}}:\) type of an expression determined ar run time
Type Checking
- When is op(arg1, …, argn) allowed?
- \(\textcolor{red}{\text{Type checking}}\) assures that operations are applied to the right number of arguments of the right types
- Right type may mean same type as was specified, or may mean that there is a predefined implicit coercion that will be applied
- Used to resolve overloaded operations
- Type checking may be done \(\textcolor{red}{\text{statically}}\) at compile time or \(\textcolor{red}{\text{dynimically}}\) at run time
- Dynamically typed (aka untyped) languages (e.g. LISP, Prolog) do only dynamic type checking
- Statically typed languages can do most typed checking statically
Dynamic Type Checking
- Performed at run-time before each operation is applied
- Types of variables and operations left unspecified until run-time
- Same variable may be used at different types
- Data object must contain type information
- Errors aren’t detected until violating application is execurted (maybe years after the code was written)
Static Type Checking
- Performed after parsing, before code generation
- Type of every variable and signature of every operator must be known at compile time
- Can eliminate need to store type information in data object if no dynamic type checking is needed
- Catches many programming arrors at earliest point
- Can’t check types that depend on dynamically computed values
- E.g. array bounds
- Typically places restriction on languages
- Garbage collection
- References instead of pointers
- All variables initialized when created
- Variable only used at one type
- Union types allow for work-arounds, but effectively introduce dynamic type checks
Type Declarations
- \(\textcolor{red}{\text{Type declarations}}:\) explicit assigment of types to variables (signatures to functios) in the code of a program
- Must be checked in a strongly typed language
- Often not necessary for strong typing or even static typing (depends on the type system)
Type Inference
- \(\textcolor{red}{\text{Type inference}}:\) A program analysis to assign a type to an expression from the program cobtext of the expression
- Fully static type inference first introduced by Robin Miller in ML
- Haskle, OCAML, SML all use type inference
- Records are a problem for type inference
Format of Type Judgments
- A \(\textcolor{red}{\text{type judgement}}:\) has the form \[ \textcolor{blue}{\Gamma \vdash \text{ exp } : \tau}\]
- \(\textcolor{blue}{\Gamma }\) is typing environment
- Supplies the types of variables (and function names when function names are not variables)
- \(\textcolor{blue}{\Gamma }\) is a set of the form \(\textcolor{blue}{\{x: \sigma, ...\} }\)
- For any \(\textcolor{blue}{x}\) at most on \(\textcolor{blue}{\sigma}\) such that \(\textcolor{blue}{(x : \sigma \in \Gamma)}\)
- \(\textcolor{blue}{\text{exp}}\) is a program expression
- \(\textcolor{blue}{\tau}\) is a type to be assigned to \(\textcolor{blue}{\text{exp}}\)
- \(\textcolor{blue}{\vdash}\) pronounced “turnstyle”, or “entails” (or satisfies or, informally, shows)
Axioms - Constants (Monomorphic)
\(\textcolor{blue}{\Large\frac{}{\Gamma \: \vdash \: n \: : \: int}}\) (assuming \(\textcolor{blue}{n}\) is an integer constant)
\[ \textcolor{blue}{\frac{}{\Gamma \: \vdash \: \text{ true } \: : \: \text{ bool }}\qquad\qquad \frac{}{\Gamma \: \vdash \: \text{ false } \: : \: \text{ bool }}} \]
- These rules are true with any typing environmnet
- \(\textcolor{blue}{\Gamma, \: n}\) are meta-variables
Axioms - Variables (Monomorphic Rule)
Notation: Let \(\textcolor{blue}{\Gamma(x) = \sigma}\) if \(\textcolor{blue}{x : \sigma \in \Gamma}\) Note: if such \(\textcolor{blue}{\sigma}\) exits, its unique
Variable axiom:
\[ \textcolor{blue}{\frac{}{\Gamma \: \vdash \: x \: : \: \sigma}} \qquad \qquad \text{ if } \textcolor{blue}{\Gamma(x) = \sigma}\]
Simple Rules - Arithmetic (Mono)
Primitive Binary operators \((\textcolor{blue}{\oplus \in \{+, -, *, ...\}}):\)
\[\textcolor{blue}{\large\frac{ \Gamma \: \vdash \: e_1 \: : \: \tau_1 \quad \Gamma \: \vdash \: e_2 \: : \: \tau_2 \quad (\oplus): \: \tau_1 \: \rightarrow \: \tau_2 \: \rightarrow \: \tau_3 }{\Gamma \: \vdash \: e_1 \: \oplus \: e_2 \: : \: \tau_3}}\]
Special case: Relations \((\textcolor{blue}{\sim \in \{<, >, =, <=, >=\}})\):
\[\textcolor{blue}{\large\frac{ \Gamma \: \vdash \: e_1 \: : \: \tau \quad \Gamma \: \vdash e_2 \: : \: \tau \quad (\sim) \: : \: \tau \: \rightarrow \: \tau \: \rightarrow \: \text{ bool } }{\Gamma \:\: \vdash \: e_1 \sim e_2 : \text{ bool }}}\]
\[ \text{For the moment, think } \textcolor{blue}{\tau} \text{ is int}\]
Example: \(\scriptsize{\{x : \text{int}\} \vdash x + 2 = 3 : \text{ bool}}\)
What do we need to show first?
What we need fot the left and right side?
How to finish?
And
Complete Proof (type derivation)
\[ \large{\text{Bin}\frac{ \text{Bin}\frac{ \text{Var}\frac{}{\{x \: : \: \text{int} \} \: \vdash \: x \: : \: \text{int}} \quad \text{Const}\frac{}{\{x \: : \: \text{int} \} \: \vdash \: 2 \: : \: \text{int}} }{\{x \: : \: \text{int} \} \: \vdash \: x + 2 \: : \: \text{int}} \quad \text{Const}\frac{}{\{x : \text{int} \} \vdash 3 : \text{int}} }{\{x : \text{int} \} \vdash x + 2 = 3 : \text{int}}} \]
Simple Rules - Booleans
Connectives
\[\textcolor{blue}{\large{\frac{\Gamma \: \vdash \: e_1 \: : \: \text{bool} \quad \Gamma \: \vdash \: e_2 \: : \: \text{bool}}{\Gamma \: \vdash \: e_1 \: \&\& \: e_2 \: : \: \text{ bool}}}}\]
\[\textcolor{blue}{\large{\frac{\Gamma \: \vdash \: e_1 \: : \: \text{bool} \quad \Gamma \: \vdash \: e_2 \: : \: \text{bool}}{\Gamma \: \vdash \: e_1 \: || \: e_2 \: : \: \text{ bool}}}}\]
Type Variables in Rules
- If-then-else rule:
\[\textcolor{blue}{\frac{\Gamma \: \vdash \: e_1 \: :\text{ bool } \quad \Gamma \: \vdash \: e_2 \: : \: \tau \quad \Gamma \: \vdash \: e_3 \: : \: \tau}{\Gamma \: \vdash \: (\text{if } e_1 \text{ then } e_2 \text{ else } e_3) \: \tau}}\]
- \(\tau\) is a type variable (meta-variable)
- Can take any type at all
- All instances in a rule application must get same type
- Then branch, else branch and if-then-else must all have same type
Example derivation: if-then-else:
\(\large{\textcolor{blue}{\Gamma = \{x : \text{int, int\_of\_float : float } \rightarrow \text{ int, y : float }\}}}\)