Lecture 3: Types

Administrivia

Homework Standards

El Capitan breaks Haskell Platform

El Capitan (MacOS X 10.11) breaks the Haskell Platform installation, due to new security features in MacOS X. There is a new version of the Haskell Platform in the works that addresses this issue, I suspect it will be up soon. In the meantime, if you're using a Mac, I'd suggest either

  1. Not upgrading to El Capitan until a new version of the Haskell Platform has been released,or
  2. Running ghc/i on a lab machine remotely via ssh if you've already upgraded.

Types

XKCD has something to say about types.

One of Haskell's greatest strengths is its powerful and robust type system, by which a static analysis of your programs code ensures that the values and expressions you build are used consistently. This said, perhaps the biggest challenge of learning to Haskell is the interaction between the language of function definition and the language of type. Haskell provides considerable support through automatic type inference, but this does not exempt the programmer from the need to understand Haskell's type system, nor from having to annotate some expressions.

A standard programming discipline in Haskell is to provide a type annotation for every top-level definition. This will be our practice henceforth, and you will be expected to do likewise.

A standard notation, both in type theory and in Haskell, is

exp :: typ

This means that expression exp can be given the type typ, and that if the resulting expression is evaluated (with each of its atoms given the corresponding type), then the resulting value will have type typ.

Atomic Types

The atomic types are the basic building blocks of the type system.

Bool

Bool is the boolean type, and it's documented in Data.Bool. There are two boolean constants: True and False. There are two binary boolean operators, (&&) and (||), boolean and and or respectively, and one unary function, not. There is also a boolean constant otherwise, which has the value True, and which is often used as a last catch-all guard. As you might expect, (&&) binds more tightly (at precedence level 3) than (||) (at precedence level 2), and both bind less tightly than the relational (precedence level 4) and arithmetic operators (precedence levels 6-8).

Char

Char is the type for unicode characters, which are typically represented via character literals, e.g.,

a :: Char a = 'a' clubs :: Char clubs = '♣' -- here's a good test of your browser configuration

The Char class is documented in Data.Char, which contains a number of useful functions, including chr and ord which convert between Char and Int.

Integral Types

There are a number of integral types, but the two most important are Int and Integer. The Int type will be a fixed precision signed integer type, these days usually 64-bits, i.e., large enough to handle numbers up to positive and negative 9 quadrillion, which is large enough for most practical purposes. For impractical purposes (e.g., the values of combinatorial functions) we'll use the Integer type, which represents so-called infinite precision integers, whose magnitude is limited only by memory.

*Exercise 3.1 Unsurprisingly, the Int type is documented in Data.Int, but surprisingly, there is no Data.Integer. Where is Integer documented? Hint: the documentation comes with an index.

Real Types

Haskell has an atomic types Float (corresponding to IEEE single-precision floating point arithmetic), and Double (corresponding to IEEE double-precisions floating point arithmetic). Unless there is a very good reason for doing otherwise (e.g., as in computer graphics, where space usage is an issue), programmers will usually use Double to represent real numbers.

Non-atomic types

Lists

In Haskell, a list is a sequence of objects of the same type. For example,

a :: [Int] a = [1,2,3]

Here, we're asserting that a has the type of a list of Int, and giving it a specific value.

The Haskell type String is just a synonym for a [Char], i.e.,

type String = [Char]

The type keyword is used to introduce a new name (a synonym) for an existing type, it does not create a new type. For example, in

msg1 :: String msg1 = "You lose." msg2 :: [Char] msg2 = "I win."

the two variables, msg1 and msg2 have the same type. Note here also the use of string literals to describe specific strings. This is much more convenient than

msg3 :: String msg3 = ['Y','o','u',' ','l','o','s','e','.']

but

> msg1 == msg3 True

Note here that the "a" form of list declaration, e.g., [Int] is actually syntactic sugar for [] Int. The later form looks odd, but it's important to know about, as we'll see later.

Tuples

We briefly saw tuples in the last lecture, in the Pythagorean triple example. At first glance, tuples and lists seem pretty straightforward, e.g.,

a :: [Int] a = [1,2,3] b :: (Int,Int,Int) b = (1,2,3)

but these are actually very different concepts. Objects of list type may vary in length, but all of their elements must have exactly the same type. Objects of a tuple type all have exactly the same number of components, and corresponding components must have the same type, but different components might have different types. E.g., consider the following, which describes the numbers of wins and losses by various teams in the American League's Central Division at the end of the 2014 season:

type Standings = [(String,Int,Int)] alc :: Standings alc = [ ("DET",90,72) , ("KC ",89,73) , ("CLE",85,77) , ("CWS",73,89) , ("MIN",70,92) ]

Sadly, my Sox were tied for futility with the Cubs that year, so I don't even have the solace of them having beaten the northsiders. A wasted year on the south side...

Anyway, the definition of the Standings type synonym isn't essential, but it might be useful if we wanted to be able to define ale and alw too. This example illustrates the crucial distinctions between lists and tuples, too, because the American League West has only four teams (rather than the five of the AL East and AL Central Divisions), but that doesn't affect the type because the number of elements in a list doesn't determine it's type. On the other hand, we've represented each team by a 3-tuple, whose components are types String, Int, and Int respectively, and this cannot vary.

Exercise 3.2 We mentioned the unsugared form of list declaration, e.g., [] Int. You'll not be surprised to learn that the tuple declaration above, (String,Int,Int) is also sugared. What's the equivalent unsugared form? Hint: Cf., GHC.Tuple.

Exercise 3.3 Write an unsugared declaration for type Standings above. Hint: You'll need parentheses to indicate grouping.

Functions

A function f from type A to B has type A -> B. For example, we might define

double :: Int -> Int double x = x + x

The arrow -> associates to the right, so

(+) :: Int -> Int -> Int

is shorthand for

(+) :: Int -> (Int -> Int)

So, in an expression like this:

(+) 3 4

Which is really

((+) 3) 4

We have the following types on various subexpressions

(+) :: Int -> Int -> Int {- == Int -> (Int -> Int) -} 3 :: Int (+) 3 :: Int -> Int 4 :: Int (+) 3 4 :: Int

Note that by putting parentheses around an infix operator, we create a prefix operator. This is often useful, and is necessary in the type ascriptions of infix operators.

In general, if we have an application f x, we should have the following type pattern:

f :: A -> B x :: A f x :: B

If you're familiar with formal logic, this looks like modus ponens, and that's no accident. Basic type manipulations correspond precisely to derivations in the intuitionistic propositional calculus, per the Curry-Howard isomorphism. One thing that seems a bit odd is that application associates to the left, while arrow associates to the right. This seems surprising and a bit confusing (and it is), but reason for is that we essentially write application and arrow in reversed directions. This will become instinctive over time.

*Exercise 3.4 What is the type of (^2) in the definition below? What is the type of map?

as :: [Int] as = [1..10] bs = map (^2) as

Algebraic Data Types (ADTs)

Haskell permits the user to extend the type system via algebraic data types. An ADT consists of zero or more data constructors, each of which can be thought of as a box with a fixed number of fields of fixed type, which are used to build values. E.g., consider

data Rational = Rational Integer Integer deriving (Show)

The data keyword introduces the declaration of a new type, in this case Rational. In this case, there is a single alternative, and in it we declare the data constructor Rational, which has a place for two Integer values, the numerator and denominator respectively. Haskell can't be confused by this overloading of the name Rational, because type names are only used in type expressions, and data constructors are never used in type expressions. Note that types and data constructors always begin with a capital letter: this is not a convention, the language requires it. The deriving line allows values of this type to be printed naturally.

We can construct values of this type by using the constructor, e.g.,

> let oneHalf = Rational 2 4 > oneHalf Rational 2 4

We can use pattern matching to build functions that take ADTs as arguments, e.g.,

lowestTerms :: Rational -> Rational lowestTerms (Rational numerator denominator) = Rational (div numerator divisor) (div denominator divisor) where divisor = gcd numerator denominator

whence

> lowestTerms oneHalf Rational 1 2

There are a couple of things to note here. The first is that we use div for division of integral values, the second is the gcd function from the Prelude, which computes the greatest common divisor of two integral values.

Exercise 3.5 The implementation of lowestTerms above doesn't accomplish exactly what might be hoped here, consider,

> let oneHalf = Rational (-2) (-4) > lowestTerms oneHalf Rational (-1) (-2)

This can be fixed, most simply by accounting for the sign of the denominator using the signum function from the Prelude. Provide a corrected version. There's nasty issue in that the Prelude also includes a Rational type which conflicts with this definition. Use the following starting code, which includes additional declarations that enable us to work around this problem:

A somewhat more complicated example is the following complex number type, which allows for both rectangular and polar notation:

data Complex = Rectangular Double Double | Polar Double Double

To define a function that takes a Complex argument, we'll provide an equation for each case, just as we did for lists, e.g.,

magnitude :: Complex -> Double magnitude (Rectangular real imaginary) = sqrt $ real^2 + imaginary^2 magnitude (Polar r theta) = r

whence

> magnitude (Rectangular 3 4) 5.0 > magnitude (Polar 3 pi) 3.0

One use of algebraic data types can be to associate units with scalar quantities in a natural way, e.g., if we're dealing with amount of money, it might be useful to use Int or even Integer values. It matters quite a lot whether this amount is given in dollars, Euro, or some other currency. If our bank is loaning money in Euro, and accepting payment in South Korean Won, on a nominal basis, it's going to crash and burn, much as the Mars Climate Orbiter did when it was sent a command in pound-seconds rather than expected newton-seconds. In such cases, we can use data types like:

data Euro = EuroCent Integer -- 1 Euro = 100 EuroCent data Dollar = Mills Integer -- 1 Dollar = 1000 Mills data Won = Jeun Integer -- 1 Won = 100 Jeon

We will soon learn language features that make this idea much more useful.