+++
title = "`data`"
weight = 1
+++

A `data` definition defines a brand new type, which is different from
every primitive type and every other type defined using a `data`
definition, even if they look structurally similar. The new type defined
by a `data` definition is a "sum of products", or a "union of products".

```
topDefn ::= data typeId {tyVarId } = {summand | }[ derive ]
summand ::= conId {type }
summand ::= conId { { fieldDef ; }}
derive ::= deriving ( { classId , })
fieldDef ::= fieldId :: type
```

The *typeId* is the name of this new type. If any *tyVarId*'s are
present, they are type parameters, making the new type polymorphic. In
each *summand*, the *conId* is called a "constructor". You can think of
constructors as unique tags that identify each summand. Each *conId* is
followed by a specification of the fields in that summand (*i.e.,* the
fields are the "product" within the summand). In the first form of
summand, the fields are identified only by position, so we specify only
their types. In the second form, the fields are named, so we specify the
field names (*fieldId*'s) and their types.

The same constructor name may occur in more than one type. The same
field name can occur in more than one type. The same field name can
occur in more than one summand within the same type, but the type of the
field must be the same in each summand.

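
As an illustration of named fields, the following minimal sketch defines a
type whose two summands share the field name `addr` at the same type. The
package name `PacketExample`, the type `Packet`, its constructors and fields,
and the helpers `req` and `packetAddr` are all hypothetical names introduced
only for this example. The struct-style construction and pattern syntax is
assumed to follow the expression and pattern sections of this guide.

```hs
package PacketExample where

-- Hypothetical type: both summands have a field named addr,
-- and it has the same type (Bit 8) in each, as required.
data Packet = Request  { addr :: Bit 8; payload :: Bit 16 }
            | Response { addr :: Bit 8; ok :: Bool }

-- Build a Request by naming each field.
req :: Packet
req = Request { addr = 4; payload = 255 }

-- Extract the shared field by matching on each constructor.
packetAddr :: Packet -> Bit 8
packetAddr (Request  { addr = a; payload = _ }) = a
packetAddr (Response { addr = a; ok = _ })      = a
```
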
The optional *derive* clause is used as a shorthand to make this new
type an instance of the *classId*'s, instead of using a separate,
full-blown `instance` declaration. This can only be done for certain
predefined *classId*'s: `Bits`, `Eq`, and `Bounded`. The compiler
automatically derives the operations corresponding to those classes
(such as `pack` and `unpack` for the `Bits` class). Type classes,
instances, and `deriving` are described in more detail in sections
[2.1](fixme), [4.5](fixme) and [4.6](fixme).

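
A small sketch of what a *derive* clause provides, assuming a hypothetical
enumeration `Color` in a hypothetical package `DeriveExample`. The functions
`isRed`, `colorBits`, and `lastColor` are illustrative names, not part of any
library. Each one relies on an operation that the compiler derives.

```hs
package DeriveExample where

-- Hypothetical three-way enumeration; the instances are derived.
data Color = Red | Green | Blue
    deriving (Eq, Bits, Bounded)

-- Uses the derived Eq instance.
isRed :: Color -> Bool
isRed c = c == Red

-- Uses the derived Bits instance; three constructors need a 2-bit tag.
colorBits :: Color -> Bit 2
colorBits c = pack c

-- Uses the derived Bounded instance.
lastColor :: Color
lastColor = maxBound
```

Without the `deriving` clause, each of these functions would need a
hand-written `instance` declaration for the corresponding class.
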
To construct a value corresponding to some `data` definition $T$, one
simply applies the constructor to the appropriate number of arguments
(see section [5.3](fixme)); the values of those arguments become
the components/fields of the data structure.

To extract a component/field from such a value, one uses pattern
matching (see section [6](fixme)).

Example:

```hs
data Bool = False | True
```

This is a "trivial" case of a `data` definition. The type is not
polymorphic (no type parameters); there are two summands with
constructors `False` and `True`, and neither constructor has any fields.
It is a 2-way sum of empty products. A value of type `Bool` is either
the value `False` or the value `True`. Definitions like this correspond
to an "enum" definition in C.

Example:

```hs
data Operand = Register (Bit 5)
             | Literal (Bit 22)
             | Indexed (Bit 5) (Bit 5)
```

Here, the first two summands have one field each; the third has two
fields. The fields are positional (no field names). The field of a
`Register` value must have type `Bit 5`. A value of type `Operand` is
either a `Register` containing a 5-bit value, or a `Literal` containing
a 22-bit value, or an `Indexed` containing two 5-bit values.

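
The following sketch shows construction and pattern matching, as described
earlier, applied to `Operand`. The package name `OperandExample` and the
names `r3`, `idx`, and `baseReg` are hypothetical and introduced only for
illustration.

```hs
package OperandExample where

data Operand = Register (Bit 5)
             | Literal (Bit 22)
             | Indexed (Bit 5) (Bit 5)

-- Construct values by applying a constructor to its arguments.
r3 :: Operand
r3 = Register 3

idx :: Operand
idx = Indexed 1 2

-- Extract fields by pattern matching on each constructor.
-- Returns the first 5-bit field when there is one.
baseReg :: Operand -> Maybe (Bit 5)
baseReg (Register r)  = Just r
baseReg (Indexed b _) = Just b
baseReg (Literal _)   = Nothing
```
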
Example:

```hs
data Maybe a = Nothing | Just a
    deriving (Eq, Bits)
```

This is a very useful and commonly used type. Consider a function that,
given a key, looks it up in a table and returns the value associated
with that key. Such a function can return either `Nothing`, if the table
does not contain an entry for the given key, or `Just` $v$, if the table
contains $v$ associated with the key. The type is polymorphic (type
parameter "`a`") because it may be used with lookup functions for
integer tables, string tables, IP address tables, etc., *i.e.,* we do
not want to over-specify the type of the value $v$ at which it may
be used.

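
A minimal sketch of such a lookup, assuming a tiny hard-coded table. The
package name `LookupExample`, the functions `lookupSmall` and `valueOr`, and
the table contents are all hypothetical. The sketch uses the `Maybe` type
from the standard prelude rather than redefining it.

```hs
package LookupExample where

-- Hypothetical two-entry table keyed by a 2-bit value.
lookupSmall :: Bit 2 -> Maybe (Bit 8)
lookupSmall key =
    if key == 0 then Just 17
    else if key == 2 then Just 42
    else Nothing

-- Polymorphic helper: works for any value type a.
-- Returns the stored value, or the given default if there is none.
valueOr :: a -> Maybe a -> a
valueOr _    (Just v) = v
valueOr dflt Nothing  = dflt
```

Because `valueOr` is written against `Maybe a`, the same helper can serve
lookup functions over tables of any value type.
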
Example:

```hs
data Instruction = Immediate { op::Op; rs::Reg; rt::CPUReg; imm::UInt16; }
                 | Jump { op::Op; target::UInt26; }
```

An `Instruction` is either an `Immediate` or a `Jump`. In the former
case, it contains a field called `op` containing a value of type `Op`, a
field called `rs` containing a value of type `Reg`, a field called `rt`
containing a value of type `CPUReg`, and a field called `imm` containing
a value of type `UInt16`. In the latter case, it contains a field called
`op` containing a value of type `Op`, and a field called `target`
containing a value of type `UInt26`.

> **NOTE:**
>
> Error messages involving data type definitions sometimes show traces of
> how they are handled internally. Data type definitions are translated
> into a data type where each constructor has exactly one argument. The
> types above translate to:
>
> ```hs
> data Bool = False PrimUnit | True PrimUnit
>
> data Operand = Register (Bit 5)
>              | Literal (Bit 22)
>              | Indexed Operand_$Indexed
> struct Operand_$Indexed = { _1 :: Bit 5; _2 :: Bit 5 }
>
> data Maybe a = Nothing PrimUnit | Just a
>
> data Instruction = Immediate Instruction_$Immediate
>                  | Jump Instruction_$Jump
>
> struct Instruction_$Immediate = { op::Op; rs::Reg; rt::CPUReg; imm::UInt16; }
> struct Instruction_$Jump = { op::Op; target::UInt26; }
> ```