147 lines
5.2 KiB
Markdown
147 lines
5.2 KiB
Markdown
|
+++
|
|||
|
title = "`data`"
|
|||
|
weight = 1
|
|||
|
+++
|
|||
|
|
|||
|
A `data` definition defines a brand new type, which is different from
|
|||
|
every primitive type and every other type defined using a `data`
|
|||
|
definition, even if they look structurally similar. The new type defined
|
|||
|
by a `data` definition is a "sum of products", or a "union of products".
|
|||
|
|
|||
|
```
|
|||
|
topDefn ::= data typeId {tyVarId } = {summand | }[ derive ]
|
|||
|
summand ::= conId {type }
|
|||
|
summand ::= conId { { fieldDef ; }}
|
|||
|
derive ::= deriving ( { classId , })
|
|||
|
fieldDef ::= fieldId :: type
|
|||
|
```
|
|||
|
|
|||
|
The *typeId* is the name of this new type. If the *tyVarId*'s exist,
|
|||
|
they are type parameters, thereby making this new type polymorphic. In
|
|||
|
each *summand*, the *conId* is called a "constructor". You can think of
|
|||
|
them as unique *tag*'s that identify each summand. Each *conId* is
|
|||
|
followed by a specification for the fields involved in that summand
|
|||
|
(*i.e.,* the fields are the "product" within the summand). In the first
|
|||
|
way of specifying a summand, the fields are just identified by position,
|
|||
|
hence we only specify the types of the fields. In the second way of
|
|||
|
specifying a summand, the fields are named, hence we specify the field
|
|||
|
names (*fieldId*'s) and their types.
|
|||
|
|
|||
|
|
|||
|
The same constructor name may occur in more than one type. The same
|
|||
|
field name can occur in more than one type. The same field name can
|
|||
|
occur in more than one summand within the same type, but the type of the
|
|||
|
field must be the same in each summand.
|
|||
|
|
|||
|
|
|||
|
The optional *derive* clause is used as a shorthand to make this new
|
|||
|
type an instance of the *classId*'s, instead of using a separate,
|
|||
|
full-blown `instance` declaration. This can only be done for certain
|
|||
|
predefined *classId*'s: `Bits`, `Eq`, and `Bounded`. The compiler
|
|||
|
automatically derives the operations corresponding to those classes
|
|||
|
(such as `pack` and `unpack` for the `Bits` class). Type classes,
|
|||
|
instances, and `deriving` are described in more detail in sections
|
|||
|
[2.1](fixme), [4.5](fixme) and [
|
|||
|
4.6](fixme).
|
|||
|
|
|||
|
To construct a value corresponding to some `data` definition $T$, one
|
|||
|
simply applies the constructor to the appropriate number of arguments
|
|||
|
(see section [5.3](fixme){reference-type="ref"
|
|||
|
reference="sec-exprs-constrs"}); the values of those arguments become
|
|||
|
the components/fields of the data structure.
|
|||
|
|
|||
|
|
|||
|
To extract a component/field from such a value, one uses pattern
|
|||
|
matching (see section [6](fixme){reference-type="ref"
|
|||
|
reference="sec-patterns"}).
|
|||
|
|
|||
|
|
|||
|
Example:
|
|||
|
|
|||
|
```hs
|
|||
|
data Bool = False | True
|
|||
|
```
|
|||
|
|
|||
|
|
|||
|
This is a "trivial" case of a `data` definition. The type is not
|
|||
|
polymorphic (no type parameters); there are two summands with
|
|||
|
constructors `False` and `True`, and neither constructor has any fields.
|
|||
|
It is a 2-way sum of empty products. A value of type `Bool` is either
|
|||
|
the value `False` or the value `True` Definitions like these correspond
|
|||
|
to an "enum" definition in C.
|
|||
|
|
|||
|
|
|||
|
Example:
|
|||
|
|
|||
|
```hs
|
|||
|
data Operand = Register (Bit 5)
|
|||
|
| Literal (Bit 22)
|
|||
|
| Indexed (Bit 5) (Bit 5)
|
|||
|
```
|
|||
|
|
|||
|
|
|||
|
Here, the first two summands have one field each; the third has two
|
|||
|
fields. The fields are positional (no field names). The field of a
|
|||
|
`Register` value must have type Bit 5. A value of type `Operand` is
|
|||
|
either a `Register` containing a 5-bit value, or a `Literal` containing
|
|||
|
a 22-bit value, or an `Indexed` containing two 5-bit values.
|
|||
|
|
|||
|
|
|||
|
Example:
|
|||
|
|
|||
|
```hs
|
|||
|
data Maybe a = Nothing | Just a
|
|||
|
deriving (Eq, Bits)
|
|||
|
```
|
|||
|
|
|||
|
This is a very useful and commonly used type. Consider a function that,
|
|||
|
given a key, looks up a table and returns some value associated with
|
|||
|
that key. Such a function can return either `Nothing`, if the table does
|
|||
|
not contain an entry for the given key, of `Just `$v$, if the table
|
|||
|
contains $v$ associated with the key. The type is polymorphic (type
|
|||
|
parameter "`a`") because it may be used with lookup functions for
|
|||
|
integer tables, string tables, IP address tables, etc., *i.e.,* we do
|
|||
|
not want here to over-specify the type of the value $v$ at which it may
|
|||
|
be used.
|
|||
|
|
|||
|
|
|||
|
Example:
|
|||
|
|
|||
|
```hs
|
|||
|
data Instruction = Immediate { op::Op; rs::Reg; rt::CPUReg; imm::UInt16; }
|
|||
|
| Jump { op::Op; target::UInt26; }
|
|||
|
```
|
|||
|
|
|||
|
|
|||
|
An `Instruction` is either an `Immediate` or a `Jump`. In the former
|
|||
|
case, it contains a field called `op` containing a value of type `Op`, a
|
|||
|
field called `rs` containing a value of type `Reg`, a field called `rt`
|
|||
|
containing a value of type `CPUReg`, and a field called `imm` containing
|
|||
|
a value of type `UInt16`. In the latter case, it contains a field called
|
|||
|
`op` containing a value of type `Op`, and a field called `target`
|
|||
|
containing a value of type `UInt26`.
|
|||
|
|
|||
|
> **NOTE:**
|
|||
|
>
|
|||
|
> Error messages involving data type definitions sometimes show traces of
|
|||
|
> how they are handled internally. Data type definitions are translated
|
|||
|
> into a data type where each constructor has exactly one argument. The
|
|||
|
> types above translate to:
|
|||
|
>
|
|||
|
> ```hs
|
|||
|
> data Bool = False PrimUnit | True PrimUnit
|
|||
|
>
|
|||
|
> data Operand = Register (Bit 5)
|
|||
|
> | Literal (Bit 22)
|
|||
|
> | Indexed Operand_$Indexed
|
|||
|
> struct Operand_$Indexed = { _1 :: Reg 5; _2 :: Reg 5 }
|
|||
|
>
|
|||
|
> data Maybe a = Nothing PrimUnit | Just a
|
|||
|
>
|
|||
|
> data Instruction = Immediate Instruction_$Immediate
|
|||
|
> | Register Instruction_$Register
|
|||
|
>
|
|||
|
> struct Instruction_$Immediate = { op::Op; rs::Reg; rt::CPUReg; imm::UInt16; }
|
|||
|
> struct Instruction_$Register = { op::Op; target::UInt26; }
|
|||
|
> ```
|