From: Gabriel Scherer <gabriel.scherer@gmail.com>
To: Goswin von Brederlow <goswin-v-b@web.de>
Cc: caml-list@yquem.inria.fr
Subject: Re: [Caml-list] Wanted: GADT examples: string length, counting module x
Date: Mon, 19 Mar 2012 15:43:38 +0100 [thread overview]
Message-ID: <CAPFanBHcs4V2OHB1VjCvb+xizPbQ7NQfWH_GwAvfNfnQ1wtfLQ@mail.gmail.com> (raw)
In-Reply-To: <87aa3mawg1.fsf@frosties.localnet>
I suspect you're seeing too much into the GADT as they're being added
in OCaml. Your examples are not basically about GADTs, but about
dependent types: you want to encode values (and operations on them) at
the type level to get more expressivity. This is a well-known and
extremely powerful trend in programming languages design, but it also
results in complex systems with sophisticated rules.
GADTs are not as expressive as full dependent types and, I would
argue, their "intended use" is elsewhere. GADTs are plain old
algebraic datatypes that come with equality constraints between types.
By manipulating GADTs you can reason, in a convenient way, about some
types being equal to some other in specific case. The cornerstone
example of GADT is the (a) type of well-typed expressions (a expr),
where the shape of the expression gives you more information about the
type (a) :
type _ expr =
| Int : int -> int expr
| Prod : 'a expr -> 'b expr -> ('a * 'b) expr
| IfThenElse : bool expr -> 'a expr -> 'a expr -> 'a expr
...
With a GADT type, you can write an evaluation function of type ('a
expr -> 'a), for instance, which would not be possible in a type-safe
way with the usual algebraic datatype with a phantom type parameter --
it was already possible to use encodings of GADT into first-class
modules [1], or finally tagless representations [2] to achieve the
same effect, but with more boilerplate and less efficient runtime
representations.
[1] http://okmij.org/ftp/ML/first-class-modules/
[2] http://okmij.org/ftp/tagless-final/index.html
(Remark: a lot of algorithms can be understood as the transformation
of data into code, that is evaluating a specific language of commands.
It is therefore easy to see "a type of well-typed expressions" in
almost every real-life use case.)
On the contrary, your intended applications are not directly about
equality between types, but about value and computations at the type
level. While this can be done with GADTs when extended with enough
machinery (in particular higher-kinded type variables and possibly
higher-kinded datatypes; see the work on Omega [3] for example), in
OCaml that is expected to be heavy, painful to write and read, and
hard on the tooling (type inference, error messages, etc.). At some
point the benefit of finer static checking does not compensate the
overall pain and complexity of the whole process, and I think we enter
the realm of unnecessarily complicated programs.
[3] http://web.cecs.pdx.edu/~sheard/
I'm no expert on the subject, but I doubt GADT will provide you much
more expressivity at the level of which computation you can embed in
types. I suspect you'll mostly use the same techniques as you used
before with phantom types (the question of signature redundancy being
more a technical detail than an indication of expressivity, or lack
thereof, of one or another technique). Unary natural numbers will keep
being easy to express at the type-level (see draft implementation at
the end of this email), and modulo, difference or what not will
probably stay tricky and horrible to define if you want the
*inference* engine to do all computation. There is also this technique
of using term witnesses to do the heavy type lifting and use a
logic-programming style at the type level, but that's probably not
what you're looking for.
A small draft of unary natural numbers mirrored at the type level through GADT.
https://gitorious.org/gasche-snippets/ocaml-typed-units/blobs/master/gadts_and_type_level_numbers.ml
They quickly get painful and impractical. Implementing addition is
already non-trivial -- see Mandelbaum and Stump's 2009 article "GADTs
for the OCaml masses" [4] for a more featureful, but no simpler,
presentation:
[4] https://www.cs.uiowa.edu/~astump/papers/icfp09.pdf
I consider using GADT to emulate dependent types to be an advanced,
and often counter-productive, level of "type hackery". We should
rather seek examples that fit the GADT features well, instead of
trying to stretch it to such unreasonable places. For example,
François Pottier and Nadji Gauthier's "Polymorphic typed
defunctionalization and concretization" article [5] (2004, 2006) uses
GADT to represent functions by (well-typed) data tags, and Alain
Frisch's use of GADTs for dynamic type witnesses [6]. But with such
advanced features, it is of course not easy to draw a line between
"reasonable" and "less reasonable" uses.
[5] http://gallium.inria.fr/~fpottier/publis/fpottier-gauthier-hosc.pdf
[6] http://www.lexifi.com/blog/dynamic-types
On Mon, Mar 12, 2012 at 12:30 PM, Goswin von Brederlow
<goswin-v-b@web.de> wrote:
> Hi,
>
> yesterday I compiled ocaml 3.13 and played around a bit with the GDAT
> syntax but wasn't overly successfull. Or at least I had higher hopes for
> it. So it is time to invoke the internet to come up with a better
> example. :)
>
> 1) How do I write a GADT that encodes the length of a string or array?
> How do I use that to create a string or array?
> How do I specify a function that takes a string or array of a fixed length?
> Bonus: How do I specify a function that takes a string or array of a
> certain length or longer?
>
> 2) How do I write a GADT that counts an int module x? Say for an offset
> into a byte stream to safeguard when access is aligned and when
> unaligned.
> Again with an example that creates a value and a function that uses
> it.
> Bonus: Have one function that only allows aligned access and one that
> picks the right aligned/unaligned function to use depending on
> the type.
>
> Below I've included an example for checking aligned access (1/2/4/8 byte
> aligned). First using GADT and second using old style phantom types. The
> second looks much longer because it includes the signature needed to
> make the type (...) off private. The t1/t2/t4/t8 types are just aliases
> to make the type of the other functions shorter.
>
> One thing I couldn't manage is to write a "bind" function with GADTs or
> bind takeX to a string unless I specify the full type. "takeX s" always
> switches to '_a types and then gets bound to a specific type on the
> first use and fail on the second use.
>
> On the plus side of GADTs is that you do not need a private type (and
> therefore the module signature) to make them work.
>
> MfG
> Goswin
>
> PS: Other simple examples that show the power of GADTs are welcome too.
> ----------------------------------------------------------------------
> (* Declare GADT type *)
> type z
> type u
> type _ t =
> | Zero : ((z * u) * (z * u * u * u) * (z * u * u * u * u * u * u * u)) t
> | Succ : (('a * 'b) * ('c * 'd * 'e * 'f) * ('g * 'h * 'i * 'j * 'k * 'l * 'm
> * 'n)) t -> (('b * 'a) * ('d * 'e * 'f * 'c) * ('h * 'i * 'j * 'k * 'l * 'm * 'n
> * 'g)) t
>
> (* start of stream *)
> let zero = (Zero, 0)
>
> (* advance by 1, 2, 4 or 8 *)
> let succ1 x = Succ x
> let succ2 x = succ1 (succ1 x)
> let succ4 x = succ2 (succ2 x)
> let succ8 x = x
>
> (* take 1, 2, 4 or 8 bytes with alignment restriction *)
> let take1 : type a b c d e f g h i j k l m n. string -> (((a * b) * (c * d * e *
> f) * (g * h * i * j * k * l * m * n)) t * int) -> ((((b * a) * (d * e * f * c)
> * (h * i * j * k * l * m * n * g)) t * int) * string) = fun s (t, x) -> ((succ1
> t, x+1), String.sub s x 1)
> let take2 : type c d e f g h i j k l m n. string -> (((z * u) * (c * d * e * f)
> * (g * h * i * j * k * l * m * n)) t * int) -> ((((z * u) * (e * f * c * d) * (i
> * j * k * l * m * n * g * h)) t * int) * string) = fun s (t, x) -> ((succ2 t, x
> +2), String.sub s x 2)
> let take4 : type g h i j k l m n. string -> (((z * u) * (z * u * u * u) * (g * h
> * i * j * k * l * m * n)) t * int) -> ((((z * u) * (z * u * u * u) * (k * l * m
> * n * g * h * i * j)) t * int) * string) = fun s (t, x) -> ((succ4 t, x+4), Str
> ing.sub s x 4)
> let take8 : string -> (((z * u) * (z * u * u * u) * (z * u * u * u * u * u * u *
> u)) t * int) -> ((((z * u) * (z * u * u * u) * (z * u * u * u * u * u * u * u))
> t * int) * string) = fun s (t, x) -> ((succ8 t, x+8), String.sub s x 8)
>
> (* Test string *)
> let s = "aabbccccdddddddd"
>
> (* extract things from string *)
> let foo () =
> let (off, a) = take1 s zero in
> let (off, b) = take1 s off in
> let (off, c) = take2 s off in
> let (off, d) = take4 s off in
> let (off, e) = take8 s off in
> Printf.printf "%s %s %s %s %s\n" a b c d e
>
> (* using take2/4/8 with an offset that isn't aligned gives a compile
> time type error:
> let bad () =
> let (off, a) = take1 s zero in
> take8 s off
> *)
>
> ----------------------------------------------------------------------
> module M : sig
> (* Types for aligned unaligned tracking *)
> type z
> type u
> (* The type of an offset into a stream *)
> type ('a, 'b, 'c, 'd, 'e, 'f, 'g, 'h, 'i, 'j, 'k, 'l, 'm, 'n) off = private int
> (* Start of the stream *)
> val zero : (z, u, z, u, u, u, z, u, u, u, u, u, u, u) off
> (* Coercion to integer, same as (x :> int) *)
> val get : ('a, 'b, 'c, 'd, 'e, 'f, 'g, 'h, 'i, 'j, 'k, 'l, 'm, 'n) off -> int
> (* Advance the position by 1, 2, 4 or 8 *)
> val succ : ('a, 'b, 'c, 'd, 'e, 'f, 'g, 'h, 'i, 'j, 'k, 'l, 'm, 'n) off -> ('b , 'a, 'd, 'e, 'f, 'c, 'h, 'i, 'j, 'k, 'l, 'm, 'n, 'g) off
> val succ2 : ('a, 'b, 'c, 'd, 'e, 'f, 'g, 'h, 'i, 'j, 'k, 'l, 'm, 'n) off -> ('a , 'b, 'e, 'f, 'c, 'd, 'i, 'j, 'k, 'l, 'm, 'n, 'g, 'h) off
> val succ4 : ('a, 'b, 'c, 'd, 'e, 'f, 'g, 'h, 'i, 'j, 'k, 'l, 'm, 'n) off -> ('a , 'b, 'c, 'd, 'e, 'f, 'k, 'l, 'm, 'n, 'g, 'h, 'i, 'j) off
> val succ8 : ('a, 'b, 'c, 'd, 'e, 'f, 'g, 'h, 'i, 'j, 'k, 'l, 'm, 'n) off -> ('a , 'b, 'c, 'd, 'e, 'f, 'g, 'h, 'i, 'j, 'k, 'l, 'm, 'n) off
> (* Aliases for shorter type names *)
> type ('a, 'b, 'c, 'd, 'e, 'f, 'g, 'h, 'i, 'j, 'k, 'l, 'm, 'n) t1 = ('a, 'b, 'c, 'd, 'e, 'f, 'g, 'h, 'i, 'j, 'k, 'l, 'm, 'n) off -> (('b, 'a, 'd, 'e, 'f, 'c, 'h, 'i, 'j, 'k, 'l, 'm, 'n, 'g) off * string)
> type ('c, 'd, 'e, 'f, 'g, 'h, 'i, 'j, 'k, 'l, 'm, 'n) t2 = (z, u, 'c, 'd, 'e, 'f, 'g, 'h, 'i, 'j, 'k, 'l, 'm, 'n) off -> ((z, u, 'e, 'f, 'c, 'd, 'i, 'j, 'k, 'l, 'm, 'n, 'g, 'h) off * string)
> type ('g, 'h, 'i, 'j, 'k, 'l, 'm, 'n) t4 = (z, u, z, u, u, u, 'g, 'h, 'i, 'j, 'k, 'l, 'm, 'n) off -> ((z, u, z, u, u, u, 'k, 'l, 'm, 'n, 'g, 'h, 'i, 'j) off * string)
> type t8 = (z, u, z, u, u, u, z, u, u, u, u, u, u, u) off -> ((z, u, z, u, u, u, z, u, u, u, u, u, u, u) off * string)
> (* Take 1, 2, 4 or 8 byte with alignment restriction *)
> val take1 : string -> ('a, 'b, 'c, 'd, 'e, 'f, 'g, 'h, 'i, 'j, 'k, 'l, 'm, 'n) t1
> val take2 : string -> ('c, 'd, 'e, 'f, 'g, 'h, 'i, 'j, 'k, 'l, 'm, 'n) t2
> val take4 : string -> ('g, 'h, 'i, 'j, 'k, 'l, 'm, 'n) t4
> val take8 : string -> t8
> (* Bind all take functions to a string for easier use *)
> val bind : string -> (('a, 'b, 'c, 'd, 'e, 'f, 'g, 'h, 'i, 'j, 'k, 'l, 'm, 'n) t1 * ('c, 'd, 'e, 'f, 'g, 'h, 'i, 'j, 'k, 'l, 'm, 'n) t2 * ('g, 'h, 'i, 'j, 'k, 'l, 'm, 'n) t4 * t8)
> end = struct
> (* Types for aligned unaligned tracking *)
> type z
> type u
> (* The type of an offset into a stream *)
> type ('a, 'b, 'c, 'd, 'e, 'f, 'g, 'h, 'i, 'j, 'k, 'l, 'm, 'n) off = int
> (* Start of the stream *)
> let zero = 0
> (* Coercion to integer, same as (x :> int) *)
> let get x = x
> (* Advance the position by 1, 2, 4 or 8 *)
> let succ x = x + 1
> let succ2 x = succ (succ x)
> let succ4 x = succ2 (succ2 x)
> let succ8 x = succ4 (succ4 x)
> (* Aliases for shorter type names *)
> type ('a, 'b, 'c, 'd, 'e, 'f, 'g, 'h, 'i, 'j, 'k, 'l, 'm, 'n) t1 = ('a, 'b, 'c, 'd, 'e, 'f, 'g, 'h, 'i, 'j, 'k, 'l, 'm, 'n) off -> (('b, 'a, 'd, 'e, 'f, 'c, 'h, 'i, 'j, 'k, 'l, 'm, 'n, 'g) off * string)
> type ('c, 'd, 'e, 'f, 'g, 'h, 'i, 'j, 'k, 'l, 'm, 'n) t2 = (z, u, 'c, 'd, 'e, 'f, 'g, 'h, 'i, 'j, 'k, 'l, 'm, 'n) off -> ((z, u, 'e, 'f, 'c, 'd, 'i, 'j, 'k, 'l, 'm, 'n, 'g, 'h) off * string)
> type ('g, 'h, 'i, 'j, 'k, 'l, 'm, 'n) t4 = (z, u, z, u, u, u, 'g, 'h, 'i, 'j, 'k, 'l, 'm, 'n) off -> ((z, u, z, u, u, u, 'k, 'l, 'm, 'n, 'g, 'h, 'i, 'j) off * string)
> type t8 = (z, u, z, u, u, u, z, u, u, u, u, u, u, u) off -> ((z, u, z, u, u, u, z, u, u, u, u, u, u, u) off * string)
> (* Take 1, 2, 4 or 8 byte with alignment restriction *)
> let take1 = fun s x -> (succ x, String.sub s (get x) 1);;
> let take2 = fun s x -> (succ2 x, String.sub s (get x) 2);;
> let take4 = fun s x -> (succ4 x, String.sub s (get x) 4);;
> let take8 = fun s x -> (succ8 x, String.sub s (get x) 8);;
> (* Bind all take functions to a string for easier use *)
> let bind s = (take1 s, take2 s, take4 s, take8 s)
> end
> open M
>
> (* Test string *)
> let s = "aabbccccdddddddd"
> let (t1, t2, t4, t8) = bind s
>
> (* extract things from string *)
> let foo () =
> let (off, a) = t1 zero in
> let (off, b) = t1 off in
> let (off, c) = t2 off in
> let (off, d) = t4 off in
> let (off, e) = t8 off in
> Printf.printf "%s %s %s %s %s\n" a b c d e
>
> (* using take2/4/8 with an offset that isn't aligned gives a compile
> time type error:
> let bad () =
> let (off, a) = take1 s zero in
> take8 s off
> *)
>
> --
> Caml-list mailing list. Subscription management and archives:
> https://sympa-roc.inria.fr/wws/info/caml-list
> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
> Bug reports: http://caml.inria.fr/bin/caml-bugs
>
next prev parent reply other threads:[~2012-03-19 14:44 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-03-12 11:30 Goswin von Brederlow
2012-03-19 14:43 ` Gabriel Scherer [this message]
2012-03-19 16:13 ` Jesper Louis Andersen
2012-03-19 16:22 ` Raoul Duke
2012-03-21 9:48 ` Goswin von Brederlow
2012-03-23 1:07 ` [Caml-list] GADT example for universal list Goswin von Brederlow
2012-03-22 6:01 [Caml-list] Wanted: GADT examples: string length, counting module x oleg
2012-03-22 9:04 ` Gabriel Scherer
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAPFanBHcs4V2OHB1VjCvb+xizPbQ7NQfWH_GwAvfNfnQ1wtfLQ@mail.gmail.com \
--to=gabriel.scherer@gmail.com \
--cc=caml-list@yquem.inria.fr \
--cc=goswin-v-b@web.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox