From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mail1-relais-roc.national.inria.fr (mail1-relais-roc.national.inria.fr [192.134.164.82]) by walapai.inria.fr (8.13.6/8.13.6) with ESMTP id q2JEiPYm032681 for ; Mon, 19 Mar 2012 15:44:26 +0100 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: ArICAKFFZ09KfVK0kGdsb2JhbABCFqRykUQIIgEBAQEJCQ0HFAQjggkBAQECAhICExkBGw8DCwEDDAYFCwUIDSEiAREBBQEKEgYTEgcJh2gLoF8KjBKCcYRyP4EMAQULkHEEkjqDLosygxc9hAmBWw X-IronPort-AV: E=Sophos;i="4.73,611,1325458800"; d="scan'208";a="150157083" Received: from mail-we0-f180.google.com ([74.125.82.180]) by mail1-smtp-roc.national.inria.fr with ESMTP/TLS/RC4-SHA; 19 Mar 2012 15:44:21 +0100 Received: by werf3 with SMTP id f3so9492497wer.39 for ; Mon, 19 Mar 2012 07:44:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc:content-type:content-transfer-encoding; bh=Honk0xavI4CWf8TuA4BM/nIZTMlYUSyDYnSTOED5r6I=; b=ec1i0v9bZ4eZpzuqL7gOy81Sp83FzXIKIOGFUw43EvjXgntLDHjdybXQOCuV7RE6lT qvXg+mwi7Tgrkrj6J+QJiiFtHnSVHXabtqrmdYbE8PvO/m/E2PJRDGoxa4042USy7A73 av1CI2+XYndZGhRBF/Cj2dNm0pLlaQcQBolhB9BwPqRbzzxjUEv+DeoJW4mLxbaAoxP5 a+6Kf8pA84CILnFlJODaVJCaRv0frAtF3c+BvylQNlbE4p3Wbc1A+CCU9vvocRIlUP3g UmgDgn5fVlAaZZG9eYC8Tp46CFLDpKZ132t2TU7OcrzdedNuBVN+i4/03jNGGrLbBUjM ZRAA== Received: by 10.50.185.230 with SMTP id ff6mr6018427igc.70.1332168258404; Mon, 19 Mar 2012 07:44:18 -0700 (PDT) MIME-Version: 1.0 Received: by 10.50.3.38 with HTTP; Mon, 19 Mar 2012 07:43:38 -0700 (PDT) In-Reply-To: <87aa3mawg1.fsf@frosties.localnet> References: <87aa3mawg1.fsf@frosties.localnet> From: Gabriel Scherer Date: Mon, 19 Mar 2012 15:43:38 +0100 Message-ID: To: Goswin von Brederlow Cc: caml-list@yquem.inria.fr Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by walapai.inria.fr id q2JEiPYm032681 Subject: Re: [Caml-list] Wanted: GADT examples: string length, counting module x I suspect you're seeing too much into the GADT as they're being added in OCaml. Your examples are not basically about GADTs, but about dependent types: you want to encode values (and operations on them) at the type level to get more expressivity. This is a well-known and extremely powerful trend in programming languages design, but it also results in complex systems with sophisticated rules. GADTs are not as expressive as full dependent types and, I would argue, their "intended use" is elsewhere. GADTs are plain old algebraic datatypes that come with equality constraints between types. By manipulating GADTs you can reason, in a convenient way, about some types being equal to some other in specific case. The cornerstone example of GADT is the (a) type of well-typed expressions (a expr), where the shape of the expression gives you more information about the type (a) : type _ expr = | Int : int -> int expr | Prod : 'a expr -> 'b expr -> ('a * 'b) expr | IfThenElse : bool expr -> 'a expr -> 'a expr -> 'a expr ... With a GADT type, you can write an evaluation function of type ('a expr -> 'a), for instance, which would not be possible in a type-safe way with the usual algebraic datatype with a phantom type parameter -- it was already possible to use encodings of GADT into first-class modules [1], or finally tagless representations [2] to achieve the same effect, but with more boilerplate and less efficient runtime representations. [1] http://okmij.org/ftp/ML/first-class-modules/ [2] http://okmij.org/ftp/tagless-final/index.html (Remark: a lot of algorithms can be understood as the transformation of data into code, that is evaluating a specific language of commands. It is therefore easy to see "a type of well-typed expressions" in almost every real-life use case.) On the contrary, your intended applications are not directly about equality between types, but about value and computations at the type level. While this can be done with GADTs when extended with enough machinery (in particular higher-kinded type variables and possibly higher-kinded datatypes; see the work on Omega [3] for example), in OCaml that is expected to be heavy, painful to write and read, and hard on the tooling (type inference, error messages, etc.). At some point the benefit of finer static checking does not compensate the overall pain and complexity of the whole process, and I think we enter the realm of unnecessarily complicated programs. [3] http://web.cecs.pdx.edu/~sheard/ I'm no expert on the subject, but I doubt GADT will provide you much more expressivity at the level of which computation you can embed in types. I suspect you'll mostly use the same techniques as you used before with phantom types (the question of signature redundancy being more a technical detail than an indication of expressivity, or lack thereof, of one or another technique). Unary natural numbers will keep being easy to express at the type-level (see draft implementation at the end of this email), and modulo, difference or what not will probably stay tricky and horrible to define if you want the *inference* engine to do all computation. There is also this technique of using term witnesses to do the heavy type lifting and use a logic-programming style at the type level, but that's probably not what you're looking for. A small draft of unary natural numbers mirrored at the type level through GADT. https://gitorious.org/gasche-snippets/ocaml-typed-units/blobs/master/gadts_and_type_level_numbers.ml They quickly get painful and impractical. Implementing addition is already non-trivial -- see Mandelbaum and Stump's 2009 article "GADTs for the OCaml masses" [4] for a more featureful, but no simpler, presentation: [4] https://www.cs.uiowa.edu/~astump/papers/icfp09.pdf I consider using GADT to emulate dependent types to be an advanced, and often counter-productive, level of "type hackery". We should rather seek examples that fit the GADT features well, instead of trying to stretch it to such unreasonable places. For example, François Pottier and Nadji Gauthier's "Polymorphic typed defunctionalization and concretization" article [5] (2004, 2006) uses GADT to represent functions by (well-typed) data tags, and Alain Frisch's use of GADTs for dynamic type witnesses [6]. But with such advanced features, it is of course not easy to draw a line between "reasonable" and "less reasonable" uses. [5] http://gallium.inria.fr/~fpottier/publis/fpottier-gauthier-hosc.pdf [6] http://www.lexifi.com/blog/dynamic-types On Mon, Mar 12, 2012 at 12:30 PM, Goswin von Brederlow wrote: > Hi, > > yesterday I compiled ocaml 3.13 and played around a bit with the GDAT > syntax but wasn't overly successfull. Or at least I had higher hopes for > it. So it is time to invoke the internet to come up with a better > example. :) > > 1) How do I write a GADT that encodes the length of a string or array? >   How do I use that to create a string or array? >   How do I specify a function that takes a string or array of a fixed length? >   Bonus: How do I specify a function that takes a string or array of a >          certain length or longer? > > 2) How do I write a GADT that counts an int module x? Say for an offset >   into a byte stream to safeguard when access is aligned and when >   unaligned. >   Again with an example that creates a value and a function that uses >   it. >   Bonus: Have one function that only allows aligned access and one that >          picks the right aligned/unaligned function to use depending on >          the type. > > Below I've included an example for checking aligned access (1/2/4/8 byte > aligned). First using GADT and second using old style phantom types. The > second looks much longer because it includes the signature needed to > make the type (...) off private. The t1/t2/t4/t8 types are just aliases > to make the type of the other functions shorter. > > One thing I couldn't manage is to write a "bind" function with GADTs or > bind takeX to a string unless I specify the full type. "takeX s" always > switches to '_a types and then gets bound to a specific type on the > first use and fail on the second use. > > On the plus side of GADTs is that you do not need a private type (and > therefore the module signature) to make them work. > > MfG >        Goswin > > PS: Other simple examples that show the power of GADTs are welcome too. > ---------------------------------------------------------------------- > (* Declare GADT type *) > type z > type u > type _ t = >  | Zero : ((z * u) * (z * u * u * u) * (z * u * u * u * u * u * u * u)) t >  | Succ : (('a * 'b) * ('c * 'd * 'e * 'f) * ('g * 'h * 'i * 'j * 'k * 'l * 'm > * 'n)) t -> (('b * 'a) * ('d * 'e * 'f * 'c) * ('h * 'i * 'j * 'k * 'l * 'm * 'n >  * 'g)) t > > (* start of stream *) > let zero = (Zero, 0) > > (* advance by 1, 2, 4 or 8 *) > let succ1 x = Succ x > let succ2 x = succ1 (succ1 x) > let succ4 x = succ2 (succ2 x) > let succ8 x = x > > (* take 1, 2, 4 or 8 bytes with alignment restriction *) > let take1 : type a b c d e f g h i j k l m n. string -> (((a * b) * (c * d * e * >  f) * (g * h * i * j * k * l * m * n)) t * int) -> ((((b * a) * (d * e * f * c) > * (h * i * j * k * l * m * n * g)) t * int) * string) = fun s (t, x) -> ((succ1 > t, x+1), String.sub s x 1) > let take2 : type c d e f g h i j k l m n. string -> (((z * u) * (c * d * e * f) > * (g * h * i * j * k * l * m * n)) t * int) -> ((((z * u) * (e * f * c * d) * (i >  * j * k * l * m * n * g * h)) t * int) * string) = fun s (t, x) -> ((succ2 t, x > +2), String.sub s x 2) > let take4 : type g h i j k l m n. string -> (((z * u) * (z * u * u * u) * (g * h >  * i * j * k * l * m * n)) t * int) -> ((((z * u) * (z * u * u * u) * (k * l * m >  * n * g * h * i * j)) t * int) * string) = fun s (t, x) -> ((succ4 t, x+4), Str > ing.sub s x 4) > let take8 : string -> (((z * u) * (z * u * u * u) * (z * u * u * u * u * u * u * >  u)) t * int) -> ((((z * u) * (z * u * u * u) * (z * u * u * u * u * u * u * u)) >  t * int) * string) = fun s (t, x) -> ((succ8 t, x+8), String.sub s x 8) > > (* Test string *) > let s = "aabbccccdddddddd" > > (* extract things from string *) > let foo () = >  let (off, a) = take1 s zero in >  let (off, b) = take1 s off in >  let (off, c) = take2 s off in >  let (off, d) = take4 s off in >  let (off, e) = take8 s off in >  Printf.printf "%s %s %s %s %s\n" a b c d e > > (* using take2/4/8 with an offset that isn't aligned gives a compile >   time type error: > let bad () = >  let (off, a) = take1 s zero in >  take8 s off >  *) > > ---------------------------------------------------------------------- > module M : sig >  (* Types for aligned unaligned tracking *) >  type z >  type u >  (* The type of an offset into a stream *) >  type ('a, 'b, 'c, 'd, 'e, 'f, 'g, 'h, 'i, 'j, 'k, 'l, 'm, 'n) off = private int >  (* Start of the stream *) >  val zero : (z, u, z, u, u, u, z, u, u, u, u, u, u, u) off >  (* Coercion to integer, same as (x :> int) *) >  val get : ('a, 'b, 'c, 'd, 'e, 'f, 'g, 'h, 'i, 'j, 'k, 'l, 'm, 'n) off -> int >  (* Advance the position by 1, 2, 4 or 8 *) >  val succ : ('a, 'b, 'c, 'd, 'e, 'f, 'g, 'h, 'i, 'j, 'k, 'l, 'm, 'n) off -> ('b , 'a, 'd, 'e, 'f, 'c, 'h, 'i, 'j, 'k, 'l, 'm, 'n, 'g) off >  val succ2 : ('a, 'b, 'c, 'd, 'e, 'f, 'g, 'h, 'i, 'j, 'k, 'l, 'm, 'n) off -> ('a , 'b, 'e, 'f, 'c, 'd, 'i, 'j, 'k, 'l, 'm, 'n, 'g, 'h) off >  val succ4 : ('a, 'b, 'c, 'd, 'e, 'f, 'g, 'h, 'i, 'j, 'k, 'l, 'm, 'n) off -> ('a , 'b, 'c, 'd, 'e, 'f, 'k, 'l, 'm, 'n, 'g, 'h, 'i, 'j) off >  val succ8 : ('a, 'b, 'c, 'd, 'e, 'f, 'g, 'h, 'i, 'j, 'k, 'l, 'm, 'n) off -> ('a , 'b, 'c, 'd, 'e, 'f, 'g, 'h, 'i, 'j, 'k, 'l, 'm, 'n) off >  (* Aliases for shorter type names *) >  type ('a, 'b, 'c, 'd, 'e, 'f, 'g, 'h, 'i, 'j, 'k, 'l, 'm, 'n) t1 = ('a, 'b, 'c, 'd, 'e, 'f, 'g, 'h, 'i, 'j, 'k, 'l, 'm, 'n) off -> (('b, 'a, 'd, 'e, 'f, 'c, 'h, 'i, 'j, 'k, 'l, 'm, 'n, 'g) off * string) >  type ('c, 'd, 'e, 'f, 'g, 'h, 'i, 'j, 'k, 'l, 'm, 'n) t2 = (z, u, 'c, 'd, 'e, 'f, 'g, 'h, 'i, 'j, 'k, 'l, 'm, 'n) off -> ((z, u, 'e, 'f, 'c, 'd, 'i, 'j, 'k, 'l, 'm, 'n, 'g, 'h) off * string) >  type ('g, 'h, 'i, 'j, 'k, 'l, 'm, 'n) t4 = (z, u, z, u, u, u, 'g, 'h, 'i, 'j, 'k, 'l, 'm, 'n) off -> ((z, u, z, u, u, u, 'k, 'l, 'm, 'n, 'g, 'h, 'i, 'j) off * string) >  type t8 = (z, u, z, u, u, u, z, u, u, u, u, u, u, u) off -> ((z, u, z, u, u, u, z, u, u, u, u, u, u, u) off * string) >  (* Take 1, 2, 4 or 8 byte with alignment restriction *) >  val take1 : string -> ('a, 'b, 'c, 'd, 'e, 'f, 'g, 'h, 'i, 'j, 'k, 'l, 'm, 'n) t1 >  val take2 : string -> ('c, 'd, 'e, 'f, 'g, 'h, 'i, 'j, 'k, 'l, 'm, 'n) t2 >  val take4 : string -> ('g, 'h, 'i, 'j, 'k, 'l, 'm, 'n) t4 >  val take8 : string -> t8 >  (* Bind all take functions to a string for easier use *) >  val bind : string -> (('a, 'b, 'c, 'd, 'e, 'f, 'g, 'h, 'i, 'j, 'k, 'l, 'm, 'n) t1 * ('c, 'd, 'e, 'f, 'g, 'h, 'i, 'j, 'k, 'l, 'm, 'n) t2 * ('g, 'h, 'i, 'j, 'k, 'l, 'm, 'n) t4 * t8) > end = struct >  (* Types for aligned unaligned tracking *) >  type z >  type u >  (* The type of an offset into a stream *) >  type ('a, 'b, 'c, 'd, 'e, 'f, 'g, 'h, 'i, 'j, 'k, 'l, 'm, 'n) off = int >  (* Start of the stream *) >  let zero = 0 >  (* Coercion to integer, same as (x :> int) *) >  let get x = x >  (* Advance the position by 1, 2, 4 or 8 *) >  let succ x = x + 1 >  let succ2 x = succ (succ x) >  let succ4 x = succ2 (succ2 x) >  let succ8 x = succ4 (succ4 x) >  (* Aliases for shorter type names *) >  type ('a, 'b, 'c, 'd, 'e, 'f, 'g, 'h, 'i, 'j, 'k, 'l, 'm, 'n) t1 = ('a, 'b, 'c, 'd, 'e, 'f, 'g, 'h, 'i, 'j, 'k, 'l, 'm, 'n) off -> (('b, 'a, 'd, 'e, 'f, 'c, 'h, 'i, 'j, 'k, 'l, 'm, 'n, 'g) off * string) >  type ('c, 'd, 'e, 'f, 'g, 'h, 'i, 'j, 'k, 'l, 'm, 'n) t2 = (z, u, 'c, 'd, 'e, 'f, 'g, 'h, 'i, 'j, 'k, 'l, 'm, 'n) off -> ((z, u, 'e, 'f, 'c, 'd, 'i, 'j, 'k, 'l, 'm, 'n, 'g, 'h) off * string) >  type ('g, 'h, 'i, 'j, 'k, 'l, 'm, 'n) t4 = (z, u, z, u, u, u, 'g, 'h, 'i, 'j, 'k, 'l, 'm, 'n) off -> ((z, u, z, u, u, u, 'k, 'l, 'm, 'n, 'g, 'h, 'i, 'j) off * string) >  type t8 = (z, u, z, u, u, u, z, u, u, u, u, u, u, u) off -> ((z, u, z, u, u, u, z, u, u, u, u, u, u, u) off * string) >  (* Take 1, 2, 4 or 8 byte with alignment restriction *) >  let take1 = fun s x -> (succ x, String.sub s (get x) 1);; >  let take2 = fun s x -> (succ2 x, String.sub s (get x) 2);; >  let take4 = fun s x -> (succ4 x, String.sub s (get x) 4);; >  let take8 = fun s x -> (succ8 x, String.sub s (get x) 8);; >  (* Bind all take functions to a string for easier use *) >  let bind s = (take1 s, take2 s, take4 s, take8 s) > end > open M > > (* Test string *) > let s = "aabbccccdddddddd" > let (t1, t2, t4, t8) = bind s > > (* extract things from string *) > let foo () = >  let (off, a) = t1 zero in >  let (off, b) = t1 off in >  let (off, c) = t2 off in >  let (off, d) = t4 off in >  let (off, e) = t8 off in >  Printf.printf "%s %s %s %s %s\n" a b c d e > > (* using take2/4/8 with an offset that isn't aligned gives a compile >   time type error: > let bad () = >  let (off, a) = take1 s zero in >  take8 s off >  *) > > -- > Caml-list mailing list.  Subscription management and archives: > https://sympa-roc.inria.fr/wws/info/caml-list > Beginner's list: http://groups.yahoo.com/group/ocaml_beginners > Bug reports: http://caml.inria.fr/bin/caml-bugs >