From: Gabriel Scherer <gabriel.scherer@gmail.com>
To: Malcolm Matalka <mmatalka@gmail.com>
Cc: caml-list <caml-list@inria.fr>
Subject: Re: [Caml-list] How is this type inferred (GADTs)
Date: Fri, 20 Sep 2019 10:35:06 +0200 [thread overview]
Message-ID: <CAPFanBFAr5LOEXTdRJtrX6FUQ2_LVjbVgZqiZHLMyE6KkUPL8A@mail.gmail.com> (raw)
In-Reply-To: <86v9tnfkum.fsf@gmail.com>
[-- Attachment #1: Type: text/plain, Size: 9450 bytes --]
There is a direct relation between how type evolve during GADT typing and
how queries evolve during logic/relational programming. It can help to
think of a GADT declaration as the set of rules of a Prolog program. (I you
have never looked at Prolog, here is an occasion to learn something new).
That does *not* mean that GADTs let you do prolog-style programming in
types (the compiler will not automatically search for a proof, the user has
to write it down explicitly as a GADT term, so this work is not done
implicitly, as with type-class search for example), but it does give a way
to think about GADT declarations.
Your declaration (notice that I renamed the Z variable, for a good reason
that will appear clear below):
type (_, _) t =
| Z : ('f, 'f) t
| S : (('f, 'a -> 'r) t * (unit -> 'a)) -> ('f, 'r) t
becomes
t F F.
t F R :- t F (A -> R), (unit -> A).
In any case, your declaration has the following effect:
- by the Z rule, all types of the form
('a1 -> 'a2 -> ... -> 'an -> 'r, 'a -> 'a2 -> ... -> 'an -> 'r) t
are inhabited
- by the S rule, you can "hide" an element on the right
(and you have to provide a thunk for it)
, moving from
('a1 -> ... -> 'an -> r, 'ak -> 'a{k+1} -> .. -> 'an -> r) t
to
('a1 -> ... -> 'an -> r, 'a{k+1} -> .. -> 'an -> r) t
If you do this repeatedly you end up on what you want: ('a1 -> .. -> 'an ->
r, r) t.
The way you write about it, it sounds like you had the following
*different* definition in mind
type (_, _) t =
| Z : ('r, 'r) t
| S : ('f, 'r) t * (unit -> 'a) -> ('a -> 'f, 'r) t
let test = S (S (Z, (fun () -> 3l)), (fun () -> "foo"))
corresponding to the following Prolog program that I do find easier to
think about:
t R R.
t (A -> F) R :- t F R, (unit -> A).
It turns out that with this other definition, the first implementation that
you had written actually type-check fines.
(You could try to write conversion functions between those two definitions
to convince yourself that they are morally equivalent, but this is actually
a difficult exercise.)
On Fri, Sep 20, 2019 at 10:11 AM Malcolm Matalka <mmatalka@gmail.com> wrote:
>
> Gabriel Scherer <gabriel.scherer@gmail.com> writes:
>
> > Note: the reason why you get _weak type variables (which are not
> > polymorphic, just not-inferred-yet) is that the type-checker cannot
> detect
> > that (z // (fun () -> ...)) is a value: it would have to unfold the call
> to
> > (//) for this, which it doesn't do in general, and here certainly could
> not
> > do given that its definition is hidden behind an abstraction boundary.
> > I would recommend experimenting *without* the module interface and the
> > auxiliary function, with the constructors directly, you would get
> slightly
> > better types.
> >
> > # S(S(Z, (fun () -> Int32.zero)), (fun () -> ""));;
> > - : (int32 -> string -> 'a, 'a) t = S (S (Z, <fun>), <fun>)
> >
> > Historically we have used module interfaces to implement "phantom types"
> --
> > because the type information there is only present in the interface, not
> in
> > the definition. With GADTs, the type constraints are built into the
> > definition itself, which is precisely what makes them more powerful; no
> > need for an abstract interface on top.
> >
> > The first part of your question is about understanding how the
> > type-inference work (how variables are manipulated by the type-checker
> and
> > then "propagated back up"). This sounds like a difficult way to
> understand
> > GADTs: you want to learn the typing rules *and* the type-inference
> > algorithm at once. But only the first part is necessary to use the
> feature
> > productively (with a few tips on when to use annotations, which are easy
> to
> > explain and in fact explained in the manual:
> > http://caml.inria.fr/pub/docs/manual-ocaml/manual033.html ). So instead
> of
> > asking: "how did the compiler get this type?", I would ask: "why is this
> > the right type"? I think you could convince yourself that (1) it is a
> > correct type and (2) any other valid type would be a specialization of
> this
> > type, there is no simpler solution.
> >
> > The second part: you wrote:
> >
> > let rec bind : type f r. f -> (f, r) t -> r = fun f -> function
> > | Z ->
> > f
> > | S (t, v) ->
> > bind (f (v ())) t
> >
> > Let's focus on the second clause:
> >
> > | S (t, v) ->
> > bind (f (v ())) t
> >
> > we know that (f : f) holds, and that the pattern-matching is on a value
> of
> > type (f, r) t, and we must return r.
> > When pattern-matching on S (t, v), we learn extra type information from
> the
> > typing rule of S:
> >
> > | S : (('f, 'a -> 'r) t * (unit -> 'a)) -> ('f, 'r) t
> >
> > if r has type (f, r) t, then (t, v) has type ((f, a -> r) t * (unit ->
> a))
> > for *some* unknown/existential type a. So within the branch we have
> >
> > bind : type f r. f -> (f, r) t -> r
> > f : (f, a -> r) t
> > v : unit -> a
> > expected return type: r
> >
> > f does *not* have a function type here, so your idea of applying (f (v
> ()))
> > cannot work (v does have a function type, so (v ()) is valid).
> >
> > The only thing you can do on f is call (bind) recursively (with what
> > arguments?), and then you will get an (a -> r) as result.
> >
> > Do you see how to write a correct program from there?
>
> Ah-ha! I was close but my thinking was at the "wrong level" (I'm not
> sure how to put it). The working implementation of bind is:
>
> let rec bind : type f r. f -> (f, r) t -> r = fun f -> function
> | Z ->
> f
> | S (t, v) ->
> let f' = bind f t in
> f' (v ())
>
> And now I see why you wanted me to look at it not through the module
> interface. Looking at how the recursion would work, of course the
> most-nested S will have the function corresponding to the first
> parameter of the function. I was thinking that I would consume the
> parameters on the way down, but it's actually on the way up.
>
> I am still working out in my head the answer to "why is this the right
> type", I think it has to slosh around in there a bit before it truly
> makes sense to me.
>
> Thank you!
>
> >
> >
> >
> > On Fri, Sep 20, 2019 at 5:42 AM Malcolm Matalka <mmatalka@gmail.com>
> wrote:
> >
> >> I have been using GADTs to make type-safe versions of APIs that are kind
> >> of like printf. I've been using this pattern by rote and now I'm
> >> getting to trying to understand how it works.
> >>
> >> I have the following code:
> >>
> >> module F : sig
> >> type ('f, 'r) t
> >>
> >> val z : ('r, 'r) t
> >> val (//) : ('f, 'a -> 'r) t -> (unit -> 'a) -> ('f, 'r) t
> >> end = struct
> >> type ('f, 'r) t =
> >> | Z : ('r, 'r) t
> >> | S : (('f, 'a -> 'r) t * (unit -> 'a)) -> ('f, 'r) t
> >>
> >> let z = Z
> >> let (//) t f = S (t, f)
> >> end
> >>
> >> And the following usage:
> >>
> >> utop # F.(z // (fun () -> Int32.zero) // (fun () -> ""));;
> >> - : (int32 -> string -> '_weak1, '_weak1) F.t = <abstr>
> >>
> >> I understand how 'r is '_weak1 (I think), but I'm still trying to figure
> >> out how 'f gets its type by applying (//). I think I have an
> >> explanation below, and I'm hoping someone can say if it's correct and/or
> >> simplify it.
> >>
> >> Explanation:
> >>
> >> The constructor Z says that 'f and 'r, are the same (Z : ('r, 'r) t).
> >> The constructor S says that the (_, _) t that it takes has some type 'f,
> >> but that the second type variable must be of the form 'a -> 'r, sort of
> >> deconstructing pattern match on the type (if that makes sense). And if
> >> that (_, _) t is a Z, where we have stated the two type variables have
> >> the same value, that means 'f must also be ('a -> 'r). So for the
> >> first application of (//) it makes sense. If we apply (//) again, the
> >> first parameter has the type (int32 -> '_weak1, '_weak1) t, but that
> >> must actually mean '_weak1 in this case is string -> '_weak, and then
> >> this is where things become jumbled in my head. If the passed in (_, _)
> >> t must be (string -> '_weak), and the inner type actually must be (int32
> >> -> '_weak), then, the inner inner type must be ('_weak), the type of 'r
> >> at Z must be (string -> int32 -> '_weak), and since 'r, and 'f are the
> >> same type for Z, 'f must also be (string -> int32 -> '_weak), and that
> >> knowledge propagates back up? But that doesn't feel quite right to me,
> >> either.
> >>
> >> With the help of reading other code, I've figured out how to make a
> >> function that uses a type like this that works like kprintf, however
> >> where I'm going in this case is that I want to take a function that
> >> matches 'f and apply it to the result of my functions.
> >>
> >> Something like:
> >>
> >> let rec bind : type f r. f -> (f, r) t -> r = fun f -> function
> >> | Z ->
> >> f
> >> | S (t, v) ->
> >> bind (f (v ())) t
> >>
> >> However, this does not compile given:
> >>
> >> Error: This expression has type f
> >> This is not a function; it cannot be applied.
> >>
> >> My understanding was if I have Z, and an input that has the type f, then
> >> that is just the return type since at Z, f and r are the same type. And
> >> than at S, I can peel away the type of f by applying the function f and
> >> then recursively calling. But my understanding is missing something.
> >>
> >> Does this make sense?
> >>
> >> What am I not understanding?
> >>
> >> Thank you,
> >> /Malcolm
> >>
>
>
[-- Attachment #2: Type: text/html, Size: 12567 bytes --]
next prev parent reply other threads:[~2019-09-20 8:33 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-09-20 3:42 Malcolm Matalka
2019-09-20 6:35 ` Gabriel Scherer
2019-09-20 8:11 ` Malcolm Matalka
2019-09-20 8:35 ` Gabriel Scherer [this message]
2019-09-20 8:56 ` Gabriel Scherer
2019-09-20 14:25 ` Oleg
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAPFanBFAr5LOEXTdRJtrX6FUQ2_LVjbVgZqiZHLMyE6KkUPL8A@mail.gmail.com \
--to=gabriel.scherer@gmail.com \
--cc=caml-list@inria.fr \
--cc=mmatalka@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox