[Caml-list] How is this type inferred (GADTs)

Mailing list for all users of the OCaml language and system.
 help / color / mirror / Atom feed

* [Caml-list] How is this type inferred (GADTs)
@ 2019-09-20  3:42 Malcolm Matalka
  2019-09-20  6:35 ` Gabriel Scherer
  2019-09-20 14:25 ` Oleg
  0 siblings, 2 replies; 6+ messages in thread
From: Malcolm Matalka @ 2019-09-20  3:42 UTC (permalink / raw)
  To: caml-list

I have been using GADTs to make type-safe versions of APIs that are kind
of like printf.  I've been using this pattern by rote and now I'm
getting to trying to understand how it works.

I have the following code:

module F : sig
  type ('f, 'r) t

  val z : ('r, 'r) t
  val (//) : ('f, 'a -> 'r) t -> (unit -> 'a) -> ('f, 'r) t
end = struct
  type ('f, 'r) t =
    | Z : ('r, 'r) t
    | S : (('f, 'a -> 'r) t * (unit -> 'a)) -> ('f, 'r) t

  let z = Z
  let (//) t f = S (t, f)
end

And the following usage:

utop # F.(z // (fun () -> Int32.zero) // (fun () -> ""));;
- : (int32 -> string -> '_weak1, '_weak1) F.t = <abstr>

I understand how 'r is '_weak1 (I think), but I'm still trying to figure
out how 'f gets its type by applying (//).  I think I have an
explanation below, and I'm hoping someone can say if it's correct and/or
simplify it.

Explanation:

The constructor Z says that 'f and 'r, are the same (Z : ('r, 'r) t).
The constructor S says that the (_, _) t that it takes has some type 'f,
but that the second type variable must be of the form 'a -> 'r, sort of
deconstructing pattern match on the type (if that makes sense).  And if
that (_, _) t is a Z, where we have stated the two type variables have
the same value, that means 'f must also be ('a -> 'r). So for the
first application of (//) it makes sense.  If we apply (//) again, the
first parameter has the type (int32 -> '_weak1, '_weak1) t, but that
must actually mean '_weak1 in this case is string -> '_weak, and then
this is where things become jumbled in my head.  If the passed in (_, _)
t must be (string -> '_weak), and the inner type actually must be (int32
-> '_weak), then, the inner inner type must be ('_weak), the type of 'r
at Z must be (string -> int32 -> '_weak), and since 'r, and 'f are the
same type for Z, 'f must also be (string -> int32 -> '_weak), and that
knowledge propagates back up?  But that doesn't feel quite right to me,
either.

With the help of reading other code, I've figured out how to make a
function that uses a type like this that works like kprintf, however
where I'm going in this case is that I want to take a function that
matches 'f and apply it to the result of my functions.

Something like:

  let rec bind : type f r. f -> (f, r) t -> r = fun f -> function
    | Z ->
      f
    | S (t, v) ->
      bind (f (v ())) t

However, this does not compile given:

Error: This expression has type f
       This is not a function; it cannot be applied.

My understanding was if I have Z, and an input that has the type f, then
that is just the return type since at Z, f and r are the same type.  And
than at S, I can peel away the type of f by applying the function f and
then recursively calling.  But my understanding is missing something.

Does this make sense?

What am I not understanding?

Thank you,
/Malcolm

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Caml-list] How is this type inferred (GADTs)
  2019-09-20  3:42 [Caml-list] How is this type inferred (GADTs) Malcolm Matalka
@ 2019-09-20  6:35 ` Gabriel Scherer
  2019-09-20  8:11   ` Malcolm Matalka
  2019-09-20 14:25 ` Oleg
  1 sibling, 1 reply; 6+ messages in thread
From: Gabriel Scherer @ 2019-09-20  6:35 UTC (permalink / raw)
  To: Malcolm Matalka; +Cc: caml-list

[-- Attachment #1: Type: text/plain, Size: 5898 bytes --]

Note: the reason why you get _weak type variables (which are not
polymorphic, just not-inferred-yet)  is that the type-checker cannot detect
that (z // (fun () -> ...)) is a value: it would have to unfold the call to
(//) for this, which it doesn't do in general, and here certainly could not
do given that its definition is hidden behind an abstraction boundary.
I would recommend experimenting *without* the module interface and the
auxiliary function, with the constructors directly, you would get slightly
better types.

# S(S(Z, (fun () -> Int32.zero)), (fun () -> ""));;
- : (int32 -> string -> 'a, 'a) t = S (S (Z, <fun>), <fun>)

Historically we have used module interfaces to implement "phantom types" --
because the type information there is only present in the interface, not in
the definition. With GADTs, the type constraints are built into the
definition itself, which is precisely what makes them more powerful; no
need for an abstract interface on top.

The first part of your question is about understanding how the
type-inference work (how variables are manipulated by the type-checker and
then "propagated back up"). This sounds like a difficult way to understand
GADTs: you want to learn the typing rules *and* the type-inference
algorithm at once. But only the first part is necessary to use the feature
productively (with a few tips on when to use annotations, which are easy to
explain and in fact explained in the manual:
http://caml.inria.fr/pub/docs/manual-ocaml/manual033.html ). So instead of
asking: "how did the compiler get this type?", I would ask: "why is this
the right type"? I think you could convince yourself that (1) it is a
correct type and (2) any other valid type would be a specialization of this
type, there is no simpler solution.

The second part: you wrote:

  let rec bind : type f r. f -> (f, r) t -> r = fun f -> function
    | Z ->
      f
    | S (t, v) ->
      bind (f (v ())) t

Let's focus on the second clause:

    | S (t, v) ->
      bind (f (v ())) t

we know that (f : f) holds, and that the pattern-matching is on a value of
type (f, r) t, and we must return r.
When pattern-matching on S (t, v), we learn extra type information from the
typing rule of S:

    | S : (('f, 'a -> 'r) t * (unit -> 'a)) -> ('f, 'r) t

if r has type (f, r) t, then (t, v) has type ((f, a -> r) t * (unit -> a))
for *some* unknown/existential type a. So within the branch we have

  bind : type f r. f -> (f, r) t -> r
  f : (f, a -> r) t
  v : unit -> a
  expected return type: r

f does *not* have a function type here, so your idea of applying (f (v ()))
cannot work (v does have a function type, so (v ()) is valid).

The only thing you can do on f is call (bind) recursively (with what
arguments?), and then you will get an (a -> r) as result.

Do you see how to write a correct program from there?

On Fri, Sep 20, 2019 at 5:42 AM Malcolm Matalka <mmatalka@gmail.com> wrote:

> I have been using GADTs to make type-safe versions of APIs that are kind
> of like printf.  I've been using this pattern by rote and now I'm
> getting to trying to understand how it works.
>
> I have the following code:
>
> module F : sig
>   type ('f, 'r) t
>
>   val z : ('r, 'r) t
>   val (//) : ('f, 'a -> 'r) t -> (unit -> 'a) -> ('f, 'r) t
> end = struct
>   type ('f, 'r) t =
>     | Z : ('r, 'r) t
>     | S : (('f, 'a -> 'r) t * (unit -> 'a)) -> ('f, 'r) t
>
>   let z = Z
>   let (//) t f = S (t, f)
> end
>
> And the following usage:
>
> utop # F.(z // (fun () -> Int32.zero) // (fun () -> ""));;
> - : (int32 -> string -> '_weak1, '_weak1) F.t = <abstr>
>
> I understand how 'r is '_weak1 (I think), but I'm still trying to figure
> out how 'f gets its type by applying (//).  I think I have an
> explanation below, and I'm hoping someone can say if it's correct and/or
> simplify it.
>
> Explanation:
>
> The constructor Z says that 'f and 'r, are the same (Z : ('r, 'r) t).
> The constructor S says that the (_, _) t that it takes has some type 'f,
> but that the second type variable must be of the form 'a -> 'r, sort of
> deconstructing pattern match on the type (if that makes sense).  And if
> that (_, _) t is a Z, where we have stated the two type variables have
> the same value, that means 'f must also be ('a -> 'r). So for the
> first application of (//) it makes sense.  If we apply (//) again, the
> first parameter has the type (int32 -> '_weak1, '_weak1) t, but that
> must actually mean '_weak1 in this case is string -> '_weak, and then
> this is where things become jumbled in my head.  If the passed in (_, _)
> t must be (string -> '_weak), and the inner type actually must be (int32
> -> '_weak), then, the inner inner type must be ('_weak), the type of 'r
> at Z must be (string -> int32 -> '_weak), and since 'r, and 'f are the
> same type for Z, 'f must also be (string -> int32 -> '_weak), and that
> knowledge propagates back up?  But that doesn't feel quite right to me,
> either.
>
> With the help of reading other code, I've figured out how to make a
> function that uses a type like this that works like kprintf, however
> where I'm going in this case is that I want to take a function that
> matches 'f and apply it to the result of my functions.
>
> Something like:
>
>   let rec bind : type f r. f -> (f, r) t -> r = fun f -> function
>     | Z ->
>       f
>     | S (t, v) ->
>       bind (f (v ())) t
>
> However, this does not compile given:
>
> Error: This expression has type f
>        This is not a function; it cannot be applied.
>
> My understanding was if I have Z, and an input that has the type f, then
> that is just the return type since at Z, f and r are the same type.  And
> than at S, I can peel away the type of f by applying the function f and
> then recursively calling.  But my understanding is missing something.
>
> Does this make sense?
>
> What am I not understanding?
>
> Thank you,
> /Malcolm
>

[-- Attachment #2: Type: text/html, Size: 7529 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Caml-list] How is this type inferred (GADTs)
  2019-09-20  6:35 ` Gabriel Scherer
@ 2019-09-20  8:11   ` Malcolm Matalka
  2019-09-20  8:35     ` Gabriel Scherer
  0 siblings, 1 reply; 6+ messages in thread
From: Malcolm Matalka @ 2019-09-20  8:11 UTC (permalink / raw)
  To: Gabriel Scherer; +Cc: caml-list


Gabriel Scherer <gabriel.scherer@gmail.com> writes:

> Note: the reason why you get _weak type variables (which are not
> polymorphic, just not-inferred-yet)  is that the type-checker cannot detect
> that (z // (fun () -> ...)) is a value: it would have to unfold the call to
> (//) for this, which it doesn't do in general, and here certainly could not
> do given that its definition is hidden behind an abstraction boundary.
> I would recommend experimenting *without* the module interface and the
> auxiliary function, with the constructors directly, you would get slightly
> better types.
>
> # S(S(Z, (fun () -> Int32.zero)), (fun () -> ""));;
> - : (int32 -> string -> 'a, 'a) t = S (S (Z, <fun>), <fun>)
>
> Historically we have used module interfaces to implement "phantom types" --
> because the type information there is only present in the interface, not in
> the definition. With GADTs, the type constraints are built into the
> definition itself, which is precisely what makes them more powerful; no
> need for an abstract interface on top.
>
> The first part of your question is about understanding how the
> type-inference work (how variables are manipulated by the type-checker and
> then "propagated back up"). This sounds like a difficult way to understand
> GADTs: you want to learn the typing rules *and* the type-inference
> algorithm at once. But only the first part is necessary to use the feature
> productively (with a few tips on when to use annotations, which are easy to
> explain and in fact explained in the manual:
> http://caml.inria.fr/pub/docs/manual-ocaml/manual033.html ). So instead of
> asking: "how did the compiler get this type?", I would ask: "why is this
> the right type"? I think you could convince yourself that (1) it is a
> correct type and (2) any other valid type would be a specialization of this
> type, there is no simpler solution.
>
> The second part: you wrote:
>
>   let rec bind : type f r. f -> (f, r) t -> r = fun f -> function
>     | Z ->
>       f
>     | S (t, v) ->
>       bind (f (v ())) t
>
> Let's focus on the second clause:
>
>     | S (t, v) ->
>       bind (f (v ())) t
>
> we know that (f : f) holds, and that the pattern-matching is on a value of
> type (f, r) t, and we must return r.
> When pattern-matching on S (t, v), we learn extra type information from the
> typing rule of S:
>
>     | S : (('f, 'a -> 'r) t * (unit -> 'a)) -> ('f, 'r) t
>
> if r has type (f, r) t, then (t, v) has type ((f, a -> r) t * (unit -> a))
> for *some* unknown/existential type a. So within the branch we have
>
>   bind : type f r. f -> (f, r) t -> r
>   f : (f, a -> r) t
>   v : unit -> a
>   expected return type: r
>
> f does *not* have a function type here, so your idea of applying (f (v ()))
> cannot work (v does have a function type, so (v ()) is valid).
>
> The only thing you can do on f is call (bind) recursively (with what
> arguments?), and then you will get an (a -> r) as result.
>
> Do you see how to write a correct program from there?

Ah-ha!  I was close but my thinking was at the "wrong level" (I'm not
sure how to put it).  The working implementation of bind is:

  let rec bind : type f r. f -> (f, r) t -> r = fun f -> function
    | Z ->
      f
    | S (t, v) ->
      let f' = bind f t in
      f' (v ())

And now I see why you wanted me to look at it not through the module
interface.  Looking at how the recursion would work, of course the
most-nested S will have the function corresponding to the first
parameter of the function.  I was thinking that I would consume the
parameters on the way down, but it's actually on the way up.

I am still working out in my head the answer to "why is this the right
type", I think it has to slosh around in there a bit before it truly
makes sense to me.

Thank you!

>
>
>
> On Fri, Sep 20, 2019 at 5:42 AM Malcolm Matalka <mmatalka@gmail.com> wrote:
>
>> I have been using GADTs to make type-safe versions of APIs that are kind
>> of like printf.  I've been using this pattern by rote and now I'm
>> getting to trying to understand how it works.
>>
>> I have the following code:
>>
>> module F : sig
>>   type ('f, 'r) t
>>
>>   val z : ('r, 'r) t
>>   val (//) : ('f, 'a -> 'r) t -> (unit -> 'a) -> ('f, 'r) t
>> end = struct
>>   type ('f, 'r) t =
>>     | Z : ('r, 'r) t
>>     | S : (('f, 'a -> 'r) t * (unit -> 'a)) -> ('f, 'r) t
>>
>>   let z = Z
>>   let (//) t f = S (t, f)
>> end
>>
>> And the following usage:
>>
>> utop # F.(z // (fun () -> Int32.zero) // (fun () -> ""));;
>> - : (int32 -> string -> '_weak1, '_weak1) F.t = <abstr>
>>
>> I understand how 'r is '_weak1 (I think), but I'm still trying to figure
>> out how 'f gets its type by applying (//).  I think I have an
>> explanation below, and I'm hoping someone can say if it's correct and/or
>> simplify it.
>>
>> Explanation:
>>
>> The constructor Z says that 'f and 'r, are the same (Z : ('r, 'r) t).
>> The constructor S says that the (_, _) t that it takes has some type 'f,
>> but that the second type variable must be of the form 'a -> 'r, sort of
>> deconstructing pattern match on the type (if that makes sense).  And if
>> that (_, _) t is a Z, where we have stated the two type variables have
>> the same value, that means 'f must also be ('a -> 'r). So for the
>> first application of (//) it makes sense.  If we apply (//) again, the
>> first parameter has the type (int32 -> '_weak1, '_weak1) t, but that
>> must actually mean '_weak1 in this case is string -> '_weak, and then
>> this is where things become jumbled in my head.  If the passed in (_, _)
>> t must be (string -> '_weak), and the inner type actually must be (int32
>> -> '_weak), then, the inner inner type must be ('_weak), the type of 'r
>> at Z must be (string -> int32 -> '_weak), and since 'r, and 'f are the
>> same type for Z, 'f must also be (string -> int32 -> '_weak), and that
>> knowledge propagates back up?  But that doesn't feel quite right to me,
>> either.
>>
>> With the help of reading other code, I've figured out how to make a
>> function that uses a type like this that works like kprintf, however
>> where I'm going in this case is that I want to take a function that
>> matches 'f and apply it to the result of my functions.
>>
>> Something like:
>>
>>   let rec bind : type f r. f -> (f, r) t -> r = fun f -> function
>>     | Z ->
>>       f
>>     | S (t, v) ->
>>       bind (f (v ())) t
>>
>> However, this does not compile given:
>>
>> Error: This expression has type f
>>        This is not a function; it cannot be applied.
>>
>> My understanding was if I have Z, and an input that has the type f, then
>> that is just the return type since at Z, f and r are the same type.  And
>> than at S, I can peel away the type of f by applying the function f and
>> then recursively calling.  But my understanding is missing something.
>>
>> Does this make sense?
>>
>> What am I not understanding?
>>
>> Thank you,
>> /Malcolm
>>


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Caml-list] How is this type inferred (GADTs)
  2019-09-20  8:11   ` Malcolm Matalka
@ 2019-09-20  8:35     ` Gabriel Scherer
  2019-09-20  8:56       ` Gabriel Scherer
  0 siblings, 1 reply; 6+ messages in thread
From: Gabriel Scherer @ 2019-09-20  8:35 UTC (permalink / raw)
  To: Malcolm Matalka; +Cc: caml-list

[-- Attachment #1: Type: text/plain, Size: 9450 bytes --]

There is a direct relation between how type evolve during GADT typing and
how queries evolve during logic/relational programming. It can help to
think of a GADT declaration as the set of rules of a Prolog program. (I you
have never looked at Prolog, here is an occasion to learn something new).
That does *not* mean that GADTs let you do prolog-style programming in
types (the compiler will not automatically search for a proof, the user has
to write it down explicitly as a GADT term, so this work is not done
implicitly, as with type-class search for example), but it does give a way
to think about GADT declarations.

Your declaration (notice that I renamed the Z variable, for a good reason
that will appear clear below):

  type (_, _) t =
    | Z : ('f, 'f) t
    | S : (('f, 'a -> 'r) t * (unit -> 'a)) -> ('f, 'r) t

becomes

  t F F.
  t F R :- t F (A -> R), (unit -> A).

In any case, your declaration has the following effect:
- by the Z rule, all types of the form
  ('a1 -> 'a2 -> ... -> 'an -> 'r, 'a -> 'a2 -> ... -> 'an -> 'r) t
  are inhabited
- by the S rule, you can "hide" an element on the right
  (and you have to provide a thunk for it)
, moving from
  ('a1 -> ... -> 'an -> r, 'ak -> 'a{k+1} -> .. -> 'an -> r) t
  to
  ('a1 -> ... -> 'an -> r, 'a{k+1} -> .. -> 'an -> r) t

If you do this repeatedly you end up on what you want: ('a1 -> .. -> 'an ->
r, r) t.

The way you write about it, it sounds like you had the following
*different* definition in mind

type (_, _) t =
| Z : ('r, 'r) t
| S : ('f, 'r) t * (unit -> 'a) -> ('a -> 'f, 'r) t

let test = S (S (Z, (fun () -> 3l)), (fun () -> "foo"))

corresponding to the following Prolog program that I do find easier to
think about:

  t R R.
  t (A -> F) R :- t F R, (unit -> A).

It turns out that with this other definition, the first implementation that
you had written actually type-check fines.

(You could try to write conversion functions between those two definitions
to convince yourself that they are morally equivalent, but this is actually
a difficult exercise.)

On Fri, Sep 20, 2019 at 10:11 AM Malcolm Matalka <mmatalka@gmail.com> wrote:

>
> Gabriel Scherer <gabriel.scherer@gmail.com> writes:
>
> > Note: the reason why you get _weak type variables (which are not
> > polymorphic, just not-inferred-yet)  is that the type-checker cannot
> detect
> > that (z // (fun () -> ...)) is a value: it would have to unfold the call
> to
> > (//) for this, which it doesn't do in general, and here certainly could
> not
> > do given that its definition is hidden behind an abstraction boundary.
> > I would recommend experimenting *without* the module interface and the
> > auxiliary function, with the constructors directly, you would get
> slightly
> > better types.
> >
> > # S(S(Z, (fun () -> Int32.zero)), (fun () -> ""));;
> > - : (int32 -> string -> 'a, 'a) t = S (S (Z, <fun>), <fun>)
> >
> > Historically we have used module interfaces to implement "phantom types"
> --
> > because the type information there is only present in the interface, not
> in
> > the definition. With GADTs, the type constraints are built into the
> > definition itself, which is precisely what makes them more powerful; no
> > need for an abstract interface on top.
> >
> > The first part of your question is about understanding how the
> > type-inference work (how variables are manipulated by the type-checker
> and
> > then "propagated back up"). This sounds like a difficult way to
> understand
> > GADTs: you want to learn the typing rules *and* the type-inference
> > algorithm at once. But only the first part is necessary to use the
> feature
> > productively (with a few tips on when to use annotations, which are easy
> to
> > explain and in fact explained in the manual:
> > http://caml.inria.fr/pub/docs/manual-ocaml/manual033.html ). So instead
> of
> > asking: "how did the compiler get this type?", I would ask: "why is this
> > the right type"? I think you could convince yourself that (1) it is a
> > correct type and (2) any other valid type would be a specialization of
> this
> > type, there is no simpler solution.
> >
> > The second part: you wrote:
> >
> >   let rec bind : type f r. f -> (f, r) t -> r = fun f -> function
> >     | Z ->
> >       f
> >     | S (t, v) ->
> >       bind (f (v ())) t
> >
> > Let's focus on the second clause:
> >
> >     | S (t, v) ->
> >       bind (f (v ())) t
> >
> > we know that (f : f) holds, and that the pattern-matching is on a value
> of
> > type (f, r) t, and we must return r.
> > When pattern-matching on S (t, v), we learn extra type information from
> the
> > typing rule of S:
> >
> >     | S : (('f, 'a -> 'r) t * (unit -> 'a)) -> ('f, 'r) t
> >
> > if r has type (f, r) t, then (t, v) has type ((f, a -> r) t * (unit ->
> a))
> > for *some* unknown/existential type a. So within the branch we have
> >
> >   bind : type f r. f -> (f, r) t -> r
> >   f : (f, a -> r) t
> >   v : unit -> a
> >   expected return type: r
> >
> > f does *not* have a function type here, so your idea of applying (f (v
> ()))
> > cannot work (v does have a function type, so (v ()) is valid).
> >
> > The only thing you can do on f is call (bind) recursively (with what
> > arguments?), and then you will get an (a -> r) as result.
> >
> > Do you see how to write a correct program from there?
>
> Ah-ha!  I was close but my thinking was at the "wrong level" (I'm not
> sure how to put it).  The working implementation of bind is:
>
>   let rec bind : type f r. f -> (f, r) t -> r = fun f -> function
>     | Z ->
>       f
>     | S (t, v) ->
>       let f' = bind f t in
>       f' (v ())
>
> And now I see why you wanted me to look at it not through the module
> interface.  Looking at how the recursion would work, of course the
> most-nested S will have the function corresponding to the first
> parameter of the function.  I was thinking that I would consume the
> parameters on the way down, but it's actually on the way up.
>
> I am still working out in my head the answer to "why is this the right
> type", I think it has to slosh around in there a bit before it truly
> makes sense to me.
>
> Thank you!
>
> >
> >
> >
> > On Fri, Sep 20, 2019 at 5:42 AM Malcolm Matalka <mmatalka@gmail.com>
> wrote:
> >
> >> I have been using GADTs to make type-safe versions of APIs that are kind
> >> of like printf.  I've been using this pattern by rote and now I'm
> >> getting to trying to understand how it works.
> >>
> >> I have the following code:
> >>
> >> module F : sig
> >>   type ('f, 'r) t
> >>
> >>   val z : ('r, 'r) t
> >>   val (//) : ('f, 'a -> 'r) t -> (unit -> 'a) -> ('f, 'r) t
> >> end = struct
> >>   type ('f, 'r) t =
> >>     | Z : ('r, 'r) t
> >>     | S : (('f, 'a -> 'r) t * (unit -> 'a)) -> ('f, 'r) t
> >>
> >>   let z = Z
> >>   let (//) t f = S (t, f)
> >> end
> >>
> >> And the following usage:
> >>
> >> utop # F.(z // (fun () -> Int32.zero) // (fun () -> ""));;
> >> - : (int32 -> string -> '_weak1, '_weak1) F.t = <abstr>
> >>
> >> I understand how 'r is '_weak1 (I think), but I'm still trying to figure
> >> out how 'f gets its type by applying (//).  I think I have an
> >> explanation below, and I'm hoping someone can say if it's correct and/or
> >> simplify it.
> >>
> >> Explanation:
> >>
> >> The constructor Z says that 'f and 'r, are the same (Z : ('r, 'r) t).
> >> The constructor S says that the (_, _) t that it takes has some type 'f,
> >> but that the second type variable must be of the form 'a -> 'r, sort of
> >> deconstructing pattern match on the type (if that makes sense).  And if
> >> that (_, _) t is a Z, where we have stated the two type variables have
> >> the same value, that means 'f must also be ('a -> 'r). So for the
> >> first application of (//) it makes sense.  If we apply (//) again, the
> >> first parameter has the type (int32 -> '_weak1, '_weak1) t, but that
> >> must actually mean '_weak1 in this case is string -> '_weak, and then
> >> this is where things become jumbled in my head.  If the passed in (_, _)
> >> t must be (string -> '_weak), and the inner type actually must be (int32
> >> -> '_weak), then, the inner inner type must be ('_weak), the type of 'r
> >> at Z must be (string -> int32 -> '_weak), and since 'r, and 'f are the
> >> same type for Z, 'f must also be (string -> int32 -> '_weak), and that
> >> knowledge propagates back up?  But that doesn't feel quite right to me,
> >> either.
> >>
> >> With the help of reading other code, I've figured out how to make a
> >> function that uses a type like this that works like kprintf, however
> >> where I'm going in this case is that I want to take a function that
> >> matches 'f and apply it to the result of my functions.
> >>
> >> Something like:
> >>
> >>   let rec bind : type f r. f -> (f, r) t -> r = fun f -> function
> >>     | Z ->
> >>       f
> >>     | S (t, v) ->
> >>       bind (f (v ())) t
> >>
> >> However, this does not compile given:
> >>
> >> Error: This expression has type f
> >>        This is not a function; it cannot be applied.
> >>
> >> My understanding was if I have Z, and an input that has the type f, then
> >> that is just the return type since at Z, f and r are the same type.  And
> >> than at S, I can peel away the type of f by applying the function f and
> >> then recursively calling.  But my understanding is missing something.
> >>
> >> Does this make sense?
> >>
> >> What am I not understanding?
> >>
> >> Thank you,
> >> /Malcolm
> >>
>
>

[-- Attachment #2: Type: text/html, Size: 12567 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Caml-list] How is this type inferred (GADTs)
  2019-09-20  8:35     ` Gabriel Scherer
@ 2019-09-20  8:56       ` Gabriel Scherer
  0 siblings, 0 replies; 6+ messages in thread
From: Gabriel Scherer @ 2019-09-20  8:56 UTC (permalink / raw)
  To: Malcolm Matalka; +Cc: caml-list

[-- Attachment #1: Type: text/plain, Size: 10489 bytes --]

> (You could try to write conversion functions between those two definitions
> to convince yourself that they are morally equivalent,
> but this is actually a difficult exercise.)

type (_, _) t1 =
| Z : ('f, 'f) t1
| S : (('f, 'a -> 'r) t1 * (unit -> 'a)) -> ('f, 'r) t1

type (_, _) t2 =
| Z : ('r, 'r) t2
| S : ('f, 'r) t2 * (unit -> 'a) -> ('a -> 'f, 'r) t2

t2 is t1 "upside down" (and conversely). To go from t1 to t2 is thus a
"reverse" operation.
Just like for lists, writing "reverse" directly is difficult, but one can
use an auxiliary function

let rec rev_append
  : type a b c. (b, c) t2 -> (a, b) t1 -> (a, c) t2


On Fri, Sep 20, 2019 at 10:35 AM Gabriel Scherer <gabriel.scherer@gmail.com>
wrote:

> There is a direct relation between how type evolve during GADT typing and
> how queries evolve during logic/relational programming. It can help to
> think of a GADT declaration as the set of rules of a Prolog program. (I you
> have never looked at Prolog, here is an occasion to learn something new).
> That does *not* mean that GADTs let you do prolog-style programming in
> types (the compiler will not automatically search for a proof, the user has
> to write it down explicitly as a GADT term, so this work is not done
> implicitly, as with type-class search for example), but it does give a way
> to think about GADT declarations.
>
> Your declaration (notice that I renamed the Z variable, for a good reason
> that will appear clear below):
>
>   type (_, _) t =
>     | Z : ('f, 'f) t
>     | S : (('f, 'a -> 'r) t * (unit -> 'a)) -> ('f, 'r) t
>
> becomes
>
>   t F F.
>   t F R :- t F (A -> R), (unit -> A).
>
> In any case, your declaration has the following effect:
> - by the Z rule, all types of the form
>   ('a1 -> 'a2 -> ... -> 'an -> 'r, 'a -> 'a2 -> ... -> 'an -> 'r) t
>   are inhabited
> - by the S rule, you can "hide" an element on the right
>   (and you have to provide a thunk for it)
> , moving from
>   ('a1 -> ... -> 'an -> r, 'ak -> 'a{k+1} -> .. -> 'an -> r) t
>   to
>   ('a1 -> ... -> 'an -> r, 'a{k+1} -> .. -> 'an -> r) t
>
> If you do this repeatedly you end up on what you want: ('a1 -> .. -> 'an
> -> r, r) t.
>
> The way you write about it, it sounds like you had the following
> *different* definition in mind
>
> type (_, _) t =
> | Z : ('r, 'r) t
> | S : ('f, 'r) t * (unit -> 'a) -> ('a -> 'f, 'r) t
>
> let test = S (S (Z, (fun () -> 3l)), (fun () -> "foo"))
>
> corresponding to the following Prolog program that I do find easier to
> think about:
>
>   t R R.
>   t (A -> F) R :- t F R, (unit -> A).
>
> It turns out that with this other definition, the first implementation
> that you had written actually type-check fines.
>
> (You could try to write conversion functions between those two definitions
> to convince yourself that they are morally equivalent, but this is actually
> a difficult exercise.)
>
> On Fri, Sep 20, 2019 at 10:11 AM Malcolm Matalka <mmatalka@gmail.com>
> wrote:
>
>>
>> Gabriel Scherer <gabriel.scherer@gmail.com> writes:
>>
>> > Note: the reason why you get _weak type variables (which are not
>> > polymorphic, just not-inferred-yet)  is that the type-checker cannot
>> detect
>> > that (z // (fun () -> ...)) is a value: it would have to unfold the
>> call to
>> > (//) for this, which it doesn't do in general, and here certainly could
>> not
>> > do given that its definition is hidden behind an abstraction boundary.
>> > I would recommend experimenting *without* the module interface and the
>> > auxiliary function, with the constructors directly, you would get
>> slightly
>> > better types.
>> >
>> > # S(S(Z, (fun () -> Int32.zero)), (fun () -> ""));;
>> > - : (int32 -> string -> 'a, 'a) t = S (S (Z, <fun>), <fun>)
>> >
>> > Historically we have used module interfaces to implement "phantom
>> types" --
>> > because the type information there is only present in the interface,
>> not in
>> > the definition. With GADTs, the type constraints are built into the
>> > definition itself, which is precisely what makes them more powerful; no
>> > need for an abstract interface on top.
>> >
>> > The first part of your question is about understanding how the
>> > type-inference work (how variables are manipulated by the type-checker
>> and
>> > then "propagated back up"). This sounds like a difficult way to
>> understand
>> > GADTs: you want to learn the typing rules *and* the type-inference
>> > algorithm at once. But only the first part is necessary to use the
>> feature
>> > productively (with a few tips on when to use annotations, which are
>> easy to
>> > explain and in fact explained in the manual:
>> > http://caml.inria.fr/pub/docs/manual-ocaml/manual033.html ). So
>> instead of
>> > asking: "how did the compiler get this type?", I would ask: "why is this
>> > the right type"? I think you could convince yourself that (1) it is a
>> > correct type and (2) any other valid type would be a specialization of
>> this
>> > type, there is no simpler solution.
>> >
>> > The second part: you wrote:
>> >
>> >   let rec bind : type f r. f -> (f, r) t -> r = fun f -> function
>> >     | Z ->
>> >       f
>> >     | S (t, v) ->
>> >       bind (f (v ())) t
>> >
>> > Let's focus on the second clause:
>> >
>> >     | S (t, v) ->
>> >       bind (f (v ())) t
>> >
>> > we know that (f : f) holds, and that the pattern-matching is on a value
>> of
>> > type (f, r) t, and we must return r.
>> > When pattern-matching on S (t, v), we learn extra type information from
>> the
>> > typing rule of S:
>> >
>> >     | S : (('f, 'a -> 'r) t * (unit -> 'a)) -> ('f, 'r) t
>> >
>> > if r has type (f, r) t, then (t, v) has type ((f, a -> r) t * (unit ->
>> a))
>> > for *some* unknown/existential type a. So within the branch we have
>> >
>> >   bind : type f r. f -> (f, r) t -> r
>> >   f : (f, a -> r) t
>> >   v : unit -> a
>> >   expected return type: r
>> >
>> > f does *not* have a function type here, so your idea of applying (f (v
>> ()))
>> > cannot work (v does have a function type, so (v ()) is valid).
>> >
>> > The only thing you can do on f is call (bind) recursively (with what
>> > arguments?), and then you will get an (a -> r) as result.
>> >
>> > Do you see how to write a correct program from there?
>>
>> Ah-ha!  I was close but my thinking was at the "wrong level" (I'm not
>> sure how to put it).  The working implementation of bind is:
>>
>>   let rec bind : type f r. f -> (f, r) t -> r = fun f -> function
>>     | Z ->
>>       f
>>     | S (t, v) ->
>>       let f' = bind f t in
>>       f' (v ())
>>
>> And now I see why you wanted me to look at it not through the module
>> interface.  Looking at how the recursion would work, of course the
>> most-nested S will have the function corresponding to the first
>> parameter of the function.  I was thinking that I would consume the
>> parameters on the way down, but it's actually on the way up.
>>
>> I am still working out in my head the answer to "why is this the right
>> type", I think it has to slosh around in there a bit before it truly
>> makes sense to me.
>>
>> Thank you!
>>
>> >
>> >
>> >
>> > On Fri, Sep 20, 2019 at 5:42 AM Malcolm Matalka <mmatalka@gmail.com>
>> wrote:
>> >
>> >> I have been using GADTs to make type-safe versions of APIs that are
>> kind
>> >> of like printf.  I've been using this pattern by rote and now I'm
>> >> getting to trying to understand how it works.
>> >>
>> >> I have the following code:
>> >>
>> >> module F : sig
>> >>   type ('f, 'r) t
>> >>
>> >>   val z : ('r, 'r) t
>> >>   val (//) : ('f, 'a -> 'r) t -> (unit -> 'a) -> ('f, 'r) t
>> >> end = struct
>> >>   type ('f, 'r) t =
>> >>     | Z : ('r, 'r) t
>> >>     | S : (('f, 'a -> 'r) t * (unit -> 'a)) -> ('f, 'r) t
>> >>
>> >>   let z = Z
>> >>   let (//) t f = S (t, f)
>> >> end
>> >>
>> >> And the following usage:
>> >>
>> >> utop # F.(z // (fun () -> Int32.zero) // (fun () -> ""));;
>> >> - : (int32 -> string -> '_weak1, '_weak1) F.t = <abstr>
>> >>
>> >> I understand how 'r is '_weak1 (I think), but I'm still trying to
>> figure
>> >> out how 'f gets its type by applying (//).  I think I have an
>> >> explanation below, and I'm hoping someone can say if it's correct
>> and/or
>> >> simplify it.
>> >>
>> >> Explanation:
>> >>
>> >> The constructor Z says that 'f and 'r, are the same (Z : ('r, 'r) t).
>> >> The constructor S says that the (_, _) t that it takes has some type
>> 'f,
>> >> but that the second type variable must be of the form 'a -> 'r, sort of
>> >> deconstructing pattern match on the type (if that makes sense).  And if
>> >> that (_, _) t is a Z, where we have stated the two type variables have
>> >> the same value, that means 'f must also be ('a -> 'r). So for the
>> >> first application of (//) it makes sense.  If we apply (//) again, the
>> >> first parameter has the type (int32 -> '_weak1, '_weak1) t, but that
>> >> must actually mean '_weak1 in this case is string -> '_weak, and then
>> >> this is where things become jumbled in my head.  If the passed in (_,
>> _)
>> >> t must be (string -> '_weak), and the inner type actually must be
>> (int32
>> >> -> '_weak), then, the inner inner type must be ('_weak), the type of 'r
>> >> at Z must be (string -> int32 -> '_weak), and since 'r, and 'f are the
>> >> same type for Z, 'f must also be (string -> int32 -> '_weak), and that
>> >> knowledge propagates back up?  But that doesn't feel quite right to me,
>> >> either.
>> >>
>> >> With the help of reading other code, I've figured out how to make a
>> >> function that uses a type like this that works like kprintf, however
>> >> where I'm going in this case is that I want to take a function that
>> >> matches 'f and apply it to the result of my functions.
>> >>
>> >> Something like:
>> >>
>> >>   let rec bind : type f r. f -> (f, r) t -> r = fun f -> function
>> >>     | Z ->
>> >>       f
>> >>     | S (t, v) ->
>> >>       bind (f (v ())) t
>> >>
>> >> However, this does not compile given:
>> >>
>> >> Error: This expression has type f
>> >>        This is not a function; it cannot be applied.
>> >>
>> >> My understanding was if I have Z, and an input that has the type f,
>> then
>> >> that is just the return type since at Z, f and r are the same type.
>> And
>> >> than at S, I can peel away the type of f by applying the function f and
>> >> then recursively calling.  But my understanding is missing something.
>> >>
>> >> Does this make sense?
>> >>
>> >> What am I not understanding?
>> >>
>> >> Thank you,
>> >> /Malcolm
>> >>
>>
>>

[-- Attachment #2: Type: text/html, Size: 13875 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Caml-list] How is this type inferred (GADTs)
  2019-09-20  3:42 [Caml-list] How is this type inferred (GADTs) Malcolm Matalka
  2019-09-20  6:35 ` Gabriel Scherer
@ 2019-09-20 14:25 ` Oleg
  1 sibling, 0 replies; 6+ messages in thread
From: Oleg @ 2019-09-20 14:25 UTC (permalink / raw)
  To: mmatalka; +Cc: caml-list


Gabriel has given an excellent explanation, it is hard to add
anything. Yet I will still try.

> module F : sig
>   type ('f, 'r) t
>   val z : ('r, 'r) t
>   val (//) : ('f, 'a -> 'r) t -> (unit -> 'a) -> ('f, 'r) t
> end = struct
>   type ('f, 'r) t =
>     | Z : ('r, 'r) t
>     | S : (('f, 'a -> 'r) t * (unit -> 'a)) -> ('f, 'r) t
>   let z = Z
>   let (//) t f = S (t, f)
> end

Perhaps it will be easier to work out the types if we don't use
GADTs. They are not necessary in your example (and not needed for
the typed printf either).

Here is the adjusted code

module F : sig
  type ('f, 'r) t

  val z : ('r, 'r) t
  val (//) : ('f, 'a -> 'r) t -> (unit -> 'a) -> ('f, 'r) t

  val bind : 'f -> ('f, 'r) t -> 'r
 end = struct
  type ('f, 'r) t = 'f -> 'r

  let z = fun x -> x
  (* the type annotations are all optional. They are given to show
     which subexpressions have which types, and let the compiler confirm
     that.
     Alas, attaching type annotations requires too many parentheses
  *)  
  let (//) : type a r f. (f, a -> r) t -> (unit -> a) -> (f, r) t =
    fun t th ->
      fun (f:f) -> ((((t f):(a->r)) (th () : a)) : r)

  let bind f t = t f
end


let exp1 = F.(z // (fun () -> Int32.zero) // (fun () -> ""));;
(*
  val exp1 : (int32 -> string -> '_weak1, '_weak1) F.t = <abstr>
*)

Just for reference, here is the implementation for the opposite order,
suggested by Gabriel.

module G : sig
  type ('f, 'r) t

  val z : ('r, 'r) t
  val (//) : ('f, 'r) t -> (unit -> 'a) -> ('a -> 'f, 'r) t

  val bind : 'f -> ('f, 'r) t -> 'r
 end = struct
  type ('f, 'r) t = {e: 'w. ('r -> 'w) -> ('f -> 'w)}

  let z = {e = fun x -> x}
  (* the type annotations are all optional. They are given to show
     which subexpressions have which types, and let the compiler confirm
     that
     Alas, attaching type annotations requires too many parentheses
  *)  
  let (//) : type a r f. (f, r) t -> (unit -> a) -> (a->f, r) t =
    fun t th ->
      {e = fun (type w) (k:(r->w)) -> fun (af:a->f) ->
        ((((t.e k):(f -> w)) ((af (th ())):f)):w)}

  let bind f t = t.e (fun x -> x) f
end

let test = G.(z // (fun () -> 3l) // (fun () -> "foo"));;
(* val test : (string -> int32 -> '_weak2, '_weak2) G.t = <abstr> *)

;;


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2019-09-20 14:23 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-20  3:42 [Caml-list] How is this type inferred (GADTs) Malcolm Matalka
2019-09-20  6:35 ` Gabriel Scherer
2019-09-20  8:11   ` Malcolm Matalka
2019-09-20  8:35     ` Gabriel Scherer
2019-09-20  8:56       ` Gabriel Scherer
2019-09-20 14:25 ` Oleg

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox