From: Yotam Barnoy <yotambarnoy@gmail.com>
To: Goswin von Brederlow <goswin-v-b@web.de>
Cc: Ocaml Mailing List <caml-list@inria.fr>
Subject: Re: [Caml-list] Purity in ocaml
Date: Thu, 23 Jan 2014 13:18:00 -0500 [thread overview]
Message-ID: <CAN6ygOmvKvUFJw3CGqCk40k9h4MVBaMgO1tcbUzBJO0Upa-r5g@mail.gmail.com> (raw)
In-Reply-To: <20140123092009.GA20624@frosties>
[-- Attachment #1: Type: text/plain, Size: 13702 bytes --]
Just as a preface, it's very important to me that this feature remain
relatively simple. I'm looking to incorporate this into the core language,
and I think one of ocaml's strengths is its simplicity. Of course,
simplicity is a relative term in and of itself, but I think one of the
reasons for the success of e.g. OOP is the fact that the concept is simple
and maps naturally to our conception of the world. I see many languages
that drown in their complexity, haskell being a very good example of that.
The cognitive overhead that's involved in programming haskell is not that
different (and may be much more) than programming in C++ IMO. The resulting
code is much more terse and safe, but reading and writing complex systems
might as well involve memory management and class hierarchies. Ocaml hits a
sweet spot (again, IMO) of simplicity, safety and performance, and I'm not
looking to change that.
> Obviously the purity of a higher order function can be dependent on
> the purity of its arguments. You need a syntax for the type of a
> function that can reflect that.
>
>
Well, that's what hpure and hst are for -- feel free to suggest better
names. A pure higher-order function forces its function arguments to all be
pure. An hpure higher-order function does nothing impure, but can become
either st or impure based on what it's called with.
Also note that once we're dealing with 2nd order high level functions, or
functions saved into other types (records, ADTs), we can't infer the type
of the function -- annotations must be used. Fortunately, the majority of
cases involve things like mapping/folding a lambda or an external function
-- all things that can be fully inferred.
> Unfortunately in ocaml you have to annotate a lot if done right. You
> need to create a lot of .mli files, module types or class types. Not
> to mention GADT need type annoation for nearly every function.
>
> So I'm not too sure about keeping this transparent.
>
>
I haven't dealt much with GADTs myself, so I'm not sure if they present a
special challenge.
It might be better to talk about the scope of impurity there. The
> impurity dies with the death of the mutable. Like when you create a
> local hashtbl to speed up computations, get the result and then return
> the result (destroying the hashtbl). The impurity ends there. Global
> mutables are just a special case of having the lifetime of the
> programm.
>
>
Fair enough. This is why the st annotation would exist for
functions/expressions that use the same local mutable state: the
interaction of these statements must retain its ordering, but once the
scope is exited, nothing else can interfere with this local state. Global
state can always be interfered with by other code.
That won't work. You need to be able to declare purity in the source.
> Otherwise, for example, how would you pass a module to a function that
> must have a pure function? (as you say below)
>
>
I'm not sure what you mean. Are you asking for purity annotations in the
syntax? It's always possible to do
let f : pure 'a -> 'a = fun x -> ...
as one option, but I agree that having some shortcut in the syntax would be
useful. I'm open to suggestions. Regarding first class modules, those are
in the same category as higher-order functions that cannot be inferred --
they must be annotated.
> If you can annotate that in the source (.ml or .mli files) then that
> should end up in the .cmi file as well, right. Then why would the
> compiler generated purity, in case you don't have a .mli file, not end
> up in there too?
>
> Note that if a library does not annotate purity but the function
> happens to be pure then you can't have the compiler fill that in. The
> library makes no promise that the function will remain pure in the
> future and by having the compiler adding that promise you would change
> the API of the library. This would break dynamic linking.
>
> So I realy think purity belongs in the .mli and .cmi files. Not
> something seperate. And the compiler can only fill in the purity when
> there is no .mli file. With .mli file it can only check that the given
> type can be unified with the code regarding its purity.
>
>
This is a very good point that I haven't considered before. Inferred
cross-module purity supplies more information than the interface provides.
I can see a few ways to approach this:
1. Your method, which is the simplest method, of requiring all purity
annotations to be in the .mli file. This may make the most sense, but it
also means that to take advantage of the resulting potential optimizations,
a company like Jane Street would have to go back and modify all of its .mli
files, constraining the purity of functions. I really wanted this to be
more of a transparent feature -- one which can help with optimization, but
can also allow for more advanced uses (like forcing purity in monads and
other special cases). However, maybe this is the only 'proper' way to do
this. It would mean that ocaml becomes a fully purity-conscious language.
.mli files are encouraged (for good reason), and if you constrain any part
of your code to purity, any functions called by that code must also be
pure, causing a cascade of purity annotations.
2. Allowing for the method I outlined, which is that impure functions can
be constrained by the compiler even in the presence of impure signatures in
the .mli file. The advantage here is that you don't need to change much in
the code to get potential optimizations. It also doesn't transform ocaml in
any meaningful way -- it simply allows for another style of programming.
The problem is (as you mentioned), what happens if a library changes its
implementation of a function that was not annotated with purity but was
internally pure, and now makes it impure? Well, one approach is that if you
annotated your code with purity, the code should break with a type error.
You obviously had a reason to choose to annotate your code with purity.
Choosing to use an external library's function that was NOT annotated with
purity (but happened to be pure and therefore compiled) was not a very safe
thing to do -- there was no guarantee that the function would stay pure. So
maybe this isn't much of a problem after all.
3. Another approach is to follow a process similar to 2, but to limit it
when it comes to libraries. For example, perhaps libraries should never
ship with/be loaded with the file that supplies function purity metadata.
So a library would be forced to annotate its functions with their purity,
but a regular module would not. I think that would be fair.
Now that I think about it, an external compiler-generated metadata file
containing inferred purity information could be simplified to just modules,
their function names, and those functions' purity level (and an md5 of the
mli/cmi+ml file). We can't infer anything else cross-module anyway -- any
purity information within types would have to be specified by the user.
This could even be appended to the cmi file to ensure that the data is
always up to date. A simple [function;purity] list would be readable even
within a binary file.
I think that would be a mistake. Exceptions are wide spread in ocaml.
> By not annotating types with their exceptions you MAKE them side
> effects because they become unpredictable.
>
> On the other hand if types are annotated with the exceptions they
> throw then they become just another return value and become pure.
> Same input, same exception every time.
>
> True this would complicate the type of functions because the return
> type would be a pair of successfull type and exceptions. You need a
> syntax for the type that allows the return type to be a combination of
> the argument types, like:
>
> val map : ('a -> 'b ['c] ) -> 'a list -> 'b list ['c]
> val mapmap : ('a -> 'b ['c] ) ('b -> 'd ['e] ) -> 'a list -> 'd list ['c |
> 'e]
>
> This will be hairy for purity, too, or even more so. You need ways to
> say that the combined purity is the least pure of the inputs. Or with
> to function arguments the purity is the result of passing one function
> to the other. And so on.
>
> Then again consider this code:
>
> let f () = raise Not_found
> val f : unit -> 'a [Not_found]
>
> let g fn x = try fn x with Not_found -> Empty
> val g : ('a -> 'b [< Not_found as 'c | 'd]) -> 'b [ Empty if 'c | 'd]
>
> g takes a function that may raise Not_found or 'd. G then may itself
> raise Empty (only if the function can raise Not_found though) or 'd.
>
> let h fn1 fn2 x = fn2 fn1 x
> val h : ('a -> 'b [< Not_found as 'c | 'd]) ->
> (('a -> 'b [< Not_found as 'c | 'd]) -> 'b [ Empty if 'c | 'd]) ->
> 'a -> 'b [ Empty if 'c | 'd])
>
> let h2 = h f g
> val h2: 'a -> 'b [Empty]
>
>
This is what I was referring to above with simplicity. I think any attempt
to handle exceptions, if it will succeed, will be overly complex.
Exceptions weren't really made to be a sum type -- they were made to halt
computation and transfer it to an arbitrarily high level in the call stack.
I'm ok with allowing local usage of exceptions within the same function -
analogous to a goto. Anything more than that seems overkill. Think about
using your scheme to map a function over some values: if the function
throws any exception and I want to treat it as pure, I'd have to have a map
function to accept and handle those exceptions or I'm breaking purity.
>
> Much of the stdlib uses exceptions. So you eigther have to rewrite it
> to be pure or it will be too hard to write pure code.
>
>
Should be pretty easy to add some pure functions to stdlib. Same applies to
Core and Batteries. Alternatively, an 'unsafe' function could wrap the
exception-throwing code and translate it to types.
> I like phantom types for that. But that gets hairy with containers.
> E.g. a Hashtbl of strings can be itself mutable or contain mutable
> strings or both. So the type paramters of the hashtbl need phantom
> types too. And thats where my ocaml magic broke down when I tried to
> generalize that.
>
>
Phantom types seem too complex a solution for something as pervasive as
strings. Since one almost never writes to strings, we might be able to pull
off some kind of hack for reusing the string type immutably without
creating an immutable string type.
-Yotam
> On Tue, Jan 21, 2014 at 4:49 AM, Goswin von Brederlow <goswin-v-b@web.de
>wrote:
>
> > On Mon, Jan 20, 2014 at 03:45:15PM -0500, Yotam Barnoy wrote:
> > > I wanted to gauge the interest of people on the list in adding purity
> > > annotations to ocaml. Purity is one of those things that could really
> > help
> > > with reducing memory allocations through deforestation and decreasing
the
> > > running time of programs written in the functional paradigm, and it
could
> > > be very useful for parallelism as well. The basic scheme I have in
mind
> > is
> > > this:
> > >
> > > - Functions that do not access mutable structures would be marked
pure.
> > > - Functions that access only local mutable structures would be marked
as
> > st
> > > (a la state monad)
> >
> > Does local include the arguments passed to the function?
> >
> > > - Functions that access global mutable data would be unmarked (as they
> > are
> > > now).
> > > - Pure functions can call st functions/code so long as all of the
state
> > > referred to by the st code is contained within said pure functions.
> >
> > Because if arguments don't count this makes no sense.
> >
> > But then shouldn't there be another level for functions that don't
> > alter its arguments?
> >
> > > - Functions that call higher order functions, but do not modify
mutable
> > > state would be marked hpure (half-pure). These functions would be
pure so
> > > long as the functions they call remain pure. This allows List.map,
> > > List.fold etc to work for both pure and impure code.
> > > - The same thing exists for st code: hst represents functions that
take
> > > higher order functions but only performs local state mutation.
> > > - In order to take advantage of this mechanism, there's no need to
> > annotate
> > > functions. The type inference algorithm will figure out the strictest
> > type
> > > that can be applied to a function and will save the annotation to an
> > > external, saved annotation file. This means that non-annotated code
can
> > > take advantage of purity without doing any extra work, and the
programmer
> > > never has to think about purity.
> > > - Having the purity annotations as an option is useful to force
certain
> > > parts of the code, such as monads, to be pure.
> > > - An edge case: local state can be made to refer to global state by
some
> > > external function call. Therefore, local state is considered
'polluted'
> > > (and global) if it is passed to an impure function.
> > > - Exceptions: not sure how to handle them yet. The easiest solution
is to
> > > forbid them in st/pure code. Another easy alternative is to only allow
> > > catching them in impure code, as haskell does.
> > >
> > > Thoughts?
> > >
> > > -Yotam
> >
> > 1) What does pure mean? What does it gain you? How do you want to use
it?
> >
> > 2) What syntax do you suggest for annotating functions?
> >
> > 3) Why are exceptions a problem?
> >
> > 4) Will this allow to annotate exceptions too so the compiler can
> > track which exception could be thrown and when all exceptions are
> > caught? If no exception can escape an function then it can be pure
> > again, right?
> >
> > MfG
> > Goswin
MfG
Goswin
--
Caml-list mailing list. Subscription management and archives:
https://sympa.inria.fr/sympa/arc/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs
[-- Attachment #2: Type: text/html, Size: 17268 bytes --]
next prev parent reply other threads:[~2014-01-23 18:18 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-01-20 20:45 Yotam Barnoy
2014-01-20 21:08 ` Siraaj Khandkar
2014-01-20 21:16 ` Yotam Barnoy
2014-01-20 21:31 ` Siraaj Khandkar
2014-01-20 21:43 ` Yotam Barnoy
2014-01-20 22:55 ` Siraaj Khandkar
2014-01-21 1:37 ` Yotam Barnoy
2014-01-21 9:49 ` Goswin von Brederlow
2014-01-21 15:27 ` Yotam Barnoy
2014-01-23 9:20 ` Goswin von Brederlow
2014-01-23 9:35 ` Arnaud Spiwack
2014-01-27 9:32 ` Goswin von Brederlow
2014-01-28 9:21 ` Arnaud Spiwack
2014-02-01 14:52 ` Goswin von Brederlow
2014-02-03 9:20 ` Arnaud Spiwack
2014-01-23 18:18 ` Yotam Barnoy [this message]
2014-01-27 9:46 ` Goswin von Brederlow
2014-01-29 17:16 ` Yotam Barnoy
2014-02-01 15:03 ` Goswin von Brederlow
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAN6ygOmvKvUFJw3CGqCk40k9h4MVBaMgO1tcbUzBJO0Upa-r5g@mail.gmail.com \
--to=yotambarnoy@gmail.com \
--cc=caml-list@inria.fr \
--cc=goswin-v-b@web.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox