Mailing list for all users of the OCaml language and system.
 help / color / mirror / Atom feed
* optimization and purity
@ 1999-07-18 18:57 Markus Mottl
  1999-07-22  6:51 ` Diagnostic bug? John Skaller
  0 siblings, 1 reply; 4+ messages in thread
From: Markus Mottl @ 1999-07-18 18:57 UTC (permalink / raw)
  To: OCAML

Hello,

I would like to know whether anyone has already thought about means of
indicating or inferring purity of functions.

My specific problem:

I am just writing a library which features some functions that take
parameters which can only be generated by other functions of this module
(= the parameters have an abstract type).

E.g.:

  let rec foo () = f (make_a a) (make_b b) ... in foo ()

The conversion functions like "make_a" or "make_b" may take a not
insignificant amount of time. I know that "f" will be heavily used within
recursive functions or loops and that the conversion functions are all
pure (=referentially transparent).

So the user can move these expressions outside of loops, because this
will not change semantics. He will even have to do this if he wants that
his code runs fast.

E.g.:

  let made_a = make_a a
  and made_b = make_b b in
  let rec foo () = f made_a made_b ... in foo ()

But it can be tedious for the user to keep writing such code if
"f" is a very common function and if it takes numerous parameters.
Inexperienced users will probably not even know about such optimizations.

Thus, I have thought that it would be an interesting idea to tag all
kinds of external functions so as to indicate whether their evaluation
yields side effects or not.

E.g.:

  external pure make_a : ...

This would permit some very interesting optimizations, because this
information allows the compiler to infer purity for all (better: nearly
all (*)) other functions. In the upper case the user would not need
to do optimization by hand - the compiler can see that e.g. "make_a"
is pure and may, if appropriate (the presence of conditionals can make
things difficult), move its evaluation outside of the loop.

(*) If a function makes use of only pure functions, it is always pure
    itself. If a function contains calls to impure functions, it may be
    that these functions will never be evaluated no matter what the input
    to their "mother function" so this one is still pure as a whole. But
    the latter case is not decidable...


Considering the paragraph on "Optimization" on the page
"http://caml.inria.fr/ocaml/numerical.html", I wonder whether there are
already intentions to implement some other "higher-level" optimizations
in the OCaml-compiler.

For example lambda-lifting would be nice - especially to me, because I
make heavy use of functions-within-functions, be it to restrict their
scope (yields less complex programs) or to bind names to values in the
environment of embracing functions (less parameters needed both for
function definition and for calls).

Small example for problem that could be eliminated with lambda-lifting:

  let slow () =
    let rec f () = () in
    f ()

  let rec f () = ()
  let fast () = f ()

  let _ = for i = 1 to 1000000 do slow () done

Best regards,
Markus Mottl

-- 
Markus Mottl, mottl@miss.wu-wien.ac.at, http://miss.wu-wien.ac.at/~mottl




^ permalink raw reply	[flat|nested] 4+ messages in thread

* Diagnostic bug?
  1999-07-18 18:57 optimization and purity Markus Mottl
@ 1999-07-22  6:51 ` John Skaller
  1999-07-27 15:34   ` Hendrik Tews
  0 siblings, 1 reply; 4+ messages in thread
From: John Skaller @ 1999-07-22  6:51 UTC (permalink / raw)
  To: OCAML

The following code indicates an incorrect error diagnostic:

type t = T;;

let rec g (x:int) = 
  (let (aT:t) = f x in ())
and f x : unit = ();;

Line 5 characters 17-19:
This expression has type unit but is here used with type t.


The function f is explicitly declared with return type unit,
and clearly returns the unit value: there is no error
in f, and certainly the () is _not_ being used 'here'
with type t. 

The error may be considered to be in the call of f, 
in the function g, or it may be considered to be 
in the explicit declaration of the return type of f,
which clashes with the infered return type,
or even in the whole of f,
but it cannot be in the body of f: f is internally
consistent.

In my actual code, I got an error
in a 5000 character expression, and it took three
days to figure out the error wasn't in that expression
at all. 

I can guess how the problem arises in the inference
algorithm, but now I can't figure out a policy for tracking
down type errors (since they can no longer be trusted).
I found previously that adding explicit types helped
isolate errors, but in this case it didn't.

-------------------------------------------------------
John Skaller    email: skaller@maxtal.com.au
		http://www.maxtal.com.au/~skaller
		phone: 61-2-96600850
		snail: 10/1 Toxteth Rd, Glebe NSW 2037, Australia





^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Diagnostic bug?
  1999-07-22  6:51 ` Diagnostic bug? John Skaller
@ 1999-07-27 15:34   ` Hendrik Tews
  1999-07-30 14:55     ` John Skaller
  0 siblings, 1 reply; 4+ messages in thread
From: Hendrik Tews @ 1999-07-27 15:34 UTC (permalink / raw)
  To: caml-list

Hi,

as an answer to John I am just telling my heuristics to track
down type errors:

Assuming, that John started with the following program:

    type t = T;;

    let rec g (x) = 
      let (aT : t) = f x in ()
    and f x = ();;

then, he gets the error 

    File "t.ml", line 5, characters 10-12:
    This expression has type unit but is here used with type t

Now this indicates, that there is something wrong with at least
one of ``f'', ``x'', or ``()''. The error occurs, because from
previous uses of the problematic identifiers the compiler
collected type information, which in incompatible with line 5.
Therefore my strategy is to add type annotations at the first use
of the problematic identifiers, which leads to 

    type t = T;;

    let rec g (x) = 
      let (aT : t) = (f : 'a -> unit) x  in ()
    and f x = ();;

or alternatively 

    type t = T;;

    let rec g (x) = 
      let (aT : t) = (f x : unit) in ()
    and f x = ();;

For both programs the compiler generates a usable error message.

John Skaller writes:
   
   In my actual code, I got an error
   in a 5000 character expression, and it took three
   days to figure out the error wasn't in that expression
   at all. 
   
You are right, error reporting is one of the weak points of the
ocaml system. You can find lots of strange examples in error
reporting. My favorite one is option -i, which prints the types
of top level identifiers, if the file can be compiled. Now, in
the situtation, where the inferred types would be most valuable
--a type error in the file-- the compiler prints just nothing :-(


Bye,

Hendrik




^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Diagnostic bug?
  1999-07-27 15:34   ` Hendrik Tews
@ 1999-07-30 14:55     ` John Skaller
  0 siblings, 0 replies; 4+ messages in thread
From: John Skaller @ 1999-07-30 14:55 UTC (permalink / raw)
  To: Hendrik Tews, caml-list

At 17:34 27/07/99 +0200, Hendrik Tews wrote:
>Hi,
>
>as an answer to John I am just telling my heuristics to track
>down type errors:

[...]

>Therefore my strategy is to add type annotations at the first use
>of the problematic identifiers, 

	Ah, thank you! This is a good rule. 

>John Skaller writes:
>   
>   In my actual code, I got an error
>   in a 5000 character expression, and it took three
>   days to figure out the error wasn't in that expression
>   at all. 
>   
>You are right, error reporting is one of the weak points of the
>ocaml system. 

	Error reporting is a 'higher order' function
of a compiler: it is the weakest part of almost all
compilers. In fact, I find ocaml error reporting
quite reasonable compared with other systems.

	For example, just try tracking down errors in
complex template instantiations in C++.

-------------------------------------------------------
John Skaller    email: skaller@maxtal.com.au
		http://www.maxtal.com.au/~skaller
		phone: 61-2-96600850
		snail: 10/1 Toxteth Rd, Glebe NSW 2037, Australia





^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~1999-08-12  9:51 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
1999-07-18 18:57 optimization and purity Markus Mottl
1999-07-22  6:51 ` Diagnostic bug? John Skaller
1999-07-27 15:34   ` Hendrik Tews
1999-07-30 14:55     ` John Skaller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox