Re: Stdlib regularity

Mailing list for all users of the OCaml language and system.
 help / color / mirror / Atom feed

* Re: Stdlib regularity
@ 1999-10-12 16:21 Damien Doligez
  1999-10-13 12:18 ` Matías Giovannini
  0 siblings, 1 reply; 22+ messages in thread
From: Damien Doligez @ 1999-10-12 16:21 UTC (permalink / raw)
  To: caml-list

>From: =?iso-8859-1?Q?Mat=EDas?= Giovannini <matias@k-bell.com>

>There is no way to pre-load a prelude file in the interpreter without
>relinking a custom runtime, is it?

If you mean the toplevel system, there is, and it's in the manual:

(from <http://caml.inria.fr/ocaml/htmlman/node9.html>)

>On start-up (before the first phrase is read), if the file .ocamlinit
>exists in the current directory, its contents are read as a sequence
>of Objective Caml phrases and executed as per the #use directive
>described in section 9.2. The evaluation outcode for each phrase are
>not displayed. 

-- Damien




^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Stdlib regularity
  1999-10-12 16:21 Stdlib regularity Damien Doligez
@ 1999-10-13 12:18 ` Matías Giovannini
  0 siblings, 0 replies; 22+ messages in thread
From: Matías Giovannini @ 1999-10-13 12:18 UTC (permalink / raw)
  To: caml-list; +Cc: Damien Doligez

Damien Doligez wrote:
> 
> >From: =?iso-8859-1?Q?Mat=EDas?= Giovannini <matias@k-bell.com>
> 
> >There is no way to pre-load a prelude file in the interpreter without
> >relinking a custom runtime, is it?
> 
> If you mean the toplevel system, there is, and it's in the manual:
> 
> (from <http://caml.inria.fr/ocaml/htmlman/node9.html>)
> 
> >On start-up (before the first phrase is read), if the file .ocamlinit
> >exists in the current directory, its contents are read as a sequence
> >of Objective Caml phrases and executed as per the #use directive
> >described in section 9.2. The evaluation outcode for each phrase are
> >not displayed.
> 
> -- Damien

Wow, I always thought it only worked on UNIX, but it actually does work
on MacOS too.

Thanks for pointing this out.

-- 
I got your message. I couldn't read it. It was a cryptogram.
-- Laurie Anderson




^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Stdlib regularity
@ 1999-10-09 16:58 William Chesters
  1999-10-10  0:11 ` Matías Giovannini
  0 siblings, 1 reply; 22+ messages in thread
From: William Chesters @ 1999-10-09 16:58 UTC (permalink / raw)
  To: caml-list

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain, Size: 162 bytes --]

Matías Giovannini wrote:
 > And then a "functional for" loop looks like
 > 
 > List.map (fun i -> ...) (iota n)

What about Array.init n (fun i -> ...) ?  :-)




^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Stdlib regularity
  1999-10-09 16:58 William Chesters
@ 1999-10-10  0:11 ` Matías Giovannini
  0 siblings, 0 replies; 22+ messages in thread
From: Matías Giovannini @ 1999-10-10  0:11 UTC (permalink / raw)
  To: caml-list

William Chesters wrote:
> 
> Matías Giovannini wrote:
>  > And then a "functional for" loop looks like
>  >
>  > List.map (fun i -> ...) (iota n)
> 
> What about Array.init n (fun i -> ...) ?  :-)

How un-ref-trans! :-)

-- 
I got your message. I couldn't read it. It was a cryptogram.
-- Laurie Anderson




^ permalink raw reply	[flat|nested] 22+ messages in thread

* Stdlib regularity
@ 1999-10-06 13:25 Ohad Rodeh
  1999-10-06 16:18 ` Markus Mottl
                   ` (3 more replies)
  0 siblings, 4 replies; 22+ messages in thread
From: Ohad Rodeh @ 1999-10-06 13:25 UTC (permalink / raw)
  To: caml-list

Caml list,
  I have used OCaml extensively in the past few years, and I've had
some misgivings about the CAML standard library argument ordering. It
is a little bit confusing and not standard. For example:

	  val Queue.add: 'a -> 'a t -> unit 
	  val Hashtbl.add: ('a,'b) t -> 'a -> 'b -> unit

My general suggestion is to always make the first argument the <'a t>
type and the second the <'a> type. The only exception to this rule
should be functionals, for example, in Queue:

	 val iter: ('a -> unit) -> 'a t -> unit

I've summed up the proposed changes in an order of importance, please
remember that this is suggestion based on my personal taste alone. 

The changes I'm most interested are:
Module Queue: 
  switch:   val add: 'a -> 'a t -> unit 
  to:       val add: 'a t -> 'a -> unit 

Module Stack: 
  switch: val push: 'a -> 'a t -> unit
  to:     val push: 'a t -> 'a -> unit

Module Stream: 
  switch: npeek : int -> 'a t -> 'a list;;
  to:     npeek : 'a t -> int -> 'a list;;

This make the data-structure modules (Hashtbl,Queue,Stack,Stream) behave 
the same. 

If this is possible, I'd like this to apply to the Map and Set
modules. For module Map, this is the current signature:

module type OrderedType =
  sig
    type t
    val compare: t -> t -> int
  end

module type S = 
  sig
    type key

    type 'a t

    val empty: 'a t

    val add: key -> 'a -> 'a t -> 'a t

    val find: key -> 'a t -> 'a

    val remove: key -> 'a t -> 'a t

    val mem:  key -> 'a t -> bool

    val iter: (key -> 'a -> unit) -> 'a t -> unit

    val map: ('a -> 'b) -> 'a t -> 'b t

    val fold: (key -> 'a -> 'b -> 'b) -> 'a t -> 'b -> 'b

   end

  end
module Make(Ord: OrderedType): (S with type key = Ord.t)


I'd rather make it: 

module type S = 
  sig
    type key

    type 'a t

    val empty: 'a t

    val add: 'a t -> key -> 'a -> 'a t

    val find: 'a t -> key -> 'a

    val remove: 'a t -> key -> 'a t

    val mem:  'a t -> key -> bool

    val iter: (key -> 'a -> unit) -> 'a t -> unit

    val map: ('a -> 'b) -> 'a t -> 'b t

    val fold: (key -> 'a -> 'b -> 'b) -> 'a t -> 'b -> 'b

   end




For module Set, this is the current signature:


module type OrderedType =
  sig
    type t
    val compare: t -> t -> int
  end

module type S =
  sig
    type elt

    type t

    val empty: t

    val is_empty: t -> bool

    val mem: elt -> t -> bool

    val add: elt -> t -> t

    val singleton: elt -> t

    val remove: elt -> t -> t

    val union: t -> t -> t
    val inter: t -> t -> t
    val diff: t -> t -> t

    val compare: t -> t -> int

    val equal: t -> t -> bool

    val subset: t -> t -> bool

    val iter: (elt -> unit) -> t -> unit

    val fold: (elt -> 'a -> 'a) -> t -> 'a -> 'a

    val cardinal: t -> int

    val elements: t -> elt list

    val min_elt: t -> elt

    val max_elt: t -> elt

    val choose: t -> elt

  end
module Make(Ord: OrderedType): (S with type elt = Ord.t)


Id rather switch S to: 
module type S =
  sig
    type elt

    type t

    val empty: t

    val is_empty: t -> bool

    val mem: t -> elt -> bool

    val add: t -> elt -> t

    val singleton: elt -> t

    val remove: t -> elt -> t

    val union: t -> t -> t
    val inter: t -> t -> t
    val diff: t -> t -> t

    val compare: t -> t -> int

    val equal: t -> t -> bool

    val subset: t -> t -> bool

    val iter: (elt -> unit) -> t -> unit

    val fold: (elt -> 'a -> 'a) -> t -> 'a -> 'a

    val cardinal: t -> int

    val elements: t -> elt list

    val min_elt: t -> elt

    val max_elt: t -> elt

    val choose: t -> elt

  end


Module List has some of the same functions (mem,remove), so my
suggestion is: 

switch: 
	val mem : 'a -> 'a list -> bool
	val memq : 'a -> 'a list -> bool
to:
	val mem : 'a list -> 'a -> bool
	val memq : 'a list -> 'a -> bool


switch: 
	val assoc : 'a -> ('a * 'b) list -> 'b
	val assq : 'a -> ('a * 'b) list -> 'b
	val mem_assoc : 'a -> ('a * 'b) list -> bool
	val mem_assq : 'a -> ('a * 'b) list -> bool
	val remove_assoc : 'a -> ('a * 'b) list -> ('a * 'b) list
	val remove_assq : 'a -> ('a * 'b) list -> ('a * 'b) list

to: 
        val assoc : ('a * 'b) list -> 'a -> 'b
	val assq : ('a * 'b) list -> 'a-> 'b
	val mem_assoc : ('a * 'b) list -> 'a -> bool
	val mem_assq : ('a * 'b) list -> 'a-> bool
	val remove_assoc : ('a * 'b) list -> 'a -> ('a * 'b) list
	val remove_assq : ('a * 'b) list -> 'a -> ('a * 'b) list

The more important changes are the first minor 3, the rest are
optional. What do you think? 

	Ohad.




^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Stdlib regularity
  1999-10-06 13:25 Ohad Rodeh
@ 1999-10-06 16:18 ` Markus Mottl
  1999-10-08 14:06   ` Matías Giovannini
  1999-10-08 14:10   ` skaller
  1999-10-06 18:50 ` John Prevost
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 22+ messages in thread
From: Markus Mottl @ 1999-10-06 16:18 UTC (permalink / raw)
  To: Ohad Rodeh; +Cc: OCAML

>   I have used OCaml extensively in the past few years, and I've had
> some misgivings about the CAML standard library argument ordering. It
> is a little bit confusing and not standard. For example:

Although the standard library is quite ok, there are some (minor)
inconsistencies. What concerns my wishes for it, I'd love to see more
features (= functions or even modules).

At the moment I am not always linking against the standard library,
but I use own modules, which I have extended a bit, because I
need some important features all of the time (e.g. why is there no
"partition"-function in the set-module?).

What do you think about this proposal: why not put a version of the
standard library on the CVS-server of INRIA, where volunteers can
contribute extensions, replacements, new modules, etc.?

>From time to time, the maintainers of OCaml might want to take a look at
the diffs to the original library and merge some (all?) of the goodies
into the main branch. I can imagine that you have a lot of patches which
are just waiting to be uploaded...

Best regards,
Markus Mottl

-- 
Markus Mottl, mottl@miss.wu-wien.ac.at, http://miss.wu-wien.ac.at/~mottl

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Stdlib regularity
  1999-10-06 16:18 ` Markus Mottl
@ 1999-10-08 14:06   ` Matías Giovannini
  1999-10-10 20:09     ` Pierre Weis
  1999-10-08 14:10   ` skaller
  1 sibling, 1 reply; 22+ messages in thread
From: Matías Giovannini @ 1999-10-08 14:06 UTC (permalink / raw)
  To: OCAML

Markus Mottl wrote:
> 
> >   I have used OCaml extensively in the past few years, and I've had
> > some misgivings about the CAML standard library argument ordering. It
> > is a little bit confusing and not standard. For example:
> 
> Although the standard library is quite ok, there are some (minor)
> inconsistencies. What concerns my wishes for it, I'd love to see more
> features (= functions or even modules).

Yes! Yes! I always begin my Caml code by writing iota, and I wish it
were included in the standard library. It's silly simple, and imprescindible.

let iota n =
    let rec aux l n =
        if n > 0 then aux (n::l) (n-1) else l
    in aux [] n

And then a "functional for" loop looks like

List.map (fun i -> ...) (iota n)

(Incidentaly, the name "iota" comes from APL, and stands for the
"Initial natural Interval".)

-- 
I got your message. I couldn't read it. It was a cryptogram.
-- Laurie Anderson






^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Stdlib regularity
  1999-10-08 14:06   ` Matías Giovannini
@ 1999-10-10 20:09     ` Pierre Weis
  1999-10-10 20:12       ` Matías Giovannini
  0 siblings, 1 reply; 22+ messages in thread
From: Pierre Weis @ 1999-10-10 20:09 UTC (permalink / raw)
  To: matias; +Cc: caml-list

> Yes! Yes! I always begin my Caml code by writing iota, and I wish it
> were included in the standard library. It's silly simple, and imprescindible.
> 
> let iota n =
>     let rec aux l n =
>         if n > 0 then aux (n::l) (n-1) else l
>     in aux [] n

We may reuse this ``prelude'' code that slightly generalizes iota (named
range in this version of the standard library):

(*\
\subsection{Lists of consecutive integers}
\index{Lists of consecutive integers}
\begin{caml_primitive}
interval
range
\end{caml_primitive}
\begin{itemize}
\item \verb"interval n m" returns the list of all integers in
increasing order, from \verb"n" to \verb"m".
\item \verb"range n" gives the first n integers.
\end{itemize}
\*)

let interval n m =
 let rec loop l m =
  if n > m then l else loop (m :: l) (pred m) in
 loop [] m;;

let range = interval 1;;

Pierre Weis

INRIA, Projet Cristal, Pierre.Weis@inria.fr, http://cristal.inria.fr/~weis/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Stdlib regularity
  1999-10-10 20:09     ` Pierre Weis
@ 1999-10-10 20:12       ` Matías Giovannini
  0 siblings, 0 replies; 22+ messages in thread
From: Matías Giovannini @ 1999-10-10 20:12 UTC (permalink / raw)
  Cc: caml-list

Pierre Weis wrote:
> 
> > Yes! Yes! I always begin my Caml code by writing iota, and I wish it
> > were included in the standard library. It's silly simple, and imprescindible.
> We may reuse this ``prelude'' code that slightly generalizes iota (named
> range in this version of the standard library):

There is no way to pre-load a prelude file in the interpreter without
relinking a custom runtime, is it?

-- 
I got your message. I couldn't read it. It was a cryptogram.
-- Laurie Anderson




^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Stdlib regularity
  1999-10-06 16:18 ` Markus Mottl
  1999-10-08 14:06   ` Matías Giovannini
@ 1999-10-08 14:10   ` skaller
  1999-10-08 19:21     ` Markus Mottl
  1999-10-09 21:14     ` Dave Mason
  1 sibling, 2 replies; 22+ messages in thread
From: skaller @ 1999-10-08 14:10 UTC (permalink / raw)
  To: Markus Mottl; +Cc: Ohad Rodeh, OCAML

Markus Mottl wrote:
> What do you think about this proposal: why not put a version of the
> standard library on the CVS-server of INRIA, where volunteers can
> contribute extensions, replacements, new modules, etc.?

I think this is a good idea, but I think that implementing 
extensions and new modules is relatively easy (in most cases)
but deciding on the best interfaces is not.

Even the best library designer cannot make decision without
user feedback. As a member of two ISO Working Groups, my experience
is that there is something to be said for proposing changes in a
slightly formal
manner, followed by discussion of the proposal. 

There is another advantage of this approach, which is that the 'offical'
ocaml developers can indicate tentative support for some changes,
allowing users to try them out with modules with the same interface
but potentially less efficient implementations, to gain experience
with these interfaces, and to write code that will work well with the
next official release.

-- 
John Skaller, mailto:skaller@maxtal.com.au
1/10 Toxteth Rd Glebe NSW 2037 Australia
homepage: http://www.maxtal.com.au/~skaller
downloads: http://www.triode.net.au/~skaller

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Stdlib regularity
  1999-10-08 14:10   ` skaller
@ 1999-10-08 19:21     ` Markus Mottl
  1999-10-09 21:14     ` Dave Mason
  1 sibling, 0 replies; 22+ messages in thread
From: Markus Mottl @ 1999-10-08 19:21 UTC (permalink / raw)
  To: skaller; +Cc: OCAML

> I think this is a good idea, but I think that implementing 
> extensions and new modules is relatively easy (in most cases)
> but deciding on the best interfaces is not.

Deciding on interfaces is probably the most important (and difficult)
problem in library design. Finding suitable (=intuitive) names for
functions alone can be a difficult task.

No sensible contributor would check in changes that effect the interface
without discussing this with other developers/contributors before. So if
there is a good means of communication (e.g. a specialized mailing list)
and if people adhere to strict rules concerning check-in, I am quite
convinced that this would work well.

Regards,
Markus Mottl

-- 
Markus Mottl, mottl@miss.wu-wien.ac.at, http://miss.wu-wien.ac.at/~mottl

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Stdlib regularity
  1999-10-08 14:10   ` skaller
  1999-10-08 19:21     ` Markus Mottl
@ 1999-10-09 21:14     ` Dave Mason
  1 sibling, 0 replies; 22+ messages in thread
From: Dave Mason @ 1999-10-09 21:14 UTC (permalink / raw)
  To: OCAML

>>>>> On Sat, 09 Oct 1999 00:10:48 +1000, skaller <skaller@maxtal.com.au> said:

> Markus Mottl wrote:
>> What do you think about this proposal: why not put a version of the
>> standard library on the CVS-server of INRIA, where volunteers can
>> contribute extensions, replacements, new modules, etc.?

> I think this is a good idea, but I think that implementing
> extensions and new modules is relatively easy (in most cases) but
> deciding on the best interfaces is not.

> Even the best library designer cannot make decision without user
> feedback. As a member of two ISO Working Groups, my experience is
> that there is something to be said for proposing changes in a
> slightly formal manner, followed by discussion of the proposal.

For another take on this, see the SRFI process that has been in use
for the last year in the Scheme community.  http://srfi.schemers.org/

Things are a bit different in caml since there is really only one
implementation organization (although I guess there is ocaml and
caml).

../Dave




^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Stdlib regularity
  1999-10-06 13:25 Ohad Rodeh
  1999-10-06 16:18 ` Markus Mottl
@ 1999-10-06 18:50 ` John Prevost
  1999-10-07  7:33 ` skaller
  1999-10-07  9:18 ` Francisco Valverde Albacete
  3 siblings, 0 replies; 22+ messages in thread
From: John Prevost @ 1999-10-06 18:50 UTC (permalink / raw)
  To: Ohad Rodeh; +Cc: caml-list

Ohad Rodeh <orodeh@cs.cornell.edu> writes:

>   I have used OCaml extensively in the past few years, and I've had
> some misgivings about the CAML standard library argument ordering. It
> is a little bit confusing and not standard. For example:
> 
> 	  val Queue.add: 'a -> 'a t -> unit 
> 	  val Hashtbl.add: ('a,'b) t -> 'a -> 'b -> unit
> 
> My general suggestion is to always make the first argument the <'a t>
> type and the second the <'a> type. The only exception to this rule
> should be functionals, for example, in Queue:
> 
> 	 val iter: ('a -> unit) -> 'a t -> unit

I have a slightly different proposal, but one which is along the same
lines:

Standard ordering is:

val ho_func : ('a -> unit) -> 'a t -> unit
val ho_func : ('a -> 'b) -> 'a t -> 'b t
val imp_func : 'a t -> 'a -> unit
val func : 'a -> 'a t -> 'a t

Rationale:

The basic rationale is that the most often-repeated item should be at
the front to make it easier to curry.  Along with this, there's the
desire to have a consistent style for ordering arguments.  The basic
style I propose is that for functions acting on some sort of agregate
data type the "importance" of the arguments is as follows:

Any higher-order function gets its function arguments first.  (Idea:
the function argument determines the meaning of the function.  Hence
it should be closer to the function than the other arguments.
Corollary: non-function arguments that determine the meaning of a
function should also bind closely.)

In an imperative case, the agregate argument should come next, after
any behavior-determining arguments, but before any single values.
(Idea: in an imperative case, the value "acted upon" is more important
than the value used in the action--sort of like direct object
vs. indirect object.)

i.e.

give john pizza ==> john is the "aggregate", pizza is the value used
                    in the action.

In the non-imperative case, the "value" place should come before the
aggregate case--this is because we're no longer "acting on" something.
Now that we're not, the value determines the meaning of the function
which is applied to the aggregate, returning a new aggregate.  In
essence, the function should be thought of in this case as taking an
argument and returning a new function, like map does, rather than
acting on something after receiving multiple arguments.

So, the basic ordering:

val func         : determiners   -> arguments              -> result

val Queue.add    :                  'a t      -> 'a        -> unit        +
val Hashtbl.add  :                  ('a,'b) t -> 'a -> 'b  -> unit        +
val Queue.iter   : ('a -> unit)  -> 'a t                   -> unit
val Stream.npeek : int           -> 'a t                   -> 'a list     *
val S.add        : key -> 'a     -> 'a t                   -> 'a t        *
val S.find       :                  'a t -> key            -> 'a          *
val S.remove     : key           -> 'a t                   -> 'a t        *
val S.mem        : key           -> 'a t                   -> bool        *

...

The * is where I disagree with Ohad's strategy.  The + is where I
disagree with O'Caml's.  As a note, O'Caml's strategy makes imperative
functions order arguments more like pure functions.  This may make the
default currying order less useful, but is probably better than my
strategy.

Hence, O'Caml's order is pretty good.  :)

John.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Stdlib regularity
  1999-10-06 13:25 Ohad Rodeh
  1999-10-06 16:18 ` Markus Mottl
  1999-10-06 18:50 ` John Prevost
@ 1999-10-07  7:33 ` skaller
  1999-10-07  9:18 ` Francisco Valverde Albacete
  3 siblings, 0 replies; 22+ messages in thread
From: skaller @ 1999-10-07  7:33 UTC (permalink / raw)
  To: Ohad Rodeh; +Cc: caml-list

Ohad Rodeh wrote:
> 
> Caml list,
>   I have used OCaml extensively in the past few years, and I've had
>  some misgivings about the CAML standard library argument ordering.
>  What do you think?

While I tend to agree with the sentiment (and, apart from just
remembering
the ordering, the most likely Currying isn't possible when the data
structure
type isn't the first argument), it would create a compatibility
problem to just change the existing library, and a mess to add a new set
of modules just to 'fix' the argument order to be slightly more
intuitive.
I guess the original reasoning was more to do with reading order:

	List.mem element theList

reads well as

	element <is member of> theList

This kind of issue (argument order) was much more important in C++,
where generics represented by templates _mandate_ consistency
(in the naming conventions as well). Even before the STL was finalised,
it was being used enough that people argued against changing it
to avoid breaking code.

It is probably more important to consider how to introduce
FISh 2 style polymorphism, in which functions like 'map'
and 'iter' can be applied to _any_ data structure.
In that case, you gain consistency automatically, since there
is _really_ only one such algorithm for all data structures :-)

A more limited way to achieve this is to use an STL style
library; that is, iterator based algorithms, with the client
supplying the iterators.

-- 
John Skaller, mailto:skaller@maxtal.com.au
1/10 Toxteth Rd Glebe NSW 2037 Australia
homepage: http://www.maxtal.com.au/~skaller
downloads: http://www.triode.net.au/~skaller

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Stdlib regularity
  1999-10-06 13:25 Ohad Rodeh
                   ` (2 preceding siblings ...)
  1999-10-07  7:33 ` skaller
@ 1999-10-07  9:18 ` Francisco Valverde Albacete
  1999-10-08 14:56   ` skaller
  3 siblings, 1 reply; 22+ messages in thread
From: Francisco Valverde Albacete @ 1999-10-07  9:18 UTC (permalink / raw)
  To: Ohad Rodeh; +Cc: caml-list

Ohad Rodeh wrote:

> Caml list,
>   I have used OCaml extensively in the past few years, and I've had
> some misgivings about the CAML standard library argument ordering. It
> is a little bit confusing and not standard. For example:

Greetings, everybody,

I have my opinion about argument order too! However, I have also read the
proposals by other people and I guess all of us have their bit of reason.
C'mon, even the implementors of the language and its their own child!

Now, the work of the OLabl group is much more daring and advanced than any
of the proposals *we* can make: It allows free order of parameters (with
certain restrictions). It involves also named parameters, missing arguments
and default values (which can deal more elegantly with obnoxious default
comparison functions for aggregate types like sets and priority queues). It
can also accept OCaml-like function types. Take a look at
http://pauillac.inria.fr/olabl

Alas, for one of particular "features" above mentioned (I don't remember
which) there is a time penalty in runtime, although minor if I remember well
(Any explanations from J.Garrigue or J.P. Furuse)? But we never got into
OCaml strictly for time-efficiency reasons, did we?

The most important consideration is that the OLabl people are not *the OCaml
people* and their work remains different (if faithful to releases of the
original language. I guess both teams are closely related?). And that has
always been my main deterrent for not adhering to OLabl: I use all its tools
*except* this particular feature with the types of functions, for fear I'll
be left out of the main OCaml tool development.

Any gues what comes next? *YES* can I ask the OCaml implemtors if there are
any plans to consider implementing OLabl's typing discipline/suggestions as
an alternative to OCaml's more rigid one while maintaining the present
scheme for compatibility's sake?

Thanks for your attention!

        Francisco Valverde
        Universidad Carlos III de Madrid

Resume' en francais:

Pourquoi ne profite-t-on pas des advantages de la discipline de typage de
OLabl dans OCaml? Merci.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Stdlib regularity
  1999-10-07  9:18 ` Francisco Valverde Albacete
@ 1999-10-08 14:56   ` skaller
  1999-10-09 22:26     ` Francois Rouaix
  0 siblings, 1 reply; 22+ messages in thread
From: skaller @ 1999-10-08 14:56 UTC (permalink / raw)
  To: Francisco Valverde Albacete; +Cc: Ohad Rodeh, caml-list

Francisco Valverde Albacete wrote:
> But we never got into
> OCaml strictly for time-efficiency reasons, did we?

	In my case, no, but it is still vital, otherwise
I'd use Haskell instead :-)

	In particular, because procedural algorithms can be
implemented in ocaml just like in C++, the same complexity
_should_ be obtainable, and work on optimisation plus
standard library support should allow 'close' to
unit performance factors.

	There are several cases where the core language prevents
this, because it lacks functionality available in C++: the ability
to create uninitialised values, and the ability to destroy them
are two that I've become aware of trying to build a variable length
array module. Both these features seem mandatory. Both are unsafe.
A reasonable compromise might be:

	do not implement the features in the standard bytecode interpreter
	do not allow these features in the compiler unless a special switch is
specified

Unfortunately, I do not think it is 'enough' to provide a variable
length array in the standard library, because that is only one
data structure which cannot be implemented efficiently or cleanly
without these features.

	There is another more serious problem: ocaml doesn't
handle recursive types well. I'm not sure I fully understand this.
When all values are boxed, all recursion of algebraic types is sound. 
[Proof: it is sound in C, where all non datum values are represented
by pointers; note that _initialising_ the structure may not be possible
in ocaml unless uninitialised values are permitted]

	It is not clear to me if the result extends to modules.
However, the lack of recursion across module boundaries is also 
pain. In trying to implement a variable length array, I found
that I needed to work around the inability to create an uninitialised
array by using a functor whose argument supplied the array value
type and special value of that type to initialise the array with.

	Unfortunately, the type of the instantiated functor's 
aggregate component was a variant of the type over which it
needed to be instantiated. So this solution cannot work.
To exhibit the problem more plainly, it is something like:

	type t' = X | Y of G.t
	where module G = V.Make(sig type t = t'; val default = X)

[In the actual code, the type of a python expression, 'expr', includes
a case Initial suitable for initialising an array, and a case
'V of expr varray', where varray is a variable length array of 
expressions, this works as is, but I cannot make varray from a
functor that needs t = expr and default = Initial; even if
the recursion could work, there is no syntax for it]
	
-- 
John Skaller, mailto:skaller@maxtal.com.au
1/10 Toxteth Rd Glebe NSW 2037 Australia
homepage: http://www.maxtal.com.au/~skaller
downloads: http://www.triode.net.au/~skaller




^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Stdlib regularity
  1999-10-08 14:56   ` skaller
@ 1999-10-09 22:26     ` Francois Rouaix
  1999-10-10  5:38       ` skaller
  0 siblings, 1 reply; 22+ messages in thread
From: Francois Rouaix @ 1999-10-09 22:26 UTC (permalink / raw)
  To: skaller; +Cc: caml-list

<skaller@maxtal.com.au> writes:

  skaller> 	There are several cases where the core language prevents
  skaller> this, because it lacks functionality available in C++: the
  skaller> ability to create uninitialised values, and the ability to
  skaller> destroy them are two that I've become aware of trying to build
  skaller> a variable length array module.

Uninitialized values are easily implemented with the 'a option type.
Of course, the code is then ugly, because you have to match your
values everywhere to None | Some x. In C/C++ terms, this forces you
to check for NULL pointers systematically, which is a Really Good Thing.
Adding uninitialised values is a major source of bugs, and it's kind
of natural to pay the price for it in the readability of the source,
if you want your code to be robust.

  skaller> [About a functor for an extensible array type, and the problem
  skaller> of a dummy value]
Builtin arrays require you to provide a initialization value, even for
zero-length array. Why don't you carry the same requirement to your
extensible arrays, and simply use a polymorphic type:

type 'a earray = {
     mutable current : 'a array;
     mutable used : int;
     }

let create n i = { current = Array.create n i; used = n }

And then, if you want to have the equivalent of NULL pointers, use None,
and option types everywhere.

It seems to me that you are trying to force the language to do something
it has been purposely designed against. I'm not sure you can win this fight.

--f

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Stdlib regularity
  1999-10-09 22:26     ` Francois Rouaix
@ 1999-10-10  5:38       ` skaller
  1999-10-10 20:44         ` William Chesters
  0 siblings, 1 reply; 22+ messages in thread
From: skaller @ 1999-10-10  5:38 UTC (permalink / raw)
  To: Francois Rouaix; +Cc: caml-list

Francois Rouaix wrote:
> 
> <skaller@maxtal.com.au> writes:
> 
>   skaller>      There are several cases where the core language prevents
>   skaller> this, because it lacks functionality available in C++: the
>   skaller> ability to create uninitialised values, and the ability to
>   skaller> destroy them are two that I've become aware of trying to build
>   skaller> a variable length array module.
> 
> Uninitialized values are easily implemented with the 'a option type.
> Of course, the code is then ugly, because you have to match your
> values everywhere to None | Some x. In C/C++ terms, this forces you
> to check for NULL pointers systematically, which is a Really Good Thing.

	There is a difference: Some/None must be matched EVERY use.
Null pointers only need to be checked where they can occur.
In the case of an array of variable length, with extra uninitialised
slots, the Some/None typing of each element is extra overhead that
isn't required if the accessing codes are correct, it is only
necessary to check if the index is in range.

> Adding uninitialised values is a major source of bugs, and it's kind
> of natural to pay the price for it in the readability of the source,
> if you want your code to be robust.

	No. This isn't necesarily the case. There are other
solutions. In C++, the philosophy is to provide as much protection
as possible, without costing at run time, and with protection
that does cost at run time being optional.

	For example, the designer of a class can determine
whether or not use of a variable of the class requires initialisation,
whether a default value makes sense, etc. [in this case, the client
of the class cannot override the designers decisions].
 
	To put this another way: static typing, and other safety guarrantees, 
are intrinsically limited in what can be expressed: it cannot prevent
all run time
errors, and it may _exclude_ correct cases as well. So it is always a
compromise.
It makes sense to improve these systems to be more expressive, and
provide
even better guarrantees, but there will (almost always) be a need to
escape
the system when the programmer knows better.

	In C++, such escapes are provided, for example with
new style casts, but in a way which discourages use, rather than
outright prevention. In ocaml, Obj.magic does some of the same
kind of things, I believe, and there is always the ability to
write C, or even patch the compiler.

	In general, I really think it is necessary to provide
a 'low level unsafe' interface in a language, which at least
will not be used by accident, and where deliberate uses are at least
easy to find. These low level features are not needed where reasonable
and complete safe solutions are available: for example, it has been
proved that 'goto' is not required in a language with a few sensible
looping constructions, and so one can argue this feature is not
required.

	I think you can argue that a powerful enough functional
language does _not_ require uninitialised values (that is,
there is always an efficient way to initialise variables),
but that isn't the case in a procedural language. Because
referential transparency is lost, compilers cannot reason as
easily about program structure, and hence cannot optimise away 
useless dummy initialisations, so efficiency is lost.

	
> Builtin arrays require you to provide a initialization value, even for
> zero-length array. Why don't you carry the same requirement to your
> extensible arrays, and simply use a polymorphic type:
> 
> type 'a earray = {
>      mutable current : 'a array;
>      mutable used : int;
>      }
> 
> let create n i = { current = Array.create n i; used = n }

	Sure, but that only works for the 'create' method.
In other cases, like, concatenation, the solution requires
testing if the two arrays to be concatenated are zero length,
if so creating a zero length array using [| |], otherwise
finding an object from one of the two arrays that contains one,
to initialise the result array.

	It is always possible to do this, but it is messy,
and it _requires_ that the physical length of an array used to
hold no elements be zero in cases when there is no available
'dummy' value to fill the unused slots.
 
> And then, if you want to have the equivalent of NULL pointers, use None,
> and option types everywhere.

	Ocaml arrays are already slower than the C ones Python uses,
and I am trying to match Python performance as closely as possible.

> It seems to me that you are trying to force the language to do something
> it has been purposely designed against. I'm not sure you can win this fight.

	I am not trying to do this per se, since I believe, more or less,
that the design principles are good. Rather, I would seek a 'proper'
solution,
in terms of these principles, believing one should exist, since the
principles
surely do not deliberately try to make inefficient code. This is why I
proposed
a categorical 'Initial' value, since this should fit well into the
category
theoretic type system framework. By having a special initial value,
which can be used to initialise anything, and testing for it in the
'safe'
versions of the generated code (and eliding the test in the unsafe
ones),
there may be a well typed solution, and a possibility the compiler can
reason about the code enough to optimise even the safe code.

	I happen to 'know' that this is the state of the art in case
of array bounds checking: in Modula code, apparently, 70% of the
mandatory
array bounds checks can be elided by adding suitable reasoning
algorithms
to the code.

	I do not make a numerical claim about the case of uninitialised
values, although I note that some C compilers including gcc can issue
warnings in some cases when it appears an uninitialised value is used.

	I think what I am saying, is that this kind of solution
probably _is_ within the spirit of ocaml. My particular suggestion
may be no good. Perhaps there is a better one?

	For example, a standard variable length array module would
solve the problem, but only in that special case. Is there a more
general, if not complete, solution, that is not as risky as the one I
propose?

-- 
John Skaller, mailto:skaller@maxtal.com.au
1/10 Toxteth Rd Glebe NSW 2037 Australia
homepage: http://www.maxtal.com.au/~skaller
downloads: http://www.triode.net.au/~skaller




^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Stdlib regularity
  1999-10-10  5:38       ` skaller
@ 1999-10-10 20:44         ` William Chesters
  1999-10-10 21:43           ` Hongwei Xi
  1999-10-11  0:36           ` skaller
  0 siblings, 2 replies; 22+ messages in thread
From: William Chesters @ 1999-10-10 20:44 UTC (permalink / raw)
  To: caml-list

skaller writes:
 > 	No. This isn't necesarily the case. There are other
 > solutions. In C++, the philosophy is to provide as much protection
 > as possible, without costing at run time, and with protection
 > that does cost at run time being optional.

Are you sure that initialising arrays etc. carries enough cost to be
worth avoiding?  After all, one of the two problems, namely the
requirement to keep legal values in slots at all times, is quite easy
to work around when you have to---my basic Vector is about 100 lines,
generously spaced---while the other, performance, worry seems a priori
likely to be misplaced, because if you are constructing an array then
your time complexity is presumably at least k×n for some nontrivial k,
so that the extra few instructions × n are unlikely to make a big
difference to the overall program, however annoying they look "in the
small".

ocaml already goes some way beyond what C++ considers acceptible
inefficiency.  That's fine for a vast number of applications on modern
desktop hardware.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Stdlib regularity
  1999-10-10 20:44         ` William Chesters
@ 1999-10-10 21:43           ` Hongwei Xi
  1999-10-11  0:36           ` skaller
  1 sibling, 0 replies; 22+ messages in thread
From: Hongwei Xi @ 1999-10-10 21:43 UTC (permalink / raw)
  To: William Chesters; +Cc: caml-list

Yes, I agree with the argument.

It is not clear to me that there is a significant amount of
efficiency lost on array initialization, which is always a
one-time thing. Is there any convincing data to support
otherwise?

However, there is one case which could result in some
significant loss of efficency (I learned this from
Robert Harper). The case is where you represent a sparse
array using an (ordinary) array representation (for
quick access). This means O(n^2) time is to be spent
on initialization, which may contribute to a significant
portion of the entire running time of some program.

Cheers,

--Hongwei

\~~~~/ \\   //  \\    //    @       Mail: hwxi@ececs.uc.edu
C-o^o,  ))__||   \\__//_  // \\     Url: http://www.ececs.uc.edu/~hwxi
(  ^ )  ))__||    \--/-\\     \\    Tel: +1 513 871 4947 (home)
/ \V\   ))  ||     //   \\     \\   Tel: +1 513 556 4762 (office)
------ //   || o  //     \\     \\//Fax: +1 513 556 7326 (department)
Rhodes Hall 811-D
Department of ECE & CS
University of Cincinnati
P. O. Box 210030
Cincinnati, OH 45221-0030

On Sun, 10 Oct 1999, William Chesters wrote:

> skaller writes:
>  > 	No. This isn't necesarily the case. There are other
>  > solutions. In C++, the philosophy is to provide as much protection
>  > as possible, without costing at run time, and with protection
>  > that does cost at run time being optional.
>
> Are you sure that initialising arrays etc. carries enough cost to be
> worth avoiding?  After all, one of the two problems, namely the
> requirement to keep legal values in slots at all times, is quite easy
> to work around when you have to---my basic Vector is about 100 lines,
> generously spaced---while the other, performance, worry seems a priori
> likely to be misplaced, because if you are constructing an array then
> your time complexity is presumably at least k×n for some nontrivial k,
> so that the extra few instructions × n are unlikely to make a big
> difference to the overall program, however annoying they look "in the
> small".
>
> ocaml already goes some way beyond what C++ considers acceptible
> inefficiency.  That's fine for a vast number of applications on modern
> desktop hardware.
>





^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Stdlib regularity
  1999-10-10 20:44         ` William Chesters
  1999-10-10 21:43           ` Hongwei Xi
@ 1999-10-11  0:36           ` skaller
  1999-10-12  7:20             ` David Mentr{'e}
  1 sibling, 1 reply; 22+ messages in thread
From: skaller @ 1999-10-11  0:36 UTC (permalink / raw)
  To: William Chesters; +Cc: caml-list

William Chesters wrote:
> 
> skaller writes:
>  >      No. This isn't necesarily the case. There are other
>  > solutions. In C++, the philosophy is to provide as much protection
>  > as possible, without costing at run time, and with protection
>  > that does cost at run time being optional.
> 
> Are you sure that initialising arrays etc. carries enough cost to be
> worth avoiding?  

	No, I'm not. 

> After all, one of the two problems, namely the
> requirement to keep legal values in slots at all times, is quite easy
> to work around when you have to---my basic Vector is about 100 lines,
> generously spaced---while the other, performance, worry seems a priori
> likely to be misplaced, because if you are constructing an array then
> your time complexity is presumably at least k×n for some nontrivial k,
> so that the extra few instructions × n are unlikely to make a big
> difference to the overall program, however annoying they look "in the
> small".
> 
> ocaml already goes some way beyond what C++ considers acceptible
> inefficiency.  That's fine for a vast number of applications on modern
> desktop hardware.

Unfortunately, when writing an interpreter such as one for Python,
reasonable efficieny plus some other advantages seem neccessary
to get any users at all. JPython (Java implementation), 
for example, is 3 times slower than CPython (C implementation),
and that was enough for me to discount it.

In fact, I am writing a python interpreter because CPython
is vastly (several THOUSAND times) too slow for what I need:
extremely fast string operations are required to support
Interscript, my literate programming tool. Example: Interscript
includes documentation for ISO10646 (unicode) characters,
the complete build of interscript itself, which is written
in interscript, takes several HOURS on my 120Mhz pentium:
the generated documentation is around 5Meg, mainly consisting
of the character code tables.

This is mainly because I have to do things like convert 
a string or plain text to HTML, which requires replacing
characters '<' with '&lt;'; that is, scan each individual
character .. in an interpretive loop.

It may seem strange to believe I can actually achieve this
performance in ocaml! But I do! Because of the 'higher level
algorithms are easier to write in ocaml than C' factor,
optimisations using techniques such as type inference, 
special pattern recognition, etc,
and generation of fast C or assembler, are possible.

FYI the design is to use the interpreter only to load
the modules required for a main program, then do
whole program analysis to generate a single executable.
But because Python also has dynamic features like 'exec string'
it is necessary to make the interpreter available at run time,
even in the compiled code. So it needs to be reasonably efficient.

-- 
John Skaller, mailto:skaller@maxtal.com.au
1/10 Toxteth Rd Glebe NSW 2037 Australia
homepage: http://www.maxtal.com.au/~skaller
downloads: http://www.triode.net.au/~skaller

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Stdlib regularity
  1999-10-11  0:36           ` skaller
@ 1999-10-12  7:20             ` David Mentr{'e}
  0 siblings, 0 replies; 22+ messages in thread
From: David Mentr{'e} @ 1999-10-12  7:20 UTC (permalink / raw)
  To: skaller; +Cc: William Chesters, caml-list

skaller <skaller@maxtal.com.au> writes:

> This is mainly because I have to do things like convert 
> a string or plain text to HTML, which requires replacing
> characters '<' with '&lt;'; that is, scan each individual
> character .. in an interpretive loop.

It could be choking for you, especially in a Caml mailing-list, but have
you:

 1. consider using regular expressions? They are typically made for the
    kind of thing you are trying to do. And regex engines have
    optimization. 

 2. consider using the Perl language? Is far from perfect, not very
    clean (in the Caml way at least (in other way either ;)) but has a
    very powerful bultin regex engine. Even without Perl, OCaml has a
    regex library I think.

Any way, I think you should have a look at _Mastering Regular
Expressions_ O'Reilly book.

Best regards,
d.
-- 
 David.Mentre@irisa.fr -- http://www.irisa.fr/prive/dmentre/
 Opinions expressed here are only mine.




^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~1999-10-14 12:56 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
1999-10-12 16:21 Stdlib regularity Damien Doligez
1999-10-13 12:18 ` Matías Giovannini
  -- strict thread matches above, loose matches on Subject: below --
1999-10-09 16:58 William Chesters
1999-10-10  0:11 ` Matías Giovannini
1999-10-06 13:25 Ohad Rodeh
1999-10-06 16:18 ` Markus Mottl
1999-10-08 14:06   ` Matías Giovannini
1999-10-10 20:09     ` Pierre Weis
1999-10-10 20:12       ` Matías Giovannini
1999-10-08 14:10   ` skaller
1999-10-08 19:21     ` Markus Mottl
1999-10-09 21:14     ` Dave Mason
1999-10-06 18:50 ` John Prevost
1999-10-07  7:33 ` skaller
1999-10-07  9:18 ` Francisco Valverde Albacete
1999-10-08 14:56   ` skaller
1999-10-09 22:26     ` Francois Rouaix
1999-10-10  5:38       ` skaller
1999-10-10 20:44         ` William Chesters
1999-10-10 21:43           ` Hongwei Xi
1999-10-11  0:36           ` skaller
1999-10-12  7:20             ` David Mentr{'e}

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox