* Re: Stdlib regularity @ 1999-10-12 16:21 Damien Doligez 1999-10-13 12:18 ` Matías Giovannini 0 siblings, 1 reply; 22+ messages in thread From: Damien Doligez @ 1999-10-12 16:21 UTC (permalink / raw) To: caml-list >From: =?iso-8859-1?Q?Mat=EDas?= Giovannini <matias@k-bell.com> >There is no way to pre-load a prelude file in the interpreter without >relinking a custom runtime, is it? If you mean the toplevel system, there is, and it's in the manual: (from <http://caml.inria.fr/ocaml/htmlman/node9.html>) >On start-up (before the first phrase is read), if the file .ocamlinit >exists in the current directory, its contents are read as a sequence >of Objective Caml phrases and executed as per the #use directive >described in section 9.2. The evaluation outcode for each phrase are >not displayed. -- Damien ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Stdlib regularity 1999-10-12 16:21 Stdlib regularity Damien Doligez @ 1999-10-13 12:18 ` Matías Giovannini 0 siblings, 0 replies; 22+ messages in thread From: Matías Giovannini @ 1999-10-13 12:18 UTC (permalink / raw) To: caml-list; +Cc: Damien Doligez Damien Doligez wrote: > > >From: =?iso-8859-1?Q?Mat=EDas?= Giovannini <matias@k-bell.com> > > >There is no way to pre-load a prelude file in the interpreter without > >relinking a custom runtime, is it? > > If you mean the toplevel system, there is, and it's in the manual: > > (from <http://caml.inria.fr/ocaml/htmlman/node9.html>) > > >On start-up (before the first phrase is read), if the file .ocamlinit > >exists in the current directory, its contents are read as a sequence > >of Objective Caml phrases and executed as per the #use directive > >described in section 9.2. The evaluation outcode for each phrase are > >not displayed. > > -- Damien Wow, I always thought it only worked on UNIX, but it actually does work on MacOS too. Thanks for pointing this out. -- I got your message. I couldn't read it. It was a cryptogram. -- Laurie Anderson ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Stdlib regularity @ 1999-10-09 16:58 William Chesters 1999-10-10 0:11 ` Matías Giovannini 0 siblings, 1 reply; 22+ messages in thread From: William Chesters @ 1999-10-09 16:58 UTC (permalink / raw) To: caml-list [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #1: Type: text/plain, Size: 162 bytes --] Matías Giovannini wrote: > And then a "functional for" loop looks like > > List.map (fun i -> ...) (iota n) What about Array.init n (fun i -> ...) ? :-) ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Stdlib regularity 1999-10-09 16:58 William Chesters @ 1999-10-10 0:11 ` Matías Giovannini 0 siblings, 0 replies; 22+ messages in thread From: Matías Giovannini @ 1999-10-10 0:11 UTC (permalink / raw) To: caml-list William Chesters wrote: > > Matías Giovannini wrote: > > And then a "functional for" loop looks like > > > > List.map (fun i -> ...) (iota n) > > What about Array.init n (fun i -> ...) ? :-) How un-ref-trans! :-) -- I got your message. I couldn't read it. It was a cryptogram. -- Laurie Anderson ^ permalink raw reply [flat|nested] 22+ messages in thread
* Stdlib regularity @ 1999-10-06 13:25 Ohad Rodeh 1999-10-06 16:18 ` Markus Mottl ` (3 more replies) 0 siblings, 4 replies; 22+ messages in thread From: Ohad Rodeh @ 1999-10-06 13:25 UTC (permalink / raw) To: caml-list Caml list, I have used OCaml extensively in the past few years, and I've had some misgivings about the CAML standard library argument ordering. It is a little bit confusing and not standard. For example: val Queue.add: 'a -> 'a t -> unit val Hashtbl.add: ('a,'b) t -> 'a -> 'b -> unit My general suggestion is to always make the first argument the <'a t> type and the second the <'a> type. The only exception to this rule should be functionals, for example, in Queue: val iter: ('a -> unit) -> 'a t -> unit I've summed up the proposed changes in an order of importance, please remember that this is suggestion based on my personal taste alone. The changes I'm most interested are: Module Queue: switch: val add: 'a -> 'a t -> unit to: val add: 'a t -> 'a -> unit Module Stack: switch: val push: 'a -> 'a t -> unit to: val push: 'a t -> 'a -> unit Module Stream: switch: npeek : int -> 'a t -> 'a list;; to: npeek : 'a t -> int -> 'a list;; This make the data-structure modules (Hashtbl,Queue,Stack,Stream) behave the same. If this is possible, I'd like this to apply to the Map and Set modules. For module Map, this is the current signature: module type OrderedType = sig type t val compare: t -> t -> int end module type S = sig type key type 'a t val empty: 'a t val add: key -> 'a -> 'a t -> 'a t val find: key -> 'a t -> 'a val remove: key -> 'a t -> 'a t val mem: key -> 'a t -> bool val iter: (key -> 'a -> unit) -> 'a t -> unit val map: ('a -> 'b) -> 'a t -> 'b t val fold: (key -> 'a -> 'b -> 'b) -> 'a t -> 'b -> 'b end end module Make(Ord: OrderedType): (S with type key = Ord.t) I'd rather make it: module type S = sig type key type 'a t val empty: 'a t val add: 'a t -> key -> 'a -> 'a t val find: 'a t -> key -> 'a val remove: 'a t -> key -> 'a t val mem: 'a t -> key -> bool val iter: (key -> 'a -> unit) -> 'a t -> unit val map: ('a -> 'b) -> 'a t -> 'b t val fold: (key -> 'a -> 'b -> 'b) -> 'a t -> 'b -> 'b end For module Set, this is the current signature: module type OrderedType = sig type t val compare: t -> t -> int end module type S = sig type elt type t val empty: t val is_empty: t -> bool val mem: elt -> t -> bool val add: elt -> t -> t val singleton: elt -> t val remove: elt -> t -> t val union: t -> t -> t val inter: t -> t -> t val diff: t -> t -> t val compare: t -> t -> int val equal: t -> t -> bool val subset: t -> t -> bool val iter: (elt -> unit) -> t -> unit val fold: (elt -> 'a -> 'a) -> t -> 'a -> 'a val cardinal: t -> int val elements: t -> elt list val min_elt: t -> elt val max_elt: t -> elt val choose: t -> elt end module Make(Ord: OrderedType): (S with type elt = Ord.t) Id rather switch S to: module type S = sig type elt type t val empty: t val is_empty: t -> bool val mem: t -> elt -> bool val add: t -> elt -> t val singleton: elt -> t val remove: t -> elt -> t val union: t -> t -> t val inter: t -> t -> t val diff: t -> t -> t val compare: t -> t -> int val equal: t -> t -> bool val subset: t -> t -> bool val iter: (elt -> unit) -> t -> unit val fold: (elt -> 'a -> 'a) -> t -> 'a -> 'a val cardinal: t -> int val elements: t -> elt list val min_elt: t -> elt val max_elt: t -> elt val choose: t -> elt end Module List has some of the same functions (mem,remove), so my suggestion is: switch: val mem : 'a -> 'a list -> bool val memq : 'a -> 'a list -> bool to: val mem : 'a list -> 'a -> bool val memq : 'a list -> 'a -> bool switch: val assoc : 'a -> ('a * 'b) list -> 'b val assq : 'a -> ('a * 'b) list -> 'b val mem_assoc : 'a -> ('a * 'b) list -> bool val mem_assq : 'a -> ('a * 'b) list -> bool val remove_assoc : 'a -> ('a * 'b) list -> ('a * 'b) list val remove_assq : 'a -> ('a * 'b) list -> ('a * 'b) list to: val assoc : ('a * 'b) list -> 'a -> 'b val assq : ('a * 'b) list -> 'a-> 'b val mem_assoc : ('a * 'b) list -> 'a -> bool val mem_assq : ('a * 'b) list -> 'a-> bool val remove_assoc : ('a * 'b) list -> 'a -> ('a * 'b) list val remove_assq : ('a * 'b) list -> 'a -> ('a * 'b) list The more important changes are the first minor 3, the rest are optional. What do you think? Ohad. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Stdlib regularity 1999-10-06 13:25 Ohad Rodeh @ 1999-10-06 16:18 ` Markus Mottl 1999-10-08 14:06 ` Matías Giovannini 1999-10-08 14:10 ` skaller 1999-10-06 18:50 ` John Prevost ` (2 subsequent siblings) 3 siblings, 2 replies; 22+ messages in thread From: Markus Mottl @ 1999-10-06 16:18 UTC (permalink / raw) To: Ohad Rodeh; +Cc: OCAML > I have used OCaml extensively in the past few years, and I've had > some misgivings about the CAML standard library argument ordering. It > is a little bit confusing and not standard. For example: Although the standard library is quite ok, there are some (minor) inconsistencies. What concerns my wishes for it, I'd love to see more features (= functions or even modules). At the moment I am not always linking against the standard library, but I use own modules, which I have extended a bit, because I need some important features all of the time (e.g. why is there no "partition"-function in the set-module?). What do you think about this proposal: why not put a version of the standard library on the CVS-server of INRIA, where volunteers can contribute extensions, replacements, new modules, etc.? >From time to time, the maintainers of OCaml might want to take a look at the diffs to the original library and merge some (all?) of the goodies into the main branch. I can imagine that you have a lot of patches which are just waiting to be uploaded... Best regards, Markus Mottl -- Markus Mottl, mottl@miss.wu-wien.ac.at, http://miss.wu-wien.ac.at/~mottl ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Stdlib regularity 1999-10-06 16:18 ` Markus Mottl @ 1999-10-08 14:06 ` Matías Giovannini 1999-10-10 20:09 ` Pierre Weis 1999-10-08 14:10 ` skaller 1 sibling, 1 reply; 22+ messages in thread From: Matías Giovannini @ 1999-10-08 14:06 UTC (permalink / raw) To: OCAML Markus Mottl wrote: > > > I have used OCaml extensively in the past few years, and I've had > > some misgivings about the CAML standard library argument ordering. It > > is a little bit confusing and not standard. For example: > > Although the standard library is quite ok, there are some (minor) > inconsistencies. What concerns my wishes for it, I'd love to see more > features (= functions or even modules). Yes! Yes! I always begin my Caml code by writing iota, and I wish it were included in the standard library. It's silly simple, and imprescindible. let iota n = let rec aux l n = if n > 0 then aux (n::l) (n-1) else l in aux [] n And then a "functional for" loop looks like List.map (fun i -> ...) (iota n) (Incidentaly, the name "iota" comes from APL, and stands for the "Initial natural Interval".) -- I got your message. I couldn't read it. It was a cryptogram. -- Laurie Anderson ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Stdlib regularity 1999-10-08 14:06 ` Matías Giovannini @ 1999-10-10 20:09 ` Pierre Weis 1999-10-10 20:12 ` Matías Giovannini 0 siblings, 1 reply; 22+ messages in thread From: Pierre Weis @ 1999-10-10 20:09 UTC (permalink / raw) To: matias; +Cc: caml-list > Yes! Yes! I always begin my Caml code by writing iota, and I wish it > were included in the standard library. It's silly simple, and imprescindible. > > let iota n = > let rec aux l n = > if n > 0 then aux (n::l) (n-1) else l > in aux [] n We may reuse this ``prelude'' code that slightly generalizes iota (named range in this version of the standard library): (*\ \subsection{Lists of consecutive integers} \index{Lists of consecutive integers} \begin{caml_primitive} interval range \end{caml_primitive} \begin{itemize} \item \verb"interval n m" returns the list of all integers in increasing order, from \verb"n" to \verb"m". \item \verb"range n" gives the first n integers. \end{itemize} \*) let interval n m = let rec loop l m = if n > m then l else loop (m :: l) (pred m) in loop [] m;; let range = interval 1;; Pierre Weis INRIA, Projet Cristal, Pierre.Weis@inria.fr, http://cristal.inria.fr/~weis/ ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Stdlib regularity 1999-10-10 20:09 ` Pierre Weis @ 1999-10-10 20:12 ` Matías Giovannini 0 siblings, 0 replies; 22+ messages in thread From: Matías Giovannini @ 1999-10-10 20:12 UTC (permalink / raw) Cc: caml-list Pierre Weis wrote: > > > Yes! Yes! I always begin my Caml code by writing iota, and I wish it > > were included in the standard library. It's silly simple, and imprescindible. > We may reuse this ``prelude'' code that slightly generalizes iota (named > range in this version of the standard library): There is no way to pre-load a prelude file in the interpreter without relinking a custom runtime, is it? -- I got your message. I couldn't read it. It was a cryptogram. -- Laurie Anderson ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Stdlib regularity 1999-10-06 16:18 ` Markus Mottl 1999-10-08 14:06 ` Matías Giovannini @ 1999-10-08 14:10 ` skaller 1999-10-08 19:21 ` Markus Mottl 1999-10-09 21:14 ` Dave Mason 1 sibling, 2 replies; 22+ messages in thread From: skaller @ 1999-10-08 14:10 UTC (permalink / raw) To: Markus Mottl; +Cc: Ohad Rodeh, OCAML Markus Mottl wrote: > What do you think about this proposal: why not put a version of the > standard library on the CVS-server of INRIA, where volunteers can > contribute extensions, replacements, new modules, etc.? I think this is a good idea, but I think that implementing extensions and new modules is relatively easy (in most cases) but deciding on the best interfaces is not. Even the best library designer cannot make decision without user feedback. As a member of two ISO Working Groups, my experience is that there is something to be said for proposing changes in a slightly formal manner, followed by discussion of the proposal. There is another advantage of this approach, which is that the 'offical' ocaml developers can indicate tentative support for some changes, allowing users to try them out with modules with the same interface but potentially less efficient implementations, to gain experience with these interfaces, and to write code that will work well with the next official release. -- John Skaller, mailto:skaller@maxtal.com.au 1/10 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller downloads: http://www.triode.net.au/~skaller ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Stdlib regularity 1999-10-08 14:10 ` skaller @ 1999-10-08 19:21 ` Markus Mottl 1999-10-09 21:14 ` Dave Mason 1 sibling, 0 replies; 22+ messages in thread From: Markus Mottl @ 1999-10-08 19:21 UTC (permalink / raw) To: skaller; +Cc: OCAML > I think this is a good idea, but I think that implementing > extensions and new modules is relatively easy (in most cases) > but deciding on the best interfaces is not. Deciding on interfaces is probably the most important (and difficult) problem in library design. Finding suitable (=intuitive) names for functions alone can be a difficult task. No sensible contributor would check in changes that effect the interface without discussing this with other developers/contributors before. So if there is a good means of communication (e.g. a specialized mailing list) and if people adhere to strict rules concerning check-in, I am quite convinced that this would work well. Regards, Markus Mottl -- Markus Mottl, mottl@miss.wu-wien.ac.at, http://miss.wu-wien.ac.at/~mottl ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Stdlib regularity 1999-10-08 14:10 ` skaller 1999-10-08 19:21 ` Markus Mottl @ 1999-10-09 21:14 ` Dave Mason 1 sibling, 0 replies; 22+ messages in thread From: Dave Mason @ 1999-10-09 21:14 UTC (permalink / raw) To: OCAML >>>>> On Sat, 09 Oct 1999 00:10:48 +1000, skaller <skaller@maxtal.com.au> said: > Markus Mottl wrote: >> What do you think about this proposal: why not put a version of the >> standard library on the CVS-server of INRIA, where volunteers can >> contribute extensions, replacements, new modules, etc.? > I think this is a good idea, but I think that implementing > extensions and new modules is relatively easy (in most cases) but > deciding on the best interfaces is not. > Even the best library designer cannot make decision without user > feedback. As a member of two ISO Working Groups, my experience is > that there is something to be said for proposing changes in a > slightly formal manner, followed by discussion of the proposal. For another take on this, see the SRFI process that has been in use for the last year in the Scheme community. http://srfi.schemers.org/ Things are a bit different in caml since there is really only one implementation organization (although I guess there is ocaml and caml). ../Dave ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Stdlib regularity 1999-10-06 13:25 Ohad Rodeh 1999-10-06 16:18 ` Markus Mottl @ 1999-10-06 18:50 ` John Prevost 1999-10-07 7:33 ` skaller 1999-10-07 9:18 ` Francisco Valverde Albacete 3 siblings, 0 replies; 22+ messages in thread From: John Prevost @ 1999-10-06 18:50 UTC (permalink / raw) To: Ohad Rodeh; +Cc: caml-list Ohad Rodeh <orodeh@cs.cornell.edu> writes: > I have used OCaml extensively in the past few years, and I've had > some misgivings about the CAML standard library argument ordering. It > is a little bit confusing and not standard. For example: > > val Queue.add: 'a -> 'a t -> unit > val Hashtbl.add: ('a,'b) t -> 'a -> 'b -> unit > > My general suggestion is to always make the first argument the <'a t> > type and the second the <'a> type. The only exception to this rule > should be functionals, for example, in Queue: > > val iter: ('a -> unit) -> 'a t -> unit I have a slightly different proposal, but one which is along the same lines: Standard ordering is: val ho_func : ('a -> unit) -> 'a t -> unit val ho_func : ('a -> 'b) -> 'a t -> 'b t val imp_func : 'a t -> 'a -> unit val func : 'a -> 'a t -> 'a t Rationale: The basic rationale is that the most often-repeated item should be at the front to make it easier to curry. Along with this, there's the desire to have a consistent style for ordering arguments. The basic style I propose is that for functions acting on some sort of agregate data type the "importance" of the arguments is as follows: Any higher-order function gets its function arguments first. (Idea: the function argument determines the meaning of the function. Hence it should be closer to the function than the other arguments. Corollary: non-function arguments that determine the meaning of a function should also bind closely.) In an imperative case, the agregate argument should come next, after any behavior-determining arguments, but before any single values. (Idea: in an imperative case, the value "acted upon" is more important than the value used in the action--sort of like direct object vs. indirect object.) i.e. give john pizza ==> john is the "aggregate", pizza is the value used in the action. In the non-imperative case, the "value" place should come before the aggregate case--this is because we're no longer "acting on" something. Now that we're not, the value determines the meaning of the function which is applied to the aggregate, returning a new aggregate. In essence, the function should be thought of in this case as taking an argument and returning a new function, like map does, rather than acting on something after receiving multiple arguments. So, the basic ordering: val func : determiners -> arguments -> result val Queue.add : 'a t -> 'a -> unit + val Hashtbl.add : ('a,'b) t -> 'a -> 'b -> unit + val Queue.iter : ('a -> unit) -> 'a t -> unit val Stream.npeek : int -> 'a t -> 'a list * val S.add : key -> 'a -> 'a t -> 'a t * val S.find : 'a t -> key -> 'a * val S.remove : key -> 'a t -> 'a t * val S.mem : key -> 'a t -> bool * ... The * is where I disagree with Ohad's strategy. The + is where I disagree with O'Caml's. As a note, O'Caml's strategy makes imperative functions order arguments more like pure functions. This may make the default currying order less useful, but is probably better than my strategy. Hence, O'Caml's order is pretty good. :) John. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Stdlib regularity 1999-10-06 13:25 Ohad Rodeh 1999-10-06 16:18 ` Markus Mottl 1999-10-06 18:50 ` John Prevost @ 1999-10-07 7:33 ` skaller 1999-10-07 9:18 ` Francisco Valverde Albacete 3 siblings, 0 replies; 22+ messages in thread From: skaller @ 1999-10-07 7:33 UTC (permalink / raw) To: Ohad Rodeh; +Cc: caml-list Ohad Rodeh wrote: > > Caml list, > I have used OCaml extensively in the past few years, and I've had > some misgivings about the CAML standard library argument ordering. > What do you think? While I tend to agree with the sentiment (and, apart from just remembering the ordering, the most likely Currying isn't possible when the data structure type isn't the first argument), it would create a compatibility problem to just change the existing library, and a mess to add a new set of modules just to 'fix' the argument order to be slightly more intuitive. I guess the original reasoning was more to do with reading order: List.mem element theList reads well as element <is member of> theList This kind of issue (argument order) was much more important in C++, where generics represented by templates _mandate_ consistency (in the naming conventions as well). Even before the STL was finalised, it was being used enough that people argued against changing it to avoid breaking code. It is probably more important to consider how to introduce FISh 2 style polymorphism, in which functions like 'map' and 'iter' can be applied to _any_ data structure. In that case, you gain consistency automatically, since there is _really_ only one such algorithm for all data structures :-) A more limited way to achieve this is to use an STL style library; that is, iterator based algorithms, with the client supplying the iterators. -- John Skaller, mailto:skaller@maxtal.com.au 1/10 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller downloads: http://www.triode.net.au/~skaller ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Stdlib regularity 1999-10-06 13:25 Ohad Rodeh ` (2 preceding siblings ...) 1999-10-07 7:33 ` skaller @ 1999-10-07 9:18 ` Francisco Valverde Albacete 1999-10-08 14:56 ` skaller 3 siblings, 1 reply; 22+ messages in thread From: Francisco Valverde Albacete @ 1999-10-07 9:18 UTC (permalink / raw) To: Ohad Rodeh; +Cc: caml-list Ohad Rodeh wrote: > Caml list, > I have used OCaml extensively in the past few years, and I've had > some misgivings about the CAML standard library argument ordering. It > is a little bit confusing and not standard. For example: Greetings, everybody, I have my opinion about argument order too! However, I have also read the proposals by other people and I guess all of us have their bit of reason. C'mon, even the implementors of the language and its their own child! Now, the work of the OLabl group is much more daring and advanced than any of the proposals *we* can make: It allows free order of parameters (with certain restrictions). It involves also named parameters, missing arguments and default values (which can deal more elegantly with obnoxious default comparison functions for aggregate types like sets and priority queues). It can also accept OCaml-like function types. Take a look at http://pauillac.inria.fr/olabl Alas, for one of particular "features" above mentioned (I don't remember which) there is a time penalty in runtime, although minor if I remember well (Any explanations from J.Garrigue or J.P. Furuse)? But we never got into OCaml strictly for time-efficiency reasons, did we? The most important consideration is that the OLabl people are not *the OCaml people* and their work remains different (if faithful to releases of the original language. I guess both teams are closely related?). And that has always been my main deterrent for not adhering to OLabl: I use all its tools *except* this particular feature with the types of functions, for fear I'll be left out of the main OCaml tool development. Any gues what comes next? *YES* can I ask the OCaml implemtors if there are any plans to consider implementing OLabl's typing discipline/suggestions as an alternative to OCaml's more rigid one while maintaining the present scheme for compatibility's sake? Thanks for your attention! Francisco Valverde Universidad Carlos III de Madrid Resume' en francais: Pourquoi ne profite-t-on pas des advantages de la discipline de typage de OLabl dans OCaml? Merci. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Stdlib regularity 1999-10-07 9:18 ` Francisco Valverde Albacete @ 1999-10-08 14:56 ` skaller 1999-10-09 22:26 ` Francois Rouaix 0 siblings, 1 reply; 22+ messages in thread From: skaller @ 1999-10-08 14:56 UTC (permalink / raw) To: Francisco Valverde Albacete; +Cc: Ohad Rodeh, caml-list Francisco Valverde Albacete wrote: > But we never got into > OCaml strictly for time-efficiency reasons, did we? In my case, no, but it is still vital, otherwise I'd use Haskell instead :-) In particular, because procedural algorithms can be implemented in ocaml just like in C++, the same complexity _should_ be obtainable, and work on optimisation plus standard library support should allow 'close' to unit performance factors. There are several cases where the core language prevents this, because it lacks functionality available in C++: the ability to create uninitialised values, and the ability to destroy them are two that I've become aware of trying to build a variable length array module. Both these features seem mandatory. Both are unsafe. A reasonable compromise might be: do not implement the features in the standard bytecode interpreter do not allow these features in the compiler unless a special switch is specified Unfortunately, I do not think it is 'enough' to provide a variable length array in the standard library, because that is only one data structure which cannot be implemented efficiently or cleanly without these features. There is another more serious problem: ocaml doesn't handle recursive types well. I'm not sure I fully understand this. When all values are boxed, all recursion of algebraic types is sound. [Proof: it is sound in C, where all non datum values are represented by pointers; note that _initialising_ the structure may not be possible in ocaml unless uninitialised values are permitted] It is not clear to me if the result extends to modules. However, the lack of recursion across module boundaries is also pain. In trying to implement a variable length array, I found that I needed to work around the inability to create an uninitialised array by using a functor whose argument supplied the array value type and special value of that type to initialise the array with. Unfortunately, the type of the instantiated functor's aggregate component was a variant of the type over which it needed to be instantiated. So this solution cannot work. To exhibit the problem more plainly, it is something like: type t' = X | Y of G.t where module G = V.Make(sig type t = t'; val default = X) [In the actual code, the type of a python expression, 'expr', includes a case Initial suitable for initialising an array, and a case 'V of expr varray', where varray is a variable length array of expressions, this works as is, but I cannot make varray from a functor that needs t = expr and default = Initial; even if the recursion could work, there is no syntax for it] -- John Skaller, mailto:skaller@maxtal.com.au 1/10 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller downloads: http://www.triode.net.au/~skaller ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Stdlib regularity 1999-10-08 14:56 ` skaller @ 1999-10-09 22:26 ` Francois Rouaix 1999-10-10 5:38 ` skaller 0 siblings, 1 reply; 22+ messages in thread From: Francois Rouaix @ 1999-10-09 22:26 UTC (permalink / raw) To: skaller; +Cc: caml-list <skaller@maxtal.com.au> writes: skaller> There are several cases where the core language prevents skaller> this, because it lacks functionality available in C++: the skaller> ability to create uninitialised values, and the ability to skaller> destroy them are two that I've become aware of trying to build skaller> a variable length array module. Uninitialized values are easily implemented with the 'a option type. Of course, the code is then ugly, because you have to match your values everywhere to None | Some x. In C/C++ terms, this forces you to check for NULL pointers systematically, which is a Really Good Thing. Adding uninitialised values is a major source of bugs, and it's kind of natural to pay the price for it in the readability of the source, if you want your code to be robust. skaller> [About a functor for an extensible array type, and the problem skaller> of a dummy value] Builtin arrays require you to provide a initialization value, even for zero-length array. Why don't you carry the same requirement to your extensible arrays, and simply use a polymorphic type: type 'a earray = { mutable current : 'a array; mutable used : int; } let create n i = { current = Array.create n i; used = n } And then, if you want to have the equivalent of NULL pointers, use None, and option types everywhere. It seems to me that you are trying to force the language to do something it has been purposely designed against. I'm not sure you can win this fight. --f ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Stdlib regularity 1999-10-09 22:26 ` Francois Rouaix @ 1999-10-10 5:38 ` skaller 1999-10-10 20:44 ` William Chesters 0 siblings, 1 reply; 22+ messages in thread From: skaller @ 1999-10-10 5:38 UTC (permalink / raw) To: Francois Rouaix; +Cc: caml-list Francois Rouaix wrote: > > <skaller@maxtal.com.au> writes: > > skaller> There are several cases where the core language prevents > skaller> this, because it lacks functionality available in C++: the > skaller> ability to create uninitialised values, and the ability to > skaller> destroy them are two that I've become aware of trying to build > skaller> a variable length array module. > > Uninitialized values are easily implemented with the 'a option type. > Of course, the code is then ugly, because you have to match your > values everywhere to None | Some x. In C/C++ terms, this forces you > to check for NULL pointers systematically, which is a Really Good Thing. There is a difference: Some/None must be matched EVERY use. Null pointers only need to be checked where they can occur. In the case of an array of variable length, with extra uninitialised slots, the Some/None typing of each element is extra overhead that isn't required if the accessing codes are correct, it is only necessary to check if the index is in range. > Adding uninitialised values is a major source of bugs, and it's kind > of natural to pay the price for it in the readability of the source, > if you want your code to be robust. No. This isn't necesarily the case. There are other solutions. In C++, the philosophy is to provide as much protection as possible, without costing at run time, and with protection that does cost at run time being optional. For example, the designer of a class can determine whether or not use of a variable of the class requires initialisation, whether a default value makes sense, etc. [in this case, the client of the class cannot override the designers decisions]. To put this another way: static typing, and other safety guarrantees, are intrinsically limited in what can be expressed: it cannot prevent all run time errors, and it may _exclude_ correct cases as well. So it is always a compromise. It makes sense to improve these systems to be more expressive, and provide even better guarrantees, but there will (almost always) be a need to escape the system when the programmer knows better. In C++, such escapes are provided, for example with new style casts, but in a way which discourages use, rather than outright prevention. In ocaml, Obj.magic does some of the same kind of things, I believe, and there is always the ability to write C, or even patch the compiler. In general, I really think it is necessary to provide a 'low level unsafe' interface in a language, which at least will not be used by accident, and where deliberate uses are at least easy to find. These low level features are not needed where reasonable and complete safe solutions are available: for example, it has been proved that 'goto' is not required in a language with a few sensible looping constructions, and so one can argue this feature is not required. I think you can argue that a powerful enough functional language does _not_ require uninitialised values (that is, there is always an efficient way to initialise variables), but that isn't the case in a procedural language. Because referential transparency is lost, compilers cannot reason as easily about program structure, and hence cannot optimise away useless dummy initialisations, so efficiency is lost. > Builtin arrays require you to provide a initialization value, even for > zero-length array. Why don't you carry the same requirement to your > extensible arrays, and simply use a polymorphic type: > > type 'a earray = { > mutable current : 'a array; > mutable used : int; > } > > let create n i = { current = Array.create n i; used = n } Sure, but that only works for the 'create' method. In other cases, like, concatenation, the solution requires testing if the two arrays to be concatenated are zero length, if so creating a zero length array using [| |], otherwise finding an object from one of the two arrays that contains one, to initialise the result array. It is always possible to do this, but it is messy, and it _requires_ that the physical length of an array used to hold no elements be zero in cases when there is no available 'dummy' value to fill the unused slots. > And then, if you want to have the equivalent of NULL pointers, use None, > and option types everywhere. Ocaml arrays are already slower than the C ones Python uses, and I am trying to match Python performance as closely as possible. > It seems to me that you are trying to force the language to do something > it has been purposely designed against. I'm not sure you can win this fight. I am not trying to do this per se, since I believe, more or less, that the design principles are good. Rather, I would seek a 'proper' solution, in terms of these principles, believing one should exist, since the principles surely do not deliberately try to make inefficient code. This is why I proposed a categorical 'Initial' value, since this should fit well into the category theoretic type system framework. By having a special initial value, which can be used to initialise anything, and testing for it in the 'safe' versions of the generated code (and eliding the test in the unsafe ones), there may be a well typed solution, and a possibility the compiler can reason about the code enough to optimise even the safe code. I happen to 'know' that this is the state of the art in case of array bounds checking: in Modula code, apparently, 70% of the mandatory array bounds checks can be elided by adding suitable reasoning algorithms to the code. I do not make a numerical claim about the case of uninitialised values, although I note that some C compilers including gcc can issue warnings in some cases when it appears an uninitialised value is used. I think what I am saying, is that this kind of solution probably _is_ within the spirit of ocaml. My particular suggestion may be no good. Perhaps there is a better one? For example, a standard variable length array module would solve the problem, but only in that special case. Is there a more general, if not complete, solution, that is not as risky as the one I propose? -- John Skaller, mailto:skaller@maxtal.com.au 1/10 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller downloads: http://www.triode.net.au/~skaller ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Stdlib regularity 1999-10-10 5:38 ` skaller @ 1999-10-10 20:44 ` William Chesters 1999-10-10 21:43 ` Hongwei Xi 1999-10-11 0:36 ` skaller 0 siblings, 2 replies; 22+ messages in thread From: William Chesters @ 1999-10-10 20:44 UTC (permalink / raw) To: caml-list skaller writes: > No. This isn't necesarily the case. There are other > solutions. In C++, the philosophy is to provide as much protection > as possible, without costing at run time, and with protection > that does cost at run time being optional. Are you sure that initialising arrays etc. carries enough cost to be worth avoiding? After all, one of the two problems, namely the requirement to keep legal values in slots at all times, is quite easy to work around when you have to---my basic Vector is about 100 lines, generously spaced---while the other, performance, worry seems a priori likely to be misplaced, because if you are constructing an array then your time complexity is presumably at least k×n for some nontrivial k, so that the extra few instructions × n are unlikely to make a big difference to the overall program, however annoying they look "in the small". ocaml already goes some way beyond what C++ considers acceptible inefficiency. That's fine for a vast number of applications on modern desktop hardware. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Stdlib regularity 1999-10-10 20:44 ` William Chesters @ 1999-10-10 21:43 ` Hongwei Xi 1999-10-11 0:36 ` skaller 1 sibling, 0 replies; 22+ messages in thread From: Hongwei Xi @ 1999-10-10 21:43 UTC (permalink / raw) To: William Chesters; +Cc: caml-list Yes, I agree with the argument. It is not clear to me that there is a significant amount of efficiency lost on array initialization, which is always a one-time thing. Is there any convincing data to support otherwise? However, there is one case which could result in some significant loss of efficency (I learned this from Robert Harper). The case is where you represent a sparse array using an (ordinary) array representation (for quick access). This means O(n^2) time is to be spent on initialization, which may contribute to a significant portion of the entire running time of some program. Cheers, --Hongwei \~~~~/ \\ // \\ // @ Mail: hwxi@ececs.uc.edu C-o^o, ))__|| \\__//_ // \\ Url: http://www.ececs.uc.edu/~hwxi ( ^ ) ))__|| \--/-\\ \\ Tel: +1 513 871 4947 (home) / \V\ )) || // \\ \\ Tel: +1 513 556 4762 (office) ------ // || o // \\ \\//Fax: +1 513 556 7326 (department) Rhodes Hall 811-D Department of ECE & CS University of Cincinnati P. O. Box 210030 Cincinnati, OH 45221-0030 On Sun, 10 Oct 1999, William Chesters wrote: > skaller writes: > > No. This isn't necesarily the case. There are other > > solutions. In C++, the philosophy is to provide as much protection > > as possible, without costing at run time, and with protection > > that does cost at run time being optional. > > Are you sure that initialising arrays etc. carries enough cost to be > worth avoiding? After all, one of the two problems, namely the > requirement to keep legal values in slots at all times, is quite easy > to work around when you have to---my basic Vector is about 100 lines, > generously spaced---while the other, performance, worry seems a priori > likely to be misplaced, because if you are constructing an array then > your time complexity is presumably at least k×n for some nontrivial k, > so that the extra few instructions × n are unlikely to make a big > difference to the overall program, however annoying they look "in the > small". > > ocaml already goes some way beyond what C++ considers acceptible > inefficiency. That's fine for a vast number of applications on modern > desktop hardware. > ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Stdlib regularity 1999-10-10 20:44 ` William Chesters 1999-10-10 21:43 ` Hongwei Xi @ 1999-10-11 0:36 ` skaller 1999-10-12 7:20 ` David Mentr{'e} 1 sibling, 1 reply; 22+ messages in thread From: skaller @ 1999-10-11 0:36 UTC (permalink / raw) To: William Chesters; +Cc: caml-list William Chesters wrote: > > skaller writes: > > No. This isn't necesarily the case. There are other > > solutions. In C++, the philosophy is to provide as much protection > > as possible, without costing at run time, and with protection > > that does cost at run time being optional. > > Are you sure that initialising arrays etc. carries enough cost to be > worth avoiding? No, I'm not. > After all, one of the two problems, namely the > requirement to keep legal values in slots at all times, is quite easy > to work around when you have to---my basic Vector is about 100 lines, > generously spaced---while the other, performance, worry seems a priori > likely to be misplaced, because if you are constructing an array then > your time complexity is presumably at least k×n for some nontrivial k, > so that the extra few instructions × n are unlikely to make a big > difference to the overall program, however annoying they look "in the > small". > > ocaml already goes some way beyond what C++ considers acceptible > inefficiency. That's fine for a vast number of applications on modern > desktop hardware. Unfortunately, when writing an interpreter such as one for Python, reasonable efficieny plus some other advantages seem neccessary to get any users at all. JPython (Java implementation), for example, is 3 times slower than CPython (C implementation), and that was enough for me to discount it. In fact, I am writing a python interpreter because CPython is vastly (several THOUSAND times) too slow for what I need: extremely fast string operations are required to support Interscript, my literate programming tool. Example: Interscript includes documentation for ISO10646 (unicode) characters, the complete build of interscript itself, which is written in interscript, takes several HOURS on my 120Mhz pentium: the generated documentation is around 5Meg, mainly consisting of the character code tables. This is mainly because I have to do things like convert a string or plain text to HTML, which requires replacing characters '<' with '<'; that is, scan each individual character .. in an interpretive loop. It may seem strange to believe I can actually achieve this performance in ocaml! But I do! Because of the 'higher level algorithms are easier to write in ocaml than C' factor, optimisations using techniques such as type inference, special pattern recognition, etc, and generation of fast C or assembler, are possible. FYI the design is to use the interpreter only to load the modules required for a main program, then do whole program analysis to generate a single executable. But because Python also has dynamic features like 'exec string' it is necessary to make the interpreter available at run time, even in the compiled code. So it needs to be reasonably efficient. -- John Skaller, mailto:skaller@maxtal.com.au 1/10 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller downloads: http://www.triode.net.au/~skaller ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Stdlib regularity 1999-10-11 0:36 ` skaller @ 1999-10-12 7:20 ` David Mentr{'e} 0 siblings, 0 replies; 22+ messages in thread From: David Mentr{'e} @ 1999-10-12 7:20 UTC (permalink / raw) To: skaller; +Cc: William Chesters, caml-list skaller <skaller@maxtal.com.au> writes: > This is mainly because I have to do things like convert > a string or plain text to HTML, which requires replacing > characters '<' with '<'; that is, scan each individual > character .. in an interpretive loop. It could be choking for you, especially in a Caml mailing-list, but have you: 1. consider using regular expressions? They are typically made for the kind of thing you are trying to do. And regex engines have optimization. 2. consider using the Perl language? Is far from perfect, not very clean (in the Caml way at least (in other way either ;)) but has a very powerful bultin regex engine. Even without Perl, OCaml has a regex library I think. Any way, I think you should have a look at _Mastering Regular Expressions_ O'Reilly book. Best regards, d. -- David.Mentre@irisa.fr -- http://www.irisa.fr/prive/dmentre/ Opinions expressed here are only mine. ^ permalink raw reply [flat|nested] 22+ messages in thread
end of thread, other threads:[~1999-10-14 12:56 UTC | newest] Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 1999-10-12 16:21 Stdlib regularity Damien Doligez 1999-10-13 12:18 ` Matías Giovannini -- strict thread matches above, loose matches on Subject: below -- 1999-10-09 16:58 William Chesters 1999-10-10 0:11 ` Matías Giovannini 1999-10-06 13:25 Ohad Rodeh 1999-10-06 16:18 ` Markus Mottl 1999-10-08 14:06 ` Matías Giovannini 1999-10-10 20:09 ` Pierre Weis 1999-10-10 20:12 ` Matías Giovannini 1999-10-08 14:10 ` skaller 1999-10-08 19:21 ` Markus Mottl 1999-10-09 21:14 ` Dave Mason 1999-10-06 18:50 ` John Prevost 1999-10-07 7:33 ` skaller 1999-10-07 9:18 ` Francisco Valverde Albacete 1999-10-08 14:56 ` skaller 1999-10-09 22:26 ` Francois Rouaix 1999-10-10 5:38 ` skaller 1999-10-10 20:44 ` William Chesters 1999-10-10 21:43 ` Hongwei Xi 1999-10-11 0:36 ` skaller 1999-10-12 7:20 ` David Mentr{'e}
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox