* [Caml-list] GC and file descriptors
From: Dustin Sallings @ 2003-11-13 0:50 UTC
To: Caml Mailing List

I keep ending up with somewhat awkward structures to close files I've opened. Does the GC take care of any of this type of thing?

I.e., could the following:

    let tput x =
        let r = Unix.open_process_in ("tput " ^ x)
        and buf = String.create 8 in
        let len = input r buf 0 8 in
        close_in r;
        String.sub buf 0 len
    ;;

be written like this:

    let tput x =
        let buf = String.create 8 in
        String.sub buf 0 (input (Unix.open_process_in ("tput " ^ x)) buf 0 8)
    ;;

safely?

-- Dustin Sallings

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
* Re: [Caml-list] GC and file descriptors
From: David Fox @ 2003-11-13 1:18 UTC
To: Dustin Sallings; +Cc: Caml Mailing List

I might start with this

    let for_process_output f cmd =
      let r = Unix.open_process_in cmd in
      let result = f r in
      Unix.close_process_in r;
      result

then

    let tput x = for_process_output (fun r -> input r buf 0 8) ("tput " ^ x)

at which point you start thinking about exception handling...

Dustin Sallings wrote:
>
> I keep ending up with somewhat awkward structures to close files
> I've opened. Does the GC take care of any of this type of thing?
>
> I.e., could the following:
>
> let tput x =
>     let r = Unix.open_process_in ("tput " ^ x)
>     and buf = String.create 8 in
>     let len = input r buf 0 8 in
>     close_in r;
>     String.sub buf 0 len
> ;;
>
> be written like this:
>
> let tput x =
>     let buf = String.create 8 in
>     String.sub buf 0 (input (Unix.open_process_in ("tput " ^ x)) buf 0 8)
> ;;
>
> safely?
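Picking up the "exception handling" remark, here is a minimal sketch in current OCaml of what an exception-safe variant might look like. It is not code from the thread: the names Subprocess_error and read_prefix are made up for illustration, Bytes stands in for the mutable String API of 2003, and the unix library is assumed to be linked. Note that input performs a single read, so it may return fewer bytes than the command eventually writes.

```ocaml
(* Requires the unix library. *)

exception Subprocess_error of string

(* Run [cmd], hand its stdout channel to [f], and always reap the child,
   even if [f] raises. *)
let for_process_output (f : in_channel -> 'a) (cmd : string) : 'a =
  let ic = Unix.open_process_in cmd in
  let result =
    try f ic
    with e ->
      (* Best-effort cleanup; the exception raised by [f] takes priority. *)
      (try ignore (Unix.close_process_in ic) with _ -> ());
      raise e
  in
  match Unix.close_process_in ic with
  | Unix.WEXITED 0 -> result
  | Unix.WEXITED n ->
    raise (Subprocess_error (Printf.sprintf "%s: exit status %d" cmd n))
  | Unix.WSIGNALED _ | Unix.WSTOPPED _ ->
    raise (Subprocess_error (cmd ^ ": killed or stopped by a signal"))

(* Read at most [max] bytes of the command's output in one read. *)
let read_prefix max ic =
  let buf = Bytes.create max in
  let len = input ic buf 0 max in
  Bytes.sub_string buf 0 len

let tput x = for_process_output (read_prefix 8) ("tput " ^ x)
```

The descriptor and the child process are released on every path, and a non-zero exit status surfaces as an exception at a known point in the program rather than at some later collection.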
* Re: [Caml-list] GC and file descriptors
From: Dustin Sallings @ 2003-11-13 4:09 UTC
To: David Fox; +Cc: Caml Mailing List

> Dustin Sallings wrote:
>
>> I keep ending up with somewhat awkward structures to close files
>> I've opened. Does the GC take care of any of this type of thing?

On Nov 12, 2003, at 17:18, David Fox wrote:

> I might start with this
>
> let for_process_output f cmd =
>   let r = Unix.open_process_in cmd in
>   let result = f r in
>   Unix.close_process_in r;
>   result
>
> then
>
> let tput x = for_process_output (fun r -> input r buf 0 8) ("tput " ^ x)
>
> at which point you start thinking about exception handling...

Well, I hadn't seen close_process_in, but using it made things a bit worse. My goal is to capture the output of a command with each of two args. I currently have this:

    exception SubprocessError of string;;

    (* Get a string from the tput command. *)
    let tput x =
        let r = Unix.open_process_in ("tput " ^ x)
        and buf = String.create 8 in
        let len = input r buf 0 8 in
        let status = Unix.close_process_in r in
        begin
            match status with
              Unix.WEXITED(st) ->
                if (st <> 0) then
                    raise (SubprocessError "Non-zero exit status");
            | _ -> raise (SubprocessError "Unknown error.");
        end;
        String.sub buf 0 len
    ;;

    (* The two characters we need. *)
    let smso = tput "smso";;
    let rmso = tput "rmso";;

This is to give me the equivalent of the following bourne shell:

    smso=`tput smso`
    rmso=`tput rmso`

Garbage collecting the subprocesses and file descriptors would make this a bit more straightforward.

-- Dustin Sallings <dustin@spy.net>
* Re: [Caml-list] GC and file descriptors
From: Damien Doligez @ 2003-11-14 13:42 UTC
To: Caml Mailing List

> Garbage collecting the subprocesses and file descriptors would make
> this a bit more straightforward.

The problem with this approach is that the "close" and "wait" system calls have side effects. I don't like the idea of a GC that has side effects other than the memory size of the program (and finalisation functions, of course).

-- Damien
* Re: [Caml-list] GC and file descriptors
From: Christophe Raffalli @ 2003-11-14 14:57 UTC
To: Damien Doligez; +Cc: Caml Mailing List

Damien Doligez wrote:
>> Garbage collecting the subprocesses and file descriptors would
>> make this a bit more straightforward.
>
> The problem with this approach is that the "close" and "wait"
> system calls have side effects. I don't like the idea of a GC
> that has side effects other than the memory size of the program
> (and finalisation functions, of course).

Why? Could you elaborate?

-- Christophe Raffalli
Université de Savoie
* Re: [Caml-list] GC and file descriptors
From: Dmitry Bely @ 2003-11-14 20:24 UTC
To: caml-list

Christophe Raffalli <Christophe.Raffalli@univ-savoie.fr> writes:

>>> Garbage collecting the subprocesses and file descriptors would
>>> make this a bit more straightforward.
>> The problem with this approach is that the "close" and "wait"
>> system calls have side effects. I don't like the idea of a GC
>> that has side effects other than the memory size of the program
>> (and finalisation functions, of course).
>
> Why ? Could you elaborate ?

close() can fail (generate an exception). How are you going to handle it in the garbage collector? By contrast, memory deallocation always succeeds.

- Dmitry Bely
* Re: [Caml-list] GC and file descriptors
From: Eric Dahlman @ 2003-11-14 20:54 UTC
To: caml-list

Dmitry Bely wrote:
> Christophe Raffalli <Christophe.Raffalli@univ-savoie.fr> writes:
>
>>>> Garbage collecting the subprocesses and file descriptors would
>>>> make this a bit more straightforward.
>>>
>>> The problem with this approach is that the "close" and "wait"
>>> system calls have side effects. I don't like the idea of a GC
>>> that has side effects other than the memory size of the program
>>> (and finalisation functions, of course).
>>
>> Why ? Could you elaborate ?
>
> close() can fail (generate an exception). How are you going to handle it in
> the garbage collector? On the contrary, the memory deallocation always
> succeeds.

Another problem with doing this is that you are tying the rate at which file descriptors are closed to the GC profile of your program. If you only have tens or hundreds of file descriptors available, it would be very easy to use them all up in a tight loop which did not allocate much memory.

It would be a bummer to have your program start failing because you used a more memory-efficient algorithm for some calculation.

-Eric
* Re: [Caml-list] GC and file descriptors
From: Brian Hurt @ 2003-11-14 22:21 UTC
To: Eric Dahlman; +Cc: caml-list

On Fri, 14 Nov 2003, Eric Dahlman wrote:

> Another problem with doing this is that you are tying the rate at which
> file descriptors are closed to the GC profile of your program. If you
> only have 10's or 100's of file descriptors available it would be very
> easy to use them all up in a tight loop which did not allocate much memory.
>
> It would be a bummer to have your program start failing because you used
> a more memory efficient algorithm for some calculation.

This is easy enough to work around:

1. Have the file open try a GC if the file open fails.
2. Alternatively, simply require programs that want to sit in tight loops opening descriptors to handle the out-of-descriptors case.

Java does #2, IIRC.

-- 
"Usenet is like a herd of performing elephants with diarrhea -- massive,
difficult to redirect, awe-inspiring, entertaining, and a source of
mind-boggling amounts of excrement when you least expect it."
                                - Gene Spafford
Brian
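The first workaround mentioned here (retry the open after a collection) can be sketched in a few lines. This is only an illustration under stated assumptions: open_in_with_gc_retry is a made-up name, Stdlib's Sys_error does not distinguish "out of descriptors" from any other failure, and the retry only helps if something (for instance a Gc.finalise callback the program registered itself) actually closes descriptors when their owners become unreachable.

```ocaml
(* Try [open_in path]; if that fails for any reason, force a full major
   collection so that pending finalisers get a chance to run, then retry
   once. A second failure propagates as the usual Sys_error. *)
let open_in_with_gc_retry path =
  try open_in path
  with Sys_error _ ->
    Gc.full_major ();
    open_in path
```

A version built on Unix.openfile could match on Unix.EMFILE and Unix.ENFILE specifically, so that a missing file does not trigger a needless full collection.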
* Re: [Caml-list] GC and file descriptors
From: John J Lee @ 2003-11-14 21:36 UTC
To: caml-list

On Fri, 14 Nov 2003, Brian Hurt wrote:

> On Fri, 14 Nov 2003, Eric Dahlman wrote:
[...]
> > file descriptors are closed to the GC profile of your program. If you
> > only have 10's or 100's of file descriptors available it would be very
> > easy to use them all up in a tight loop which did not allocate much memory.
[...]
> This is easy enough to work around- have the file open try a GC if the
> file open fails. Alternatively, you could simply require programs that
[...]

File descriptors can be a globally limited resource, though, can't they? If your program uses them all up, other programs can't get one either.

Also, does O'Caml *always* GC objects? If they're un-GC-ed at time of program exit, will they still be GC-ed, or does O'Caml just leave it to the OS to clean up in that case? If it leaves it to the OS, files could be left open, and I'm not sure it's guaranteed that having files open for writing at program exit won't cause data loss.

John
* Re: [Caml-list] GC and file descriptors
From: Brian Hurt @ 2003-11-14 21:48 UTC
To: Dmitry Bely; +Cc: caml-list

On Fri, 14 Nov 2003, Dmitry Bely wrote:

> close() can fail (generate an exception). How are you going to handle it in
> the garbage collector? On the contrary, the memory deallocation always
> succeeds.

Two choices: either abort the whole program (uncaught exception), or ignore it. Ocaml's life is a little easier, as GC takes place inside the single thread of execution. But I dislike making the language spec require that.

Now, *which* of those two choices is "correct" is a matter for some debate. Myself, I like aborting the whole program.

What I dislike is resource leaks, as plugging the holes manually can often be tricky. If I wanted to hand-manage resource deallocation, I'd be programming in C (well, not quite, but the point is made).

Brian
* Re: [Caml-list] GC and file descriptors
From: Dmitry Bely @ 2003-11-15 1:47 UTC
To: caml-list

Brian Hurt <bhurt@spnz.org> writes:

>> close() can fail (generate an exception). How are you going to handle it in
>> the garbage collector? On the contrary, the memory deallocation always
>> succeeds.
>
> Two choices: either abort the whole program (uncaught exception), or
> ignore it. Ocaml's life is a little easier, as GC takes place inside the
> single thread of execution. But I dislike making the language spec
> require that.
>
> Now, *which* of those two choices is "correct" is a matter for some
> debate. Myself, I like aborting the whole program.

I think neither of them. Imagine that you access a file over the network. Now a network error can either suddenly terminate your program (your users will be very surprised) or go unnoticed, leading to possible loss of data (they will be surprised by that too). Generally speaking, GC is not suitable for resource deallocation; this should be done synchronously (generating and catching exceptions).

> What I dislike is resource leaks- as plugging the holes manually can often
> be tricky. If I wanted to hand-manage resource deallocation, I'd be
> programming in C (well, not quite- but the point is made).

Why not use higher-order functions?

http://caml.inria.fr/archives/200212/msg00132.html

What is the problem with them?

- Dmitry Bely
* Re: [Caml-list] GC and file descriptors
From: Max Kirillov @ 2003-11-15 2:25 UTC
To: caml-list

On Fri, Nov 14, 2003 at 11:24:21PM +0300, Dmitry Bely wrote:
> Christophe Raffalli <Christophe.Raffalli@univ-savoie.fr> writes:
>
>>>> Garbage collecting the subprocesses and file descriptors would
>>>> make this a bit more straightforward.
>>> The problem with this approach is that the "close" and "wait"
>>> system calls have side effects. I don't like the idea of a GC
>>> that has side effects other than the memory size of the program
>>> (and finalisation functions, of course).
>>
>> Why ? Could you elaborate ?
>
> close() can fail (generate an exception). How are you going to handle it in

According to 'man 2 close', it cannot. Nearly the same is true for any resource deallocation. The only exception I can recall is wait, but when you don't need the exit status (and you surely don't, once you leave it to the GC), you can ignore all the errors.

Of course there might be a question about unflushed data. But still, if you don't flush the writing channel before finishing a function, you probably don't need your output to be delivered in any case.

> the garbage collector? On the contrary, the memory deallocation always
> succeeds.
> - Dmitry Bely

-- Max
* Re: [Caml-list] GC and file descriptors
From: Mike Furr @ 2003-11-15 2:49 UTC
To: caml-list

On Fri, 2003-11-14 at 21:25, Max Kirillov wrote:
> According to 'man 2 close', it cannot.

Sure it can: EINTR. And to prove this can actually happen, just ask the squid (web proxy) developers. I seem to recall hearing a talk by one of them where they mentioned that failing to check the return code of close() resulted in an out-of-fd bug which took a _long_ time to track down.

-- Mike Furr <mike.furr@umbc.edu>
* [Caml-list] Bugs from ignoring errors from close (was Re: GC and file..)
From: Tim Freeman @ 2003-11-16 4:09 UTC
To: caml-list

On Fri, 2003-11-14 at 21:25, Max Kirillov wrote:
> According to 'man 2 close', it cannot.

From: Mike Furr <mike.furr@umbc.edu>
>Sure it can: EINTR. And to prove this can actually can happen,
>just ask the squid(web proxy) developers. I seem to recall hearing a
>talk by one of them where they mentioned the failure to check the return
>code of close() resulted in a out-of-fd bug which took a _long_ time to
>track down.

In another context I've solved a bug that was hard to find in part because someone ignored the return status from close. Here's the scenario:

    Thread 1                            Thread 2
    1. int fd1 = open (...)
    2. close(fd1)
                                        3. int fd2 = open (...)
    4. bad code called close (fd1)
       again, ignoring errors.
                                        5. read (fd2, ...) and
                                           occasionally fail!

When the steps happen in this order, fd2 can equal fd1, so the second close also closes fd2, and the read occasionally fails.

Don't ignore errors on any system calls.

-- Tim Freeman tim@fungible.com
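For what it is worth, OCaml's Unix.close reports failure by raising Unix.Unix_error rather than through a return code, so the error has to be caught deliberately to be lost. A minimal sketch of surfacing it as a value follows; close_checked is a made-up name, and the unix library is assumed to be linked.

```ocaml
(* Requires the unix library. *)

(* Close [fd] and report what happened instead of discarding the error. *)
let close_checked (fd : Unix.file_descr) : (unit, string) result =
  try Unix.close fd; Ok ()
  with Unix.Unix_error (err, _, _) -> Error (Unix.error_message err)
```

In a single-threaded program, calling close_checked twice on the same descriptor returns Ok () and then an Error (EBADF on typical Unix systems), which is the signal the "bad code" in the scenario above threw away. In the two-thread interleaving the second close can succeed on the reused descriptor number, so checking errors helps diagnose the bug but does not prevent it.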
* Re: [Caml-list] GC and file descriptors
From: David Brown @ 2003-11-15 2:58 UTC
To: caml-list

On Sat, Nov 15, 2003 at 08:25:21AM +0600, Max Kirillov wrote:

> > close() can fail (generate an exception). How are you going to handle it in
>
> According to 'man 2 close', it cannot.

It can fail with either EINTR or EIO, at least on Linux.

Dave
* Re: [Caml-list] GC and file descriptors
From: Damien Doligez @ 2003-11-17 14:19 UTC
To: Caml Mailing List

On Friday, November 14, 2003, at 03:57 PM, Christophe Raffalli wrote:

> Why ? Could you elaborate ?

This is what I had in mind:

If your file descriptor f is bound to a pipe, then close() will send a SIGPIPE or an end-of-file to the process at the other end of the pipe. This is a visible side-effect.

If f is bound to a socket, then close() will not only send an end-of-file to the other end, but it will also release the port number, changing the outcome of a further bind() call. Another side-effect.

If you call exec(), then the new program will inherit all open file descriptors. More visible side-effects.

More examples (that I hadn't thought of) were given by other people on this list.

For these reasons, close() must be provided to the programmer for explicit calling, because its semantics goes beyond simple deallocation of a resource.

Since close() is provided, if you want the GC to close your descriptors, you can attach finalisation functions to them. In that case, don't forget to call Gc.full_major() before exec().

In my opinion, doing it automatically would only encourage sloppy programming by hiding errors in a non-deterministic way.

-- Damien
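The "attach finalisation functions" suggestion might look roughly like the sketch below. It is an illustration, not code from the thread: the tracked record and the three function names are made up, and close_in_noerr deliberately swallows errors, which is exactly the trade-off debated earlier in this thread. Explicit closing remains the primary path; the finaliser is only a safety net.

```ocaml
(* A channel wrapper that remembers whether it has been closed. *)
type tracked = { ic : in_channel; mutable closed : bool }

(* Idempotent close: safe to call explicitly and again from the finaliser. *)
let close_tracked t =
  if not t.closed then begin
    t.closed <- true;
    close_in_noerr t.ic
  end

(* Open [path] and register [close_tracked] as a fallback finaliser.
   The record is heap-allocated, as Gc.finalise requires. *)
let open_tracked path =
  let t = { ic = open_in path; closed = false } in
  Gc.finalise close_tracked t;
  t

(* Before handing the process over to another program, force a full major
   collection so that finalisers of unreachable wrappers get to run and
   forgotten descriptors are not inherited. *)
let prepare_for_exec () = Gc.full_major ()
```

A caller would still write close_tracked t at the point where the file is logically done with, and call prepare_for_exec () just before any of the Unix.exec* functions.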
* Re: [Caml-list] GC and file descriptors
From: skaller @ 2003-11-17 18:18 UTC
To: Damien Doligez; +Cc: Caml Mailing List

On Tue, 2003-11-18 at 01:19, Damien Doligez wrote:
> On Friday, November 14, 2003, at 03:57 PM, Christophe Raffalli wrote:

> In my opinion, doing it automatically would only encourage sloppy
> programming by hiding errors in a non-deterministic way.

Perhaps. But we *want* to encourage a closely related kind: the lazy programmer. Lazy is good. It's what machines are *for*.

I do agree with you but the distinction is subtle.
* Re: [Caml-list] GC and file descriptors
From: Dustin Sallings @ 2003-11-14 18:35 UTC
To: Damien Doligez; +Cc: Caml Mailing List

On Nov 14, 2003, at 5:42, Damien Doligez wrote:

>> Garbage collecting the subprocesses and file descriptors would make
>> this a bit more straightforward.
>
> The problem with this approach is that the "close" and "wait"
> system calls have side effects. I don't like the idea of a GC
> that has side effects other than the memory size of the program
> (and finalisation functions, of course).

But what is the purpose of a finalization function that doesn't have a side effect? They are defined as ('a -> unit), which implies to me that the only purpose of the thing is to produce a side effect.

close() seems to be conceptually the same finalization as free() to me. It releases a system resource back to the process that would otherwise be unavailable forever.

-- Dustin Sallings
* Re: [Caml-list] GC and file descriptors
From: skaller @ 2003-11-15 14:16 UTC
To: Dustin Sallings; +Cc: Damien Doligez, Caml Mailing List

On Sat, 2003-11-15 at 05:35, Dustin Sallings wrote:

> close() seems to be conceptually the same finalization as free() to
> me. It releases a system resource back to the process that would
> otherwise be unavailable forever.

This isn't the case though. The problem is that system resources *other than memory* are represented in memory. So finalising the memory/executing a destructor causes releasing the resource *other than memory* by way of the memory representation, but memory itself is released directly by the gc: there is no need to delete what a pointer in memory points to in a destructor, and indeed it is unsafe in most cases.

In other words, finalisation functions may be part of the abstraction which a representation of a system resource in memory entails, but collectable memory is not such a resource, even though it has such a representation (a pointer): memory is not abstract but concrete.

Please can someone say this better?

Anyhow, the RAII (Resource Acquisition Is Initialisation) paradigm popular in C++ requires synchronous and orderly destruction, which garbage collectors do not usually provide. Here synchronous means an object is destroyed *immediately* when there are no references to it, and orderly means in order of dependency (usually LIFO, i.e. the reverse of construction order).

It is possible to modify the Ocaml collector to make finalisation orderly, but it is very expensive. It is also possible to synchronise finalisation with explicit calls to the gc, although that isn't quite the same as synchronous destruction.

It's easier, usually, to just explicitly release a resource.

By the way I'm curious: in LablGTK, there will be finalisation functions for releasing GTK resources, since GTK also has them (Yes?)

It isn't clear this will work in general, since the finalisation is neither synchronous nor orderly, but both are required in general in a GUI. Python has this problem with Tkinter, which opens and creates things fine, but cannot handle release correctly. This is because two-way reflection is the only way to make that work (and it is *very* difficult to get right).

It's a real pain when you use foreign representations of GUI objects and the GUI simply isn't strong enough in terms of notifications.
* Re: [Caml-list] GC and file descriptors
From: Ville-Pertti Keinonen @ 2003-11-15 15:56 UTC
To: skaller; +Cc: Dustin Sallings, Damien Doligez, Caml Mailing List

On Sun, Nov 16, 2003 at 01:16:08AM +1100, skaller wrote:

> It is possible to modify the Ocaml collector to make finalisation
> orderly, but it is very expensive. It is also possible

Not just expensive, more like something you definitely don't want to even try. Python tries to have deterministic finalization, but falls back on "normal" garbage collection for containers (with potential cyclic references). Perl doesn't even bother with that; instead it doesn't resolve cyclic references at all.

Typically, the best you can do is explicit finalization, which can be convenient using something similar to Lisp with-open-file constructs, possibly with alternate interfaces and finalization on GC as a fallback for poorly structured programs.

For more complicated cases with multiple references, what you probably want is an explicit liveness count (separate from the GC/memory references). The liveness count can likewise be protected by something similar to unwind-protect blocks.

Considering that OCaml doesn't have capturable continuations, things are relatively simple.
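The "explicit liveness count protected by something like unwind-protect" idea can be sketched as follows. This is one possible reading, not the poster's code: the shared record and all function names are invented for illustration, Fun.protect (OCaml 4.08 or later) plays the role of unwind-protect, and the count is not thread-safe.

```ocaml
(* A resource shared by several owners, released when the last one lets go.
   [live] is an explicit liveness count, independent of GC reachability. *)
type 'a shared = {
  value : 'a;
  release : 'a -> unit;
  mutable live : int;
}

(* The creator holds the first reference. *)
let share ~release value = { value; release; live = 1 }

let retain s =
  if s.live <= 0 then invalid_arg "retain: resource already released";
  s.live <- s.live + 1

(* Give up one reference; the last drop runs [release] synchronously. *)
let drop s =
  if s.live <= 0 then invalid_arg "drop: resource already released";
  s.live <- s.live - 1;
  if s.live = 0 then s.release s.value

(* unwind-protect style helper: hold one reference for the duration of [f],
   and give it back whether [f] returns or raises. *)
let with_retained s f =
  retain s;
  Fun.protect ~finally:(fun () -> drop s) (fun () -> f s.value)
```

For a file this might be used as share ~release:close_in (open_in path): the channel is closed at the exact moment the last owner calls drop, not whenever the collector happens to notice.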
* Re: [Caml-list] GC and file descriptors
From: skaller @ 2003-11-15 17:30 UTC
To: Ville-Pertti Keinonen; +Cc: Dustin Sallings, Damien Doligez, Caml Mailing List

On Sun, 2003-11-16 at 02:56, Ville-Pertti Keinonen wrote:
> On Sun, Nov 16, 2003 at 01:16:08AM +1100, skaller wrote:
>
> > It is possible to modify the Ocaml collector to make finalisation
> > orderly, but it is very expensive. It is also possible
>
> Not just expensive, more like something you definitely don't want to
> even try. Python tries to have deterministic finalization, but
> falls back on "normal" garbage collection for containers

Funny, but I actually *needed* deterministic finalisation. And the reason was .. I was implementing a Python interpreter in Ocaml, and needed to emulate Python's ref counting. But I didn't want to actually do any ref counting :-)

Because of this problem, my own major Python program, interscript, would not run on my interpreter, since it relied on orderly finalisation.

The reason for *that* is that it effectively executed commands written by the client, and the specification provided for opening and writing to files, but not for closing them. Closing files was important and had to be correctly timed: for example, closing the main document actually generated the table of contents (since only at the end are all the contents known).

Anyhow .. there do exist cases where explicit finalisation is difficult. This was one of them (and was one of the main reasons I abandoned the project; lack of support for stackless operation was the other).
* Re: [Caml-list] GC and file descriptors
From: Martin Berger @ 2003-11-15 20:31 UTC
To: caml-list

About finalisation vs GC, you might find the POPL'03 paper

  "Destructors, Finalizers, and Synchronization" by Hans J. Boehm

pertinent. It is available at

  http://www.hpl.hp.com/techreports/2002/HPL-2002-335.html

Martin
* Re: [Caml-list] GC and file descriptors
From: Brian Hurt @ 2003-11-16 19:19 UTC
To: skaller
Cc: Ville-Pertti Keinonen, Dustin Sallings, Damien Doligez, Caml Mailing List

On 16 Nov 2003, skaller wrote:

> On Sun, 2003-11-16 at 02:56, Ville-Pertti Keinonen wrote:
> > On Sun, Nov 16, 2003 at 01:16:08AM +1100, skaller wrote:
> >
> > > It is possible to modify the Ocaml collector to make finalisation
> > > orderly, but it is very expensive. It is also possible
> >
> > Not just expensive, more like something you definitely don't want to
> > even try. Python tries to have deterministic finalization, but
> > falls back on "normal" garbage collection for containers
>
> Funny, but I actually *needed* deterministic finalisation.
> And the reason was .. I was implementing a Python interpreter
> in Ocaml, and needed to emulate Python's ref counting.
> But I didn't want to actually do any ref counting :-)

You can't fake reference counting with mark-and-sweep.

Here's the problem: reference counting is expensive. In a scripting
language this performance hit may not be a problem; it may be outweighed
by other performance limitations. But every time some C++ programmer
bitches about garbage collection being slow, he's almost certainly
talking about reference counting.

I wish the mainstream would crawl out of the sixties era computer science
and get at least into the eighties. The only "mainstream" programming
language I know of with *excusable* garbage collection is Java. Of
course, Java's type system is state of the art- for 1968. Which
convinces everyone that strong typing is bad- in much the same way that
reference counting convinces everyone that garbage collection is slow.
Grr.
>
> Because of this problem, my own major Python program
> interscript would not run on my interpreter, since
> it relied on orderly finalisation.
>
> The reason for *that* is that it effectively executed
> commands written by the client, and the specification
> provided for opening and writing to files, but
> not for closing them. Closing files was important
> and had to be correctly timed: for example, closing
> the main document actually generated the table of
> contents (since only at the end are all the contents
> known).

You were abusing finalization. The fact that it worked (in reference
counting Python) doesn't mean it was a good idea. Accidentally having
dangling references which delay some object's finalization could
introduce unexpected orderings and "bugs". A much better way to do this
would be to have a central command queue. Every time a new command is
created, it adds itself to the queue. Then you just flush the command
queue- executing the commands in the order they were added. This is
done every so often while the program is running and, of course, on
program exit.

>
> Anyhow .. there do exist cases where explicit finalisation
> is difficult.

IIRC, in the general case it's equivalent to solving the halting problem.

On a more general note, I will agree that having the GC free non-memory
resources raises problems. I agree (and strongly believe) that everything
that isn't memory should have some way for the program to free them
without invoking the GC- for example, file descriptors should have a close
routine. And I agree that programs *should* use them instead of relying
on the GC in all but the most intractable cases.

The question is, is it worse to have the GC try to reclaim the resources
than it is to have a guaranteed leak? Consider the case where close
returns EINTR and doesn't close the file descriptor.
Since the descriptor has already leaked- for it to reach this point the
program no longer has any way to reach the descriptor (if the program
did, the object wouldn't be being collected). So what's the difference
in this case between the collector silently ignoring the error, and the
collector not even trying to free the resource?

Actually, an idea I've had is to add Java-style "throws" information-
basically what exceptions a function can throw- to the type information of
a function. With type inference, this shouldn't be the headache in Ocaml
it is in Java. The advantage here is that you could enlist the type
system to guarantee that a destructor doesn't throw any exceptions- i.e.
the code for a destructor should handle all exceptions it generates.

--
"Usenet is like a herd of performing elephants with diarrhea -- massive,
difficult to redirect, awe-inspiring, entertaining, and a source of
mind-boggling amounts of excrement when you least expect it."
                                - Gene Spafford
Brian
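The "explicit close, with the GC as a safety net" position argued above
can be sketched roughly as follows. The handle type and the functions
open_handle and close are hypothetical names; Gc.finalise is the
standard library call for OCaml-coded finalisers. The finaliser handles
its own errors, since nothing is left to catch them when it runs.

```ocaml
(* A hypothetical wrapper: a handle with an explicit close, plus a GC
   finaliser used only as a fallback. *)
type handle = { ic : in_channel; mutable closed : bool }

(* Explicit close: the normal path. Errors are reported to the caller.
   Calling it twice is harmless. *)
let close (h : handle) : unit =
  if not h.closed then begin
    h.closed <- true;
    close_in h.ic
  end

let open_handle (path : string) : handle =
  let h = { ic = open_in path; closed = false } in
  (* Fallback: if the program drops the handle without closing it, the
     finaliser tries once and swallows any error. When (and in which
     order) finalisers run is not specified. *)
  Gc.finalise
    (fun h ->
      if not h.closed then begin
        h.closed <- true;
        close_in_noerr h.ic
      end)
    h;
  h
```

Nothing here makes the timing of the fallback predictable, so it is
unsuitable for work that must happen at a particular point in the
program.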
* Re: [Caml-list] GC and file descriptors
From: skaller @ 2003-11-17 18:15 UTC
To: Brian Hurt; +Cc: Caml Mailing List

On Mon, 2003-11-17 at 06:19, Brian Hurt wrote:
> On 16 Nov 2003, skaller wrote:

> I wish the mainstream would crawl out of the sixties era computer science
> and get at least into the eighties.

Lol, they're up to the sixties already?

> The only "mainstream" programming
> language I know of with *excusable* garbage collection is Java.

I think Ocaml is very close to mainstream now.
[And I think the FP contest and Bagley are responsible,
and the O'Reilly book in English will be the clincher]

Very hard to be the second (and third) fastest programming
language around and not be noticed :-)

> Of
> course, Java's type system is state of the art- for 1968.

Err.. since when is downcasting everything from Object
a type system??

> > Because of this problem, my own major Python program
> > interscript would not run on my interpreter, since
> > it relied on orderly finalisation.

> You were abusing finalization.

Well no. I was *trying* to abuse it but failed :-)

In fact, partly my prompting on this was responsible
for Ocaml coded finalisers (instead of just C ones).

> A much better way to do this would be to
> have a central command queue.

This would have been impossible in interscript.
It works by executing user commands directly
(but in a restricted environment).

> >
> > Anyhow .. there do exist cases where explicit finalisation
> > is difficult.
>
> IIRC, in the general case it's equivalent to solving the halting problem.

If I can make a general comment here: there are things like comparing
functions which are equivalent to the halting problem. Another is
verifying program correctness :-)

But it is irrelevant, because programmers do not intend to write
arbitrary code.
They intend to write comprehensible code, whose
correctness is therefore necessarily easy to prove.

This is also true of finalisation. Necessarily. The general case is
irrelevant, because people don't intend to write arbitrary code.

Even when multiple people work on an Open Source project (just about the
worst environment I can think of .. ) there is at least an intent to
follow a design.

> On a more general note, I will agree that having the GC free non-memory
> resources raises problems. I agree (and strongly believe) that everything
> that isn't memory should have some way for the program to free them
> without invoking the GC- for example, file descriptors should have a close
> routine. And I agree that programs *should* use them instead of relying
> on the GC in all but the most intractable cases.

If I may modify that: *even* in the most intractable cases :-)

However, two things spring to mind. First: if close is a primitive
operation, we need a high level way of not only tracking
what to close, but also of doing that tracking efficiently.

Secondly, since the resource is represented in memory,
and much of the time the dependencies which are *not*
represented in memory could be, using the code
of the garbage collector (and not just the same algorithm),
makes some sense.

> The question is, is it worse to have the GC try to reclaim the resources
> than it is to have a guaranteed leak?

You are making an incorrect assumption here. You're assuming
that finalisers only release resources. Consider again
the case where the final action in generating a document
is to create a table of contents (knowing now that all chapters
are seen).

That is a quite non-trivial task which can take a lot
of time and involve a very large amount of code,
build new objects, etc.

The time to do this operation is "when the user cannot
add any more chapters to the document" and the easiest
time to specify that is "when the user no longer
has any copy of the document handle".

This finalisation task isn't a destructor and it isn't
releasing anything, but it is, by specification,
a finalisation task.

> Consider the case where close
> returns EINTR and doesn't close the file descriptor. Since the descriptor
> has already leaked- for it to reach this point the program no longer has
> any way to reach the descriptor (if the program did, the object wouldn't
> be being collected). So what's the difference in this case between the
> collector silently ignoring the error, and the collector not even trying
> to free the resource?

Probably none, BUT you have again made an assumption and this
time it is a fatal mistake. Closing a file is not usually done
to free the resource, but to flush the buffers.

In interscript, this proved difficult, because sometimes
client code would expect to reopen and read the file
after the last reference: unfortunately, the open actually
works and reads 90% of the file (the last 10% still in a buffer).

I fix that problem by always writing a temporary file, and copying
it when the file is closed. So here, the finalisation is vital
(it copies a file, as well as closing one).

> Actually, an idea I've had is to add Java-style "throws" information-
> basically what exceptions a function can throw- to the type information of
> a function. With type inference, this shouldn't be the headache in Ocaml
> it is in Java. The advantage here is that you could enlist the type
> system to guarantee that a destructor doesn't throw any exceptions- i.e.
> the code for a destructor should handle all exceptions it generates.

Sigh. I argued strongly in C++ that exception specifications should
be statically enforced. They're not. It's a total pain, because
they ALMOST are. Unfortunately, if you throw an exception you're
not allowed to .. a run time check detects the fault
and throws an Unexpected exception. You can rely on it ..
and it totally defeats any attempt to do static analysis (that one leak
can be turned into a complete defeat of the exception specifications).

One thing is for sure .. I *hate* seeing

"program terminated with uncaught "Not_found" exception"

because I do hundreds of Hashtbl.find calls....
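The temporary-file approach described above, done with an explicit
commit step instead of a finaliser, might look roughly like this. It is
a sketch and not interscript's code: the names staged, open_staged and
commit are hypothetical, and it renames the temporary file into place
rather than copying it.

```ocaml
(* Hypothetical names throughout. Output goes to a temporary file and
   only becomes visible under the target name on an explicit commit. *)
type staged = {
  oc : out_channel;
  tmp : string;
  target : string;
  mutable committed : bool;
}

let open_staged (target : string) : staged =
  (* Create the temporary file next to the target so that the final
     rename stays on one filesystem. *)
  let tmp =
    Filename.temp_file ~temp_dir:(Filename.dirname target) "staged" ".tmp"
  in
  { oc = open_out tmp; tmp; target; committed = false }

let commit (s : staged) : unit =
  if not s.committed then begin
    s.committed <- true;
    close_out s.oc;             (* flushes the buffer, then closes *)
    Sys.rename s.tmp s.target   (* readers now see the complete file *)
  end
```

A reader that opens the target before commit sees the old contents, and
after commit the complete new contents, never a file with its tail
still sitting in a buffer.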
* Re: [Caml-list] GC and file descriptors
From: Aleksey Nogin @ 2003-11-17 19:26 UTC
To: Caml Mailing List

On 17.11.2003 10:15, skaller wrote:

> One thing is for sure .. I *hate* seeing
>
> "program terminated with uncaught "Not_found" exception"
>
> because I do hundreds of Hashtbl.find calls....

Well, OCAMLRUNPARAM=b is your friend ;-)

--
Aleksey Nogin

Home Page: http://nogin.org/
E-Mail: nogin@cs.caltech.edu (office), aleksey@nogin.org (personal)
Office: Jorgensen 70, tel: (626) 395-2907
* Re: [Caml-list] GC and file descriptors
From: skaller @ 2003-11-18 13:49 UTC
To: Aleksey Nogin; +Cc: Caml Mailing List

On Tue, 2003-11-18 at 06:26, Aleksey Nogin wrote:
> On 17.11.2003 10:15, skaller wrote:
>
> > One thing is for sure .. I *hate* seeing
> >
> > "program terminated with uncaught "Not_found" exception"
> >
> > because I do hundreds of Hashtbl.find calls....
>
> Well, OCAMLRUNPARAM=b is your friend ;-)

That works with native code?
* Re: [Caml-list] GC and file descriptors
From: Dustin Sallings @ 2003-11-18 17:51 UTC
To: skaller; +Cc: Aleksey Nogin, Caml Mailing List

On Nov 18, 2003, at 5:49, skaller wrote:

> On Tue, 2003-11-18 at 06:26, Aleksey Nogin wrote:
>> On 17.11.2003 10:15, skaller wrote:
>>
>>> One thing is for sure .. I *hate* seeing
>>>
>>> "program terminated with uncaught "Not_found" exception"
>>>
>>> because I do hundreds of Hashtbl.find calls....
>>
>> Well, OCAMLRUNPARAM=b is your friend ;-)
>
> That works with native code?

	More importantly (to me), is the stack preserved when you have to
re-raise an exception?:

let operate_on_file f fn =
    let ch = open_in fn in
    try
        f ch;
        close_in ch;
    with x ->
        close_in ch;
        raise x
;;

	When this thing raises Not_found, will I be able to tell why?

--
Dustin Sallings
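For readers on a recent compiler, here is a hedged sketch of a variant
of operate_on_file that keeps the original backtrace across the
re-raise. It assumes OCaml 4.05 or later (for
Printexc.raise_with_backtrace) and a build with -g; none of this was
available when the question was asked.

```ocaml
(* A variant of operate_on_file that re-raises with the backtrace of
   the original failure. Requires OCaml 4.05 or later. *)
let operate_on_file (f : in_channel -> 'a) (fn : string) : 'a =
  let ch = open_in fn in
  match f ch with
  | result ->
    close_in ch;
    result
  | exception x ->
    (* Capture the backtrace first, before any other call has a
       chance to overwrite it. *)
    let bt = Printexc.get_raw_backtrace () in
    close_in_noerr ch;
    Printexc.raise_with_backtrace x bt
```

Backtraces are only recorded when enabled, for example by running with
OCAMLRUNPARAM=b or by calling Printexc.record_backtrace true at
startup. With that in place, an escaped Not_found is reported with the
location where f first raised it.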
* Re: [Caml-list] GC and file descriptors
From: Aleksey Nogin @ 2003-11-18 20:17 UTC
To: skaller, Caml List

On 18.11.2003 05:49, skaller wrote:

>>Well, OCAMLRUNPARAM=b is your friend ;-)
>
> That works with native code?

No, unfortunately not. Any kind of debugging in OCaml seems to require
bytecode...

--
Aleksey Nogin
* Re: [Caml-list] GC and file descriptors
From: Florian Hars @ 2003-11-20 7:36 UTC
To: Aleksey Nogin; +Cc: skaller, Caml List

Aleksey Nogin wrote:
> No, unfortunately not. Any kind of debugging in OCaml seems to require
> bytecode...

Last time I looked, Printf.fprintf stderr ... worked in native code, too.

Yours, Florian.
* Re: [Caml-list] GC and file descriptors
From: Brian Hurt @ 2003-11-17 21:20 UTC
To: skaller; +Cc: Caml Mailing List

On 18 Nov 2003, skaller wrote:

> On Mon, 2003-11-17 at 06:19, Brian Hurt wrote:
> > On 16 Nov 2003, skaller wrote:
>
> > I wish the mainstream would crawl out of the sixties era computer science
> > and get at least into the eighties.
>
> Lol, they're up to the sixties already?

Pretty precisely, actually. Java's type system is quite comparable to
Algol 68's- which is why downcasting from Object is such a common thing
in Java. Generics are being added to the language, but they will still
be "special" things (the Java designers are seemingly intent on turning
Java into C++). And Java is the only language whose memory management is
more advanced than 1968-era LISP.

>
> > The only "mainstream" programming
> > language I know of with *excusable* garbage collection is Java.
>
> I think Ocaml is very close to mainstream now.
> [And I think the FP contest and Bagley are responsible,
> and the O'Reilly book in English will be the clincher]

The O'Reilly book will be a great benefit. Let me know when it comes out,
I want a copy. But I don't know how close to mainstream it is. Perl,
Python, and Ruby are scripting languages, still mainly used for short,
single-person, throw-away projects. And they aren't that far from
"conventional" languages (think Perl and Shell, Python/Ruby and Java).
Ocaml is more of an applications language- its benefits start to really
shine when you're looking at tens of thousands (or more) lines of code.
C++ succeeded (and Objective-C and Smalltalk didn't) because you could
write C in it.
Java succeeded because IBM, Sun, Oracle, and a number of other
huge companies got behind it.

>
> Very hard to be the second (and third) fastest programming
> language around and not be noticed :-)

Not that hard. All you need to have happen is to have it dismissed as a
"fruity, compsci language".

>
> > Of
> > course, Java's type system is state of the art- for 1968.
>
> Err.. since when is downcasting everything from Object
> a type system??

It's a type system with a hole in it. Which I'll agree is almost as bad
as no type system at all, but Java does have a type system (and DOS
qualifies as an operating system, technically).

>
> > > Because of this problem, my own major Python program
> > > interscript would not run on my interpreter, since
> > > it relied on orderly finalisation.
>
> > You were abusing finalization.
>
> Well no. I was *trying* to abuse it but failed :-)
>
> In fact, partly my prompting on this was responsible
> for Ocaml coded finalisers (instead of just C ones).
>
> > A much better way to do this would be to
> > have a central command queue.
>
> This would have been impossible in interscript.
> It works by executing user commands directly
> (but in a restricted environment).

Then you need to have an execute() function which is called as soon as
the command can execute. I assumed you were trying to delay execution of
the commands for a while.

>
> > >
> > > Anyhow .. there do exist cases where explicit finalisation
> > > is difficult.
> >
> > IIRC, in the general case it's equivalent to solving the halting problem.
>
> If I can make a general comment here: there are things like comparing
> functions which are equivalent to the halting problem. Another is
> verifying program correctness :-)
>
> But it is irrelevant, because programmers do not intend to write
> arbitrary code. They intend to write comprehensible code, whose
> correctness is therefore necessarily easy to prove.

What they intend to do and what they do do are often completely different.
It is a rare programmer who *intentionally* adds bugs to his code. Yet
bugs still happen.

But my point here is that determining the *exact* time that an object
becomes garbage can be arbitrarily complex. Most of the time, I agree, it
won't be. It's like factoring numbers- half of all numbers have 2 as a
factor, and a third of the rest have 3 as a factor. So 2/3rds of all
numbers are easy to get a factor out of. But there are still some numbers
which are very hard to factor.

Even reference counting can fail at this. It's possible for references to
an object to "escape". The object is still dead, but it's not being
collected for a while. Which is why depending on finalizations happening
at a particular time, or in a particular order, is a bad idea.

Note that the style/paradigm of code you are writing makes a huge
difference. It's a lot easier to track object lifetimes in a procedural
language than in an object oriented language.

Once you allow for the fact that finalizers/destructors may not happen in
a defined order or at defined times, why not go whole-hog? Especially
given that it gives you a non-trivial performance boost. This performance
boost is critical for functional languages like Ocaml, which has (compared
to imperative programs) a shockingly high allocation rate. I've seen
studies that say that your average ML program allocates one word of memory
every six program instructions.

>
> This is also true of finalisation. Necessarily. The general case is
> irrelevant, because people don't intend to write arbitrary code.
>
> Even when multiple people work on an Open Source project (just about the
> worst environment I can think of .. ) there is at least an intent to
> follow a design.
>
> > On a more general note, I will agree that having the GC free non-memory
> > resources raises problems.
> > I agree (and strongly believe) that everything
> > that isn't memory should have some way for the program to free them
> > without invoking the GC- for example, file descriptors should have a close
> > routine. And I agree that programs *should* use them instead of relying
> > on the GC in all but the most intractable cases.
>
> If I may modify that: *even* in the most intractable cases :-)

In the most intractable cases, neither the compiler nor the programmer may
be able to determine when the last reference is released. But even in the
simple cases, the programmer may have the intent to release the resources
and simply doesn't. It's called a bug.

>
> However, two things spring to mind. First: if close is a primitive
> operation, we need a high level way of not only tracking
> what to close, but also of doing that tracking efficiently.
>
> Secondly, since the resource is represented in memory,
> and much of the time the dependencies which are *not*
> represented in memory could be, using the code
> of the garbage collector (and not just the same algorithm),
> makes some sense.

Why special-case close? It's what we've been talking about, but it could
just as easily be "release a mutex", etc.

>
> > The question is, is it worse to have the GC try to reclaim the resources
> > than it is to have a guaranteed leak?
>
> You are making an incorrect assumption here. You're assuming
> that finalisers only release resources. Consider again
> the case where the final action in generating a document
> is to create a table of contents (knowing now that all chapters
> are seen).
>
> That is a quite non-trivial task which can take a lot
> of time and involve a very large amount of code,
> build new objects, etc.
>
> The time to do this operation is "when the user cannot
> add any more chapters to the document" and the easiest
> time to specify that is "when the user no longer
> has any copy of the document handle".
>
> This finalisation task isn't a destructor and it isn't
> releasing anything, but it is, by specification,
> a finalisation task.

I wouldn't put that into a destructor. DON'T USE DESTRUCTORS FOR
ENFORCING THE ORDER OF OPERATIONS. Destructors are safety nets. When all
else fails, they have a shot at catching and fixing the mistakes. You're
asking the language to support a seriously broken programming example.
There is a huge difference between "creating an appendix" and "closing a
file handle".

>
> > Consider the case where close
> > returns EINTR and doesn't close the file descriptor. Since the descriptor
> > has already leaked- for it to reach this point the program no longer has
> > any way to reach the descriptor (if the program did, the object wouldn't
> > be being collected). So what's the difference in this case between the
> > collector silently ignoring the error, and the collector not even trying
> > to free the resource?
>
> Probably none, BUT you have again made an assumption and this
> time it is a fatal mistake. Closing a file is not usually done
> to free the resource, but to flush the buffers.
>
> In interscript, this proved difficult, because sometimes
> client code would expect to reopen and read the file
> after the last reference: unfortunately, the open actually
> works and reads 90% of the file (the last 10% still in a buffer).
>
> I fix that problem by always writing a temporary file, and copying
> it when the file is closed. So here, the finalisation is vital
> (it copies a file, as well as closing one).

And this fixes the problem how? If it matters when the buffers are
flushed, manually flush the buffers. Even reference counting can fail if
a reference to the object "leaks" (I'm being nice and assuming that
circular references aren't a problem).

>
> > Actually, an idea I've had is to add Java-style "throws" information-
> > basically what exceptions a function can throw- to the type information of
> > a function.
> > With type inference, this shouldn't be the headache in Ocaml
> > it is in Java. The advantage here is that you could enlist the type
> > system to guarantee that a destructor doesn't throw any exceptions- i.e.
> > the code for a destructor should handle all exceptions it generates.
>
> Sigh. I argued strongly in C++ that exception specifications should
> be statically enforced. They're not. It's a total pain, because
> they ALMOST are. Unfortunately, if you throw an exception you're
> not allowed to .. a run time check detects the fault
> and throws an Unexpected exception. You can rely on it .. and
> it totally defeats any attempt to do static analysis (that one leak
> can be turned into a complete defeat of the exception specifications).

This seems par for the course for the C/C++ standards committee. Consider
const and register as an example.

>
> One thing is for sure .. I *hate* seeing
>
> "program terminated with uncaught "Not_found" exception"
>
> because I do hundreds of Hashtbl.find calls....
>

There is a religious debate between returning 'a and throwing an exception
if the item isn't found, or returning 'a option. Myself, I'd prefer
'a option *currently* because that enlists the typechecker to find errors.
The counter argument is that this forces you to make a lot of unnecessary
checks for finds you know will succeed. I'd rather have redundant error
checks than uncaught errors, but that's me.

Brian
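The two lookup styles in this debate can be put side by side in a short
sketch. It assumes OCaml 4.05 or later, where the standard library
gained Hashtbl.find_opt; the table, describe and find_or_fail are
hypothetical names used only for illustration.

```ocaml
(* A small example table. *)
let table : (string, int) Hashtbl.t = Hashtbl.create 16
let () = Hashtbl.replace table "answer" 42

(* Option-returning lookup: the type checker forces the caller to
   handle the missing case. Hashtbl.find_opt needs OCaml 4.05+. *)
let describe (key : string) : string =
  match Hashtbl.find_opt table key with
  | Some v -> Printf.sprintf "%s = %d" key v
  | None -> Printf.sprintf "%s is unbound" key

(* Exception-raising lookup, with context added at the call site so
   that a failure that escapes is not an anonymous Not_found. *)
let find_or_fail (key : string) : int =
  try Hashtbl.find table key
  with Not_found -> failwith ("missing key: " ^ key)
```

The first style costs a match at every call site; the second keeps call
sites short but leaves it to discipline, not the type system, to make
sure the failure is handled somewhere.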
* Re: [Caml-list] GC and file descriptors
From: John J Lee @ 2003-11-17 23:02 UTC
To: Caml Mailing List

On Mon, 17 Nov 2003, Brian Hurt wrote:
[...]
> The O'Reilly book will be a great benefit. Let me know when it comes out,

It's coming out in print? Great!

> I want a copy. But I don't know how close to mainstream it is. Perl,
> Python, and Ruby are scripting languages, still mainly used for short,
> single-person, throw-away projects.

The fact that they're often *used* in that way doesn't preclude their use
as applications languages -- far from it: they shine in that department
(though I put Perl in a different category, not for any fundamental
reason, but simply because it's so outrageously, unjustifiably
complicated). Perhaps O'Caml has even *more* to offer there, but
dismissing Ruby & Python as "scripting" languages seems silly to me.
People who write good unit tests (and even, which amazes me, people who
don't) report very significant productivity improvements for applications
of 10k lines or more when compared with Java and C++. They certainly
don't even begin to have scaling problems at that point that aren't also
present in Java and C++ (especially when you consider that lines of code
are fewer in these languages than in poor-static-typing languages like
Java).

> And they aren't that far from "conventional" languages (think Perl and
> Shell, Python/Ruby and Java).

True in some ways. That certainly doesn't mean there isn't a very
significant difference in productivity between the two groups, though
("traditional" and Python / Ruby -- with the caveat that I've never used
Ruby, so can't comment on it).

> Ocaml is more of an applications language- its benefits start to really
> shine when you're looking at tens of thousands (or more) lines of code.
> C++ succeeded (and Objective-C and Smalltalk didn't) because you could write
> C in it. Java succeeded because IBM, Sun, Oracle, and a number of other
> huge companies got behind it.
[...]

John
* Re: [Caml-list] GC and file descriptors
From: Ville-Pertti Keinonen @ 2003-11-18 12:05 UTC
To: Brian Hurt; +Cc: Caml Mailing List

On Mon, Nov 17, 2003 at 03:20:36PM -0600, Brian Hurt wrote:

> into C++). And Java is the only language whose memory management is more
> advanced than 1968-era LISP.

Did you forget to include the word "mainstream"?

> I want a copy. But I don't know how close to mainstream it is. Perl,
> Python, and Ruby are scripting languages, still mainly used for short,
> single-person, throw-away projects. And they aren't that far from

Python and Ruby are hardly scripting languages, even though they are
often used as such. I think they could be decent general purpose
programming languages except for a few unfortunate design decisions
(such as scoping rules).

> C in it. Java succeeded because IBM, Sun, Oracle, and a number of other
> huge companies got behind it.

Not just that, the OO hype is a huge factor. Faced with advocates who
claim that subclassing is all you need and other language features are
undesirable, it takes a while for inexperienced programmers - even
smart ones - to become disillusioned and take the time to learn
something different...

It's difficult for programming languages to be judged on merit. People
who are reasonably unbiased and know enough to be able to make informed
comparisons aren't likely to consider any language or paradigm the
"one true way". But not many people listen to advocates who don't
claim that their solution is perfect.

I'm fairly sure nobody on this list would claim that OCaml is above
all other languages for every possible purpose.

However, does anyone consider OCaml the best existing language for a
particular use?
Or just the most convenient implementation of the features needed?
* Re: [Caml-list] GC and file descriptors 2003-11-18 12:05 ` Ville-Pertti Keinonen @ 2003-11-18 15:19 ` skaller 2003-11-18 18:10 ` John J Lee 2003-11-18 20:02 ` Ville-Pertti Keinonen 2003-11-18 15:28 ` skaller ` (2 subsequent siblings) 3 siblings, 2 replies; 92+ messages in thread From: skaller @ 2003-11-18 15:19 UTC (permalink / raw) To: Ville-Pertti Keinonen; +Cc: Brian Hurt, Caml Mailing List On Tue, 2003-11-18 at 23:05, Ville-Pertti Keinonen wrote: > On Mon, Nov 17, 2003 at 03:20:36PM -0600, Brian Hurt wrote: > Python and Ruby are hardly scripting languages, even though they are > often used as such. I think they could be decent general purpose > programming languages except for a few unfortunate design decisions > (such as scoping rules). You haven't seen Python 2.2? It's a genuine functional programming language now, with full lexical scoping, closures, and even some advanced concepts like iterators which cannot be programmed in Ocaml. Stackless Python provides full continuation passing (and Felix provides procedural continuations) so they're both ahead of Ocaml as functional languages on that score :-)
* Re: [Caml-list] GC and file descriptors 2003-11-18 15:19 ` skaller @ 2003-11-18 18:10 ` John J Lee 2003-11-18 17:55 ` skaller 2003-11-18 20:02 ` Ville-Pertti Keinonen 1 sibling, 1 reply; 92+ messages in thread From: John J Lee @ 2003-11-18 18:10 UTC (permalink / raw) To: Caml Mailing List On Tue, 19 Nov 2003, skaller wrote: > On Tue, 2003-11-18 at 23:05, Ville-Pertti Keinonen wrote: [...] > You haven't seen Python 2.2? Its a genuine functional > programming language now, with full lexical scoping, > closures, and even some advanced concepts like > iterators which cannot be programmed in Ocaml. [...] Don't you mean generators? John ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [Caml-list] GC and file descriptors 2003-11-18 18:10 ` John J Lee @ 2003-11-18 17:55 ` skaller 0 siblings, 0 replies; 92+ messages in thread From: skaller @ 2003-11-18 17:55 UTC (permalink / raw) To: John J Lee; +Cc: Caml Mailing List

On Wed, 2003-11-19 at 05:10, John J Lee wrote:
> On Tue, 19 Nov 2003, skaller wrote:
> > On Tue, 2003-11-18 at 23:05, Ville-Pertti Keinonen wrote:
> [...]
> > You haven't seen Python 2.2? It's a genuine functional
> > programming language now, with full lexical scoping,
> > closures, and even some advanced concepts like
> > iterators which cannot be programmed in Ocaml.
> [...]
>
> Don't you mean generators?

Probably. You can write a function and yield a value in the middle.. and then later continue on from the yield point. The main use of such a beast is to implement container iterators using the closure state to maintain the position in the container. You can't do that (directly) in Ocaml. Of course you can write:

    let make_f () =
      let state = ref 0 in
      let f x =
        match !state with
        | 0 -> .... state := 6; retval1
        | 1 -> ...
      in
      f

but this is a real mess when you start nesting things...
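For the simple container-iterator case mentioned above, the closure-state approach can stay fairly tidy. The following is only a minimal sketch: `make_gen` is a hypothetical name invented for this illustration, and it covers stepping through a list, not the general "yield in the middle of arbitrary code" case that generators handle.

```ocaml
(* A closure-based "generator" over a list: each call returns the next
   element, with the remaining position kept in a private ref cell.
   make_gen is a hypothetical name used only for this illustration. *)
let make_gen (items : 'a list) : unit -> 'a option =
  let rest = ref items in
  fun () ->
    match !rest with
    | [] -> None              (* exhausted: keep returning None *)
    | x :: tl ->
        rest := tl;           (* remember where to resume next time *)
        Some x

let () =
  let next = make_gen [10; 20; 30] in
  assert (next () = Some 10);
  assert (next () = Some 20);
  assert (next () = Some 30);
  assert (next () = None)
```

Each call to `make_gen` allocates its own `rest` cell, so independent iterators over the same list do not interfere with each other.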
* Re: [Caml-list] GC and file descriptors 2003-11-18 15:19 ` skaller 2003-11-18 18:10 ` John J Lee @ 2003-11-18 20:02 ` Ville-Pertti Keinonen 2003-11-18 21:20 ` John J Lee 2003-11-19 12:25 ` skaller 1 sibling, 2 replies; 92+ messages in thread From: Ville-Pertti Keinonen @ 2003-11-18 20:02 UTC (permalink / raw) To: skaller; +Cc: Brian Hurt, Caml Mailing List On Wed, Nov 19, 2003 at 02:19:42AM +1100, skaller wrote: > You haven't seen Python 2.2? It's a genuine functional > programming language now, with full lexical scoping, > closures, and even some advanced concepts like > iterators which cannot be programmed in Ocaml. AFAIK (correct me if I'm wrong!) Python still doesn't have conventional lexical scoping. Each scope is a dictionary, and class scopes require explicit references (which is one of the most annoying things in Python in practice; when writing code including classes, the amount of self.-references more than makes up for the lack of let ... in expressions used in other languages in terms of the amount of typing involved). As for the functional part...considering that Python distinguishes between expressions and statements, and lambda-expressions are only permitted to include expressions, that's a fairly nasty limitation. With these features (plus dynamic typing and especially supposedly deterministic finalization, i.e. reference counting), Python is inevitably fairly inefficient; I doubt it can ever achieve the efficiency of the best Common Lisp implementations (which are probably the most efficient possible implementations of dynamically typed languages, apart from the lack of continuations). Still, I think Python is one of the most interesting and capable popular languages, and I even use it at work. > Stackless Python provides the full continuation > passing (and Felix provides procedural continuations) > so they're both ahead of Ocaml as functional languages > on that score :-) Stackless Python is a very interesting concept.
One of the things I'm interested in generally is how a continuation-based, stackless, natively compiled execution model could work out with modern programming languages. Continuations can be *very* powerful, and many implementations really screw up efficient implementation by using normal call stacks and copying them rather than using a continuation-based evaluation model...but then again continuations are a 70s thing, so they are way ahead of the mainstream according to Brian (and I agree!). ;-) One of the big downsides in OCaml is the lack of efficient concurrency (continuations are somewhat related to this). IMHO one of the most interesting near-future things in programming language research is how to best combine concurrency (efficient and scalable models, not shared-state concurrency as in C/C++/Java/OCaml, but something like Erlang or Oz) with static typing, state and efficient native compilation.
* Re: [Caml-list] GC and file descriptors 2003-11-18 20:02 ` Ville-Pertti Keinonen @ 2003-11-18 21:20 ` John J Lee 2003-11-19 12:25 ` skaller 1 sibling, 0 replies; 92+ messages in thread From: John J Lee @ 2003-11-18 21:20 UTC (permalink / raw) To: Caml Mailing List On Tue, 18 Nov 2003, Ville-Pertti Keinonen wrote: > On Wed, Nov 19, 2003 at 02:19:42AM +1100, skaller wrote: > > You haven't seen Python 2.2? Its a genuine functional > > programming language now, with full lexical scoping, > > closures, and even some advanced concepts like > > iterators which cannot be programmed in Ocaml. > > AFAIK (correct me if I'm wrong!) Python still doesn't have > conventional lexical scoping. (warning: I am far from a Python language lawyer) Python local scopes are not dictionaries. > Each scope is a dictionary, and class Class and global scopes are, yes. > scopes require explicit references (which is one of the most annoying > things in Python in practice; when writing code including classes, the > amount of self.-referneces more than makes up for the lack of let ... > in expressions used in other languages in terms of the amount of > typing involved). People using languages that don't have this seem to end up reinventing it. We see names like "m_foo" all the time in C++. Since self is just a conventional name of the first argument of Python methods, some projects (very few, admittedly) choose to use "s" instead of "self" (s.foo, not self.foo), and people still understand the code, since they use it consistently. > As for the functional part...considering that Python distinguishes > between expressions and statements, and lambda-expressions are only > permitted to include expressions, that's a fairly nasty limitation. Yes, I agree. I don't think Python puts much emphasis on functional ways of doing things, so it's only "nasty" if a functional language is what you're looking for (and perhaps we *should* all be looking for that, I dunno). 
> With these features (plus dynamic typing and especially supposedly > deterministic finalization, i.e. reference counting), Python is > inevitably fairly inefficient, I doubt it can never achieve the > efficiency of the best Common Lisp implementations (which are > probably the most efficient possible implementations of dynamically > typed languages, apart from the lack of continuations). I would certainly expect so. Of course, Python programmers rarely give a damn about that. :-) Still, the PyPy project (with Armin Rigo of Pysco fame on the team) has been making grand claims about execution speed, and a little of that vapour has condensed to the point where they have detailed plans (for a project at this stage), which were required for a pending application for EU funding, and a running interpreter in some larval state, ready or almost-ready for pre-alpha distribution, I think... ie, still vapourware ;-) John ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [Caml-list] GC and file descriptors 2003-11-18 20:02 ` Ville-Pertti Keinonen 2003-11-18 21:20 ` John J Lee @ 2003-11-19 12:25 ` skaller 2003-11-19 13:55 ` Ville-Pertti Keinonen 1 sibling, 1 reply; 92+ messages in thread From: skaller @ 2003-11-19 12:25 UTC (permalink / raw) To: Ville-Pertti Keinonen; +Cc: Brian Hurt, Caml Mailing List

On Wed, 2003-11-19 at 07:02, Ville-Pertti Keinonen wrote:
> On Wed, Nov 19, 2003 at 02:19:42AM +1100, skaller wrote:
> > You haven't seen Python 2.2? It's a genuine functional
> > programming language now, with full lexical scoping,
> > closures, and even some advanced concepts like
> > iterators which cannot be programmed in Ocaml.
>
> AFAIK (correct me if I'm wrong!) Python still doesn't have
> conventional lexical scoping.

You're wrong :-)

Functions can be nested in Python 2.2, and they're lexically scoped. You can even do this:

    def f():
        x = 1
        def g():
            return x
        return g

and it will work correctly: closures work fine.

What doesn't work the way *I* would expect is that class scopes are not lexically scoped. I had an argument with Guido on that -- it would break some arcane hacks he said.

> Each scope is a dictionary,

No. Function scopes are basically static, there's no locals() dictionary. More precisely there is, but no declared (manifest) local variables are ever in it.

> > Stackless Python provides the full continuation
> > passing (and Felix provides procedural continuations)
> > so they're both ahead of Ocaml as functional languages
> > on that score :-)
>
> Stackless Python is a very interesting concept. One of the things
> I'm interested in generally is how a continuation-based, stackless,
> natively compiled execution model could work out with modern
> programming languages.
That's Felix :-) [Well, it uses procedural resumptions, functions just use the stack though -- I might fix that if a few people join the project and think it would be useful: it's not done for efficiency reasons] http://felix.sf.net
* Re: [Caml-list] GC and file descriptors 2003-11-19 12:25 ` skaller @ 2003-11-19 13:55 ` Ville-Pertti Keinonen 2003-11-19 14:26 ` Samuel Lacas 2003-11-19 14:47 ` skaller 0 siblings, 2 replies; 92+ messages in thread From: Ville-Pertti Keinonen @ 2003-11-19 13:55 UTC (permalink / raw) To: skaller; +Cc: Caml Mailing List

On Wed, Nov 19, 2003 at 11:25:53PM +1100, skaller wrote:
> Functions can be nested in Python 2.2, and they're lexically
> scoped. You can even do this:

I wouldn't call it conventional lexical scoping considering that the following is an error:

    x = 1
    def f():
        x += 1
        return x

I very much prefer having explicit let (and let rec) constructs like in OCaml.
* Re: [Caml-list] GC and file descriptors 2003-11-19 13:55 ` Ville-Pertti Keinonen @ 2003-11-19 14:26 ` Samuel Lacas 2003-11-19 14:47 ` skaller 1 sibling, 0 replies; 92+ messages in thread From: Samuel Lacas @ 2003-11-19 14:26 UTC (permalink / raw) To: Caml Mailing List

Ville-Pertti Keinonen wrote:
> On Wed, Nov 19, 2003 at 11:25:53PM +1100, skaller wrote:
[snip]
> I wouldn't call it conventional lexical scoping considering that the
> following is an error:
>
> x = 1
> def f():
>     x += 1
>     return x
>
> I very much prefer having explicit let (and let rec) constructs like in
> OCaml.

Yes, but the following works:

    x = 1
    def f():
        global x   # needed to reference x
        x += 1
        return x

Thus, everything is just in the "conventional". Different languages seem to have different conventions.

sL
* Re: [Caml-list] GC and file descriptors 2003-11-19 13:55 ` Ville-Pertti Keinonen 2003-11-19 14:26 ` Samuel Lacas @ 2003-11-19 14:47 ` skaller 1 sibling, 0 replies; 92+ messages in thread From: skaller @ 2003-11-19 14:47 UTC (permalink / raw) To: Ville-Pertti Keinonen; +Cc: Caml Mailing List On Thu, 2003-11-20 at 00:55, Ville-Pertti Keinonen wrote: > On Wed, Nov 19, 2003 at 11:25:53PM +1100, skaller wrote: > > Functions can be nested in Python 2.2, and they're lexically > > scoped. You can even do this: > > I wouldn't call it conventional lexical scoping considering that the > following is an error: > > x = 1 > def f(): > x += 1 > return x > > I very much prefer having explicit let (and let rec) constructs like in > OCaml. > But the problem has nothing to do with lexical scoping. The problem here is that variables are not declared in Python. ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners ^ permalink raw reply [flat|nested] 92+ messages in thread
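For comparison with the Python snippets in this subthread, here is a small OCaml sketch of what "explicit let" buys: introducing a binding and mutating shared state are separate constructs, so there is no ambiguity about which `x` is meant. The names `f`, `g`, and `counter` are chosen only for this illustration.

```ocaml
(* Binding and mutation are distinct constructs in OCaml. *)
let x = 1

(* An inner let introduces a new binding that shadows the outer x.
   The x on the right-hand side still refers to the outer binding,
   because a plain let (without rec) is not recursive. *)
let f () =
  let x = x + 1 in
  x

(* Shared mutable state has to be declared explicitly as a ref cell. *)
let counter = ref 1

let g () =
  counter := !counter + 1;   (* equivalently: incr counter *)
  !counter

let () =
  assert (f () = 2);
  assert (f () = 2);         (* f never mutates anything *)
  assert (x = 1);            (* the outer binding is unchanged *)
  assert (g () = 2);
  assert (g () = 3)          (* g updates the shared cell *)
```

The rough counterpart of Python's `global x` version is `g`: the mutable cell is declared once, and every update is visibly an assignment through `:=`.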
* Re: [Caml-list] GC and file descriptors 2003-11-18 12:05 ` Ville-Pertti Keinonen 2003-11-18 15:19 ` skaller @ 2003-11-18 15:28 ` skaller 2003-11-18 18:00 ` John J Lee 2003-11-18 22:28 ` Brian Hurt 3 siblings, 0 replies; 92+ messages in thread From: skaller @ 2003-11-18 15:28 UTC (permalink / raw) To: Ville-Pertti Keinonen; +Cc: Brian Hurt, Caml Mailing List On Tue, 2003-11-18 at 23:05, Ville-Pertti Keinonen wrote: > On Mon, Nov 17, 2003 at 03:20:36PM -0600, Brian Hurt wrote: > However, does anyone consider OCaml the best existing language for a > particular use? Or just the most convenient implementation of the > features needed? How can one know? I don't know *every* language :-) All I can say is -- Ocaml does many things I want to do very easily, so easily that I have found no pressing need to learn Haskell, or to write much code in any other language I do know. But I'm biased. Ocaml is the only language I know with strong FP support (I exclude Python due to lack of static typing). There are some things I find would be better if I could bind to C/C++ more easily: there is a lot of that stuff out there. But I find the best solution to that is to design and implement my own language (Felix) rather than use any existing one. Oh yeah, the implementation is written largely in .. Ocaml .. and quite a bit of the design is stolen straight from Ocaml I like it so much :-) Now, one other thing I wanted to play with was message passing and async processing .. and JoCaml seemed the best choice .. oh, that's an Ocaml derivative .. :-) ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [Caml-list] GC and file descriptors 2003-11-18 12:05 ` Ville-Pertti Keinonen 2003-11-18 15:19 ` skaller 2003-11-18 15:28 ` skaller @ 2003-11-18 18:00 ` John J Lee 2003-11-18 22:28 ` Brian Hurt 3 siblings, 0 replies; 92+ messages in thread From: John J Lee @ 2003-11-18 18:00 UTC (permalink / raw) To: Caml Mailing List On Tue, 18 Nov 2003, Ville-Pertti Keinonen wrote: [...] > It's difficult for programming languages to be judged on merit. People > who are reasonably unbiased and know enough to be able to make informed > comparisons aren't likely to consider any language or paradigm the > "one true way". But not many people listen to advocates who don't claim > that their solution is perfect. [...] I'm surprised. I've had the impression (with little contact with the unwashed Java crowd, admittedly ;-) that being pigeonholed as "religious" is the major risk in evangelizing any programming language. John ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [Caml-list] GC and file descriptors 2003-11-18 12:05 ` Ville-Pertti Keinonen ` (2 preceding siblings ...) 2003-11-18 18:00 ` John J Lee @ 2003-11-18 22:28 ` Brian Hurt 2003-11-18 23:07 ` John J Lee ` (4 more replies) 3 siblings, 5 replies; 92+ messages in thread From: Brian Hurt @ 2003-11-18 22:28 UTC (permalink / raw) To: Ville-Pertti Keinonen; +Cc: Caml Mailing List Wandering into language advocacy here. On Tue, 18 Nov 2003, Ville-Pertti Keinonen wrote: > On Mon, Nov 17, 2003 at 03:20:36PM -0600, Brian Hurt wrote: > > > into C++). And Java is the only language whose memory management is more > > advanced than 1968-era LISP. > > Did you forget to include the word "mainstream"? Yes I did. For example, Ocaml has good memory management. Note that Java's memory management is still only "excusable", not good. The common case of an allocation in Ocaml is 5 simple instructions. In Ocaml, it's simply not useful to optimize allocation (for example, keeping a pool of pre-allocated objects around), as native allocation is simply too fast. You don't gain, and it's way easy to lose. Last time I checked, Java still had a problem with allocation being slow, causing Java programmers to work around it. > > > I want a copy. But I don't know how close to mainstream it is. Perl, > > Python, and Ruby are scripting languages, still mainly used for short, > > single-person, throw-away projects. And they aren't that far from > > Python and Ruby are hardly scripting languages, even though they are > often used as such. I think they could be decent general purpose > programming languages except for a few unfortunate design decisions > (such as scoping rules). Have there been any large projects (multiple developers, tens to hundreds of thousands of lines of code) in Python or Ruby? One problem I have is that programming is going away from strict compile time type checking to run-time type checking. The problem with run-time type checking is that it only catches errors in the field.
Static type checking is the most powerful tool we've come up with to ensure correctness in programs. And no, unit tests are not a replacement for strict compile-time type checking. The problem is that the only type checking people are aware of is Pascal/Algol-68 type systems, which require you to be able to circumvent them. And so you end up spending most of your time circumventing the type system, which causes smart people to wonder why it's there in the first place. The industry being stuck in the summer of love *is* having negative effects. > > > C in it. Java succeeded because IBM, Sun, Oracle, and a number of other > > huge companies got behind it. > > Not just that, the OO hype is a huge factor. Faced with advocates who > claim that subclassing is all you need and other language features > are undesirable, it takes a while for inexperienced programmers - even > smart ones - to become disillusioned and take the time to learn > something different... The OO hype was what drove the adoption of C++. See, in C++, you can write straight old-style procedural/imperative C, and then tell your boss "Of course it's object oriented- it's in C++, isn't it?" I've seen my share of procedural C++ in my time. > > It's difficult for programming languages to be judged on merit. People > who are reasonably unbiased and know enough to be able to make informed > comparisons aren't likely to consider any language or paradigm the > "one true way". But not many people listen to advocates who don't claim > that their solution is perfect. > > I'm fairly sure nobody on this list would claim that OCaml is above all > other languages for every possible purpose. I will go one farther, and name one use for which C is a better solution than Ocaml: writing an OS, device driver, or embedded code which does a lot of banging directly on hardware. > > However, does anyone consider OCaml the best existing language for a > particular use?
Or just the most convenient implementation of the > features needed? For any large, complex, data structure & algorithm heavy application, Ocaml is the best language I know of. It's not the best possible language (I can think of a number of improvements to Ocaml I'd like to see), but it's better for that purpose than any other language I know. -- "Usenet is like a herd of performing elephants with diarrhea -- massive, difficult to redirect, awe-inspiring, entertaining, and a source of mind-boggling amounts of excrement when you least expect it." - Gene Spafford Brian
* Re: [Caml-list] GC and file descriptors 2003-11-18 22:28 ` Brian Hurt @ 2003-11-18 23:07 ` John J Lee 2003-11-18 23:22 ` Benjamin Geer ` (3 subsequent siblings) 4 siblings, 0 replies; 92+ messages in thread From: John J Lee @ 2003-11-18 23:07 UTC (permalink / raw) To: Caml Mailing List On Tue, 18 Nov 2003, Brian Hurt wrote: [...] > Have there been any large projects (multiple developers, tens to hundreds > of thousands of lines of code) in Python or Ruby? Yes. Though I think it's a bad idea to measure in lines of code, given that the reported discrepancy of (lines of code per function point) is so large between languages like C++ and Java on the one hand, and Python or Ruby on the other. > One problem I have is that programming is going away from strict compile > time type checking to run-time type checking. The problem with run-time > type checking is that it only catches errors in the field. Static type > checking is the most powerfull tool we've come up with to ensure > correctness in programs. And no, unit tests are not a replacement for > strict compile-time type checking. Well, a sincere "thanks" for going through the whole loop of that argument without the need for getting anyone else involved ;-) (Except to add "NOT!" of course ;-) > The problem is that the only type checking people are aware of is > Pascal/Algol-68 type systems. Which require you to be able to circumvent. [...] Of course. John ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [Caml-list] GC and file descriptors 2003-11-18 22:28 ` Brian Hurt 2003-11-18 23:07 ` John J Lee @ 2003-11-18 23:22 ` Benjamin Geer 2003-11-19 1:49 ` Martin Berger ` (2 subsequent siblings) 4 siblings, 0 replies; 92+ messages in thread From: Benjamin Geer @ 2003-11-18 23:22 UTC (permalink / raw) To: Brian Hurt; +Cc: Caml Mailing List Brian Hurt wrote: > Have there been any large projects (multiple developers, tens to hundreds > of thousands of lines of code) in Python or Ruby? The best-known one in Python is probably the Zope application server (http://www.zope.org). There's a FAQ about this: http://www.python.org/doc/faq/general.html#have-any-significant-projects-been-done-in-python Ben
* Re: [Caml-list] GC and file descriptors 2003-11-18 22:28 ` Brian Hurt 2003-11-18 23:07 ` John J Lee 2003-11-18 23:22 ` Benjamin Geer @ 2003-11-19 1:49 ` Martin Berger 2003-11-19 3:57 ` Dustin Sallings 2003-11-19 13:35 ` skaller 2003-11-19 13:00 ` skaller 2003-11-19 13:02 ` skaller 4 siblings, 2 replies; 92+ messages in thread From: Martin Berger @ 2003-11-19 1:49 UTC (permalink / raw) To: Caml Mailing List Brian Hurt wrote: > For any large, complex, data structure & algorithm heavy application, > Ocaml is the best language I know of. what i've always wondered about is the following: all the benchmarks i have seen make ocaml look very good, but they are all using trivial programs. has anyone hands-on experience with using ocaml for processing *large* data-sets? by large i mean at least 1 GB? i am slightly worried about garbage collection performance in this case. martin ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [Caml-list] GC and file descriptors 2003-11-19 1:49 ` Martin Berger @ 2003-11-19 3:57 ` Dustin Sallings 2003-11-19 13:35 ` skaller 1 sibling, 0 replies; 92+ messages in thread From: Dustin Sallings @ 2003-11-19 3:57 UTC (permalink / raw) To: Martin Berger; +Cc: Caml Mailing List On Nov 18, 2003, at 17:49, Martin Berger wrote: > Brian Hurt wrote: > >> For any large, complex, data structure & algorithm heavy application, >> Ocaml is the best language I know of. > > what i've always wondered about is the following: all the benchmarks > i have seen make ocaml look very good, but they are all using trivial > programs. has anyone hands-on experience with using ocaml for > processing > *large* data-sets? by large i mean at least 1 GB? i am slightly worried > about garbage collection performance in this case. Absolutely. That's what attracted me to ocaml. I wrote the same program in python (original implementation), eiffel, bigloo scheme, ocaml (and probably some others), and C. At the time, my benchmarks suggested bigloo and ocaml were neck-and-neck, just slightly slower than the C version. I was more comfortable with scheme, so I went with it. Turns out, the scheme one wasn't really faster with the large data sets I was running in production. I also had some reliability problems with exceptions. Now, I've got two daily log processing apps (pretty decently large data sets, well over a gig) that finish in about 20 minutes where all others would take multiple hours. Not very useful benchmarks, but for sure, it's faster than anything else I've used for these tasks. I've also ported ocaml and ocamlopt to SunOS 4.1.4 where I use it to process all of my mail daily. That's all the mail I've received since about 1998 and all the mail I've sent since about 1994. -- SPY My girlfriend asked me which one I like better. 
pub 1024/3CAE01D5 1994/11/03 Dustin Sallings <dustin@spy.net> | Key fingerprint = 87 02 57 08 02 D0 DA D6 C8 0F 3E 65 51 98 D8 BE L_______________________ I hope the answer won't upset her. ____________
* Re: [Caml-list] GC and file descriptors 2003-11-19 1:49 ` Martin Berger 2003-11-19 3:57 ` Dustin Sallings @ 2003-11-19 13:35 ` skaller 1 sibling, 0 replies; 92+ messages in thread From: skaller @ 2003-11-19 13:35 UTC (permalink / raw) To: Martin Berger; +Cc: Caml Mailing List On Wed, 2003-11-19 at 12:49, Martin Berger wrote: > Brian Hurt wrote: > > > For any large, complex, data structure & algorithm heavy application, > > Ocaml is the best language I know of. > > what i've always wondered about is the following: all the benchmarks > i have seen make ocaml look very good, but they are all using trivial > programs. has anyone hands-on experience with using ocaml for processing > *large* data-sets? by large i mean at least 1 GB? i am slightly worried > about garbage collection performance in this case. The person to ask about this is probably Markus Mottl, his project deals with large commercial data sets.
* Re: [Caml-list] GC and file descriptors 2003-11-18 22:28 ` Brian Hurt ` (2 preceding siblings ...) 2003-11-19 1:49 ` Martin Berger @ 2003-11-19 13:00 ` skaller 2003-11-19 13:02 ` skaller 4 siblings, 0 replies; 92+ messages in thread From: skaller @ 2003-11-19 13:00 UTC (permalink / raw) To: Brian Hurt; +Cc: Ville-Pertti Keinonen, Caml Mailing List On Wed, 2003-11-19 at 09:28, Brian Hurt wrote: > Wandering into language advocacy here. > The problem is that the only type checking people are aware of is > Pascal/Algol-68 type systems. Can you characterise them? Do you mean for example that all the types are generative (no algebraic typing) except for functions?
* Re: [Caml-list] GC and file descriptors 2003-11-18 22:28 ` Brian Hurt ` (3 preceding siblings ...) 2003-11-19 13:00 ` skaller @ 2003-11-19 13:02 ` skaller 2003-11-19 17:36 ` Brian Hurt 4 siblings, 1 reply; 92+ messages in thread From: skaller @ 2003-11-19 13:02 UTC (permalink / raw) To: Brian Hurt; +Cc: Ville-Pertti Keinonen, Caml Mailing List

On Wed, 2003-11-19 at 09:28, Brian Hurt wrote:
> Wandering into language advocacy here.
>
> For any large, complex, data structure & algorithm heavy application, Ocaml is the best language I know of. It's not the best possible language (I can think of a number of improvements to Ocaml I'd like to see), but it's better for that purpose than any other language I know.

What about Haskell? Doesn't lazy evaluation have a significant advantage?
* Re: [Caml-list] GC and file descriptors 2003-11-19 13:02 ` skaller @ 2003-11-19 17:36 ` Brian Hurt 2003-11-20 5:14 ` skaller 0 siblings, 1 reply; 92+ messages in thread From: Brian Hurt @ 2003-11-19 17:36 UTC (permalink / raw) To: skaller; +Cc: Ville-Pertti Keinonen, Caml Mailing List

On 20 Nov 2003, skaller wrote:
> On Wed, 2003-11-19 at 09:28, Brian Hurt wrote:
> > Wandering into language advocacy here.
> >
> > For any large, complex, data structure & algorithm heavy application, Ocaml is the best language I know of. It's not the best possible language (I can think of a number of improvements to Ocaml I'd like to see), but it's better for that purpose than any other language I know.
>
> What about Haskell? Doesn't lazy evaluation have a significant advantage?
>

1) Lazy evaluation comes with a performance penalty.

2) Lazy evaluation + imperative programming == hard-to-track-down bugs.

3) While there are programs which will terminate with lazy evaluation that won't terminate with strict evaluation, such programs mainly appear only as counter examples. If I know that I'm in a strict evaluation language, I just don't do that ("Doctor- it hurts when I do this!" "Well, don't do that then!").

--
"Usenet is like a herd of performing elephants with diarrhea -- massive, difficult to redirect, awe-inspiring, entertaining, and a source of mind-boggling amounts of excrement when you least expect it." - Gene Spafford

Brian
* Re: [Caml-list] GC and file descriptors 2003-11-19 17:36 ` Brian Hurt @ 2003-11-20 5:14 ` skaller 2003-11-20 7:37 ` David Brown 0 siblings, 1 reply; 92+ messages in thread From: skaller @ 2003-11-20 5:14 UTC (permalink / raw) To: Brian Hurt; +Cc: Ville-Pertti Keinonen, Caml Mailing List

On Thu, 2003-11-20 at 04:36, Brian Hurt wrote:
> On 20 Nov 2003, skaller wrote:

> 3) While there are programs which will terminate with lazy evaluation that won't terminate with strict evaluation, such programs mainly appear only as counter examples. If I know that I'm in a strict evaluation language, I just don't do that ("Doctor- it hurts when I do this!" "Well, don't do that then!").

For me it is hard to say, since I don't use a lazy language, *but* there are times when I am thinking of 'streaming' things rather than building whole data structures in memory and transforming them in phases.

Whilst you can do that in a strict language like Ocaml, I would guess it is (at least a bit more) *automatic* in a lazy language like Haskell.

I guess that would be a major productivity and performance boost -- the code is easier to write, and far less memory is required (since for example only a small local part of a list will exist at any time, the not yet needed part is not yet built, and the already used part is not reachable and thus deallocated).

So I would not be so quick to discredit lazy evaluation as a bad performer; I guess considerable experience would be needed first to form a judgement.

One indication I have is that Charity is lazy. I don't know if that is an arbitrary choice or necessary for a reasonable representation of coinductive data types. Does anyone know anything about the connection between lazy evaluation and coinductive types?
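The 'streaming' style described in this message can be had in a strict language by making the laziness explicit. The following is a minimal sketch, assuming a modern OCaml (4.14 or later) whose standard library has the `Seq` module, which did not exist when this thread was written; `naturals` and `first_squares` are illustrative names, not anything from the thread.

```ocaml
(* An infinite stream of integers: each cell is built only when forced. *)
let rec naturals (n : int) : int Seq.t =
  fun () -> Seq.Cons (n, naturals (n + 1))

(* Square every element and keep the first [k]. Nothing beyond the k-th
   cell is ever computed, and cells already consumed become garbage. *)
let first_squares (k : int) : int list =
  naturals 0
  |> Seq.map (fun x -> x * x)
  |> Seq.take k
  |> List.of_seq
```

Only the final `List.of_seq` materialises anything; the intermediate stream never exists in memory as a whole, which is the memory behaviour the message attributes to lazy evaluation.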
* Re: [Caml-list] GC and file descriptors 2003-11-20 5:14 ` skaller @ 2003-11-20 7:37 ` David Brown 0 siblings, 0 replies; 92+ messages in thread From: David Brown @ 2003-11-20 7:37 UTC (permalink / raw) To: skaller; +Cc: Brian Hurt, Ville-Pertti Keinonen, Caml Mailing List

On Thu, Nov 20, 2003 at 04:14:36PM +1100, skaller wrote:

> Whilst you can do that in a strict language like Ocaml I would guess it is (at least a bit more) *automatic* in a lazy language like Haskell.
>
> I guess that would be a major productivity and performance boost -- the code is easier to write, and far less memory is required (since for example only a small local part of a list will exist at any time, the not yet needed part is not yet built, and the already used part is not reachable and thus deallocated).

My brief experience with Haskell:

- The lazy evaluation is very helpful, precisely for what you describe. Code has the potential to be clearer, since interfaces are simpler. Time need not be specified.

- I also found it is easy to create space leaks, and very hard to find them. Since flow is not specified directly in the language, I found it difficult to develop an intuition for liveness.

- The lazy evaluation is a performance hit, but a fairly constant one. The good compilers tend to write strict code for small cases, but large portions of the code run in a lazy fashion. This adds quite a bit of burden to the GC as well. Essentially, most data structures include closures for computing results. There is extra overhead to evaluate the closures, as well as the necessity to allocate/collect the data associated with them.

I do periodically go back to Haskell, because there is just something neat feeling about it. But I have not encountered a language that is as intuitive to program in as Ocaml.
Dave
* Re: [Caml-list] GC and file descriptors 2003-11-17 21:20 ` Brian Hurt 2003-11-17 23:02 ` John J Lee 2003-11-18 12:05 ` Ville-Pertti Keinonen @ 2003-11-18 15:12 ` skaller 2003-11-18 16:49 ` Martin Berger 2003-11-18 22:23 ` Brian Hurt 2 siblings, 2 replies; 92+ messages in thread From: skaller @ 2003-11-18 15:12 UTC (permalink / raw) To: Brian Hurt; +Cc: Caml Mailing List

On Tue, 2003-11-18 at 08:20, Brian Hurt wrote:
> On 18 Nov 2003, skaller wrote:
>
> > But it is irrelevant, because programmers do not intend to write arbitrary code. They intend to write comprehensible code, whose correctness is therefore necessarily easy to prove.
>
> What they intend to do and what they do do are often completely different. It is a rare programmer who *intentionally* adds bugs to his code. Yet bugs still happen.

Of course. But my point is that a lot of the arguments I have seen that 'we cannot do that because the general problem is NP-complete' are irrelevant.

> But my point here is that determining the *exact* time that an object becomes garbage can be arbitrarily complex. Most of the time, I agree, it won't be.

The point is though: when it matters, the onus is on the programmer to make sure the design is able to determine an 'exact enough' time, using whatever tools the language may provide. In this case the problem is *necessarily* solvable (the assumption being the programmer checked that it could be done before starting).

So we don't need something that determines when *arbitrary* objects can't be referred to synchronously; we need something that will do that job only in cases that are simple enough to base a design on it. An obvious candidate is a pure stack protocol (automatic store) and another is a hierarchical system (ref count).

When there are circularities .. due to the complexity it would be unwise to count on synchronous finalisation, since it isn't clear when that is (by definition it is rather hard to determine ..)
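The 'pure stack protocol' mentioned above can be sketched in OCaml as a function that owns the resource for the duration of a callback. This is a minimal sketch, assuming OCaml 4.08 or later for `Fun.protect` (it postdates this thread); `with_in_file` is an illustrative name, and a plain file stands in for the process pipe of the original question.

```ocaml
(* Open [path], pass the channel to [f], and close it when [f] returns
   or raises: the channel's lifetime follows the call stack, so no
   finaliser and no garbage collector is involved. *)
let with_in_file (path : string) (f : in_channel -> 'a) : 'a =
  let ic = open_in path in
  Fun.protect ~finally:(fun () -> close_in_noerr ic) (fun () -> f ic)
```

A caller would write something like `with_in_file "notes.txt" input_line` (the file name is a placeholder); the exception-handling worry raised earlier in the thread is absorbed by `~finally`.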
> Once you allow for the fact that finalizers/destructors may not happen in a defined order or at defined times, why not go whole-hog?

Hard question. One job a language is often required to do is emulate the behaviour of a program written in another language .. that's not always easy to do when the object, type, allocation or execution models are different.

> In the most intractable cases, neither the compiler nor the programmer may be able to determine when the last reference is released. But even in the simple cases, the programmer may have the intent to release the resources and simply doesn't. It's called a bug.

I agree. Bugs happen. Perhaps I can give an example: in Java, variables must be initialised. Unlike Ocaml though, they do not have to be initialised at the point of declaration. Java requires instead 'manifest initialisation': it's a compromise between initialisation at the point of declaration like Ocaml, and arbitrary initialisation (onus on programmer) as in C++.

The language picked a constraint that is easy to check for both the compiler and for humans, which is more flexible than the Ocaml model but still assures freedom from uninitialised variable errors. It still can't handle complex cases, which can be done in C++, but they're rarely needed, because if they seem to be needed there is a good chance the design is flawed.

> > However, two things spring to mind. First: if close is a primitive operation, we need a high level way of not only tracking what to close, but also of doing that tracking efficiently.
> >
> > Secondly, since the resource is represented in memory, and much of the time the dependencies which are *not* represented in memory could be, using the code of the garbage collector (and not just the same algorithm), makes some sense.
>
> Why special-case close?

I wasn't intending to, just picked an example to talk about.
> It's what we've been talking about, but it could just as easily be "release a mutex", etc.

Agree.

> > You are making an incorrect assumption here. You're assuming that finalisers only release resources. Consider again the case where the final action in generating a document is to create a table of contents (knowing now that all chapters are seen).

> I wouldn't put that into a destructor. DON'T USE DESTRUCTORS FOR ENFORCING THE ORDER OF OPERATIONS.

Normally I wouldn't. But here there was no choice. You have to understand first that Interscript is a library with a driver program that looks like this:

    read a line
    if at end of file, delete user space
    else if the line starts with @, execute it
    otherwise if we're in tangle mode, write to current source file
    otherwise we have to be in document mode: write to current document
    repeat

Now, the user code looks like this:

    @h = tangler('src/x.ml')
    @select(h)
    print_endline "I'm an ocaml program"
    <EOF>

Do you see the problem? The open file lives in USER space. The language syntax does not require the user to close open files, because that would leave a dangling reference.

But the engine has no idea which things are files that need closing, and which are documents that need tables of contents.

The ONLY way to finalise each object correctly is in the destructor method (__del__ method). I could add a 'finalise' method to each object and call it, but that would not help -- the __del__ method is precisely that anyhow, and it gets invoked automatically, so it relieves me of the task of finding all the objects.

You can see that the problem arises from a design decision not to require the USER to close files.

> If it matters when the buffers are flushed, manually flush the buffers.

See above. It isn't always possible: the program doesn't always know which files are open.
This is quite typical in object oriented programs -- the program doesn't know what kinds of objects exist, so there is no master that can call the 'finalise' method. In C++ code, particularly GUI code, callbacks often lead to suicide ('delete this') simply because there is no one around to delete an object.

> > One thing is for sure .. I *hate* seeing
> > "program terminated with uncaught "Not_found" exception"
> > because I do hundreds of Hashtbl.find calls....

> There is a religious debate between returning 'a and throwing an exception if the item isn't found, or returning 'a option.

I don't think it is religious. There is a genuine problem here. I do not know how to state it exactly, but I'll try.

Checking error codes and popping the stack is not only tedious and obscures the normal control flow, it does so to such an extent in certain cases as to be totally and unequivocally out of the question. For example, in math calculations you simply cannot afford to check every single function call either before or after invocation:

    x = (a + b/c ^ d)

for example would turn into many lines of spaghetti. I will say: "the error detection is too localised" for this kind of code.

On the other hand, the uncaught exception of dynamic EH clearly shows that that mechanism leads to "the error detection is too global".

What alternatives are there?

One is to have exception specifications on functions, but that is known not to work very well. The first difficulty is that once you have them, they must become part of the type system. They then 'snowball' through the whole program (in the same way 'const' does to C programs).

It isn't possible to deduce what exceptions can be thrown when functions are passed as arguments to functions, so this pollution of the type system would also be manifest in code in the form of annotations .. basically higher order functions would be screwed completely by this.

So exception specifications are out for 'engineering' reasons.
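On the `Hashtbl.find` complaint quoted above, one middle ground is to handle the miss at the call sites that care, so it can never surface as an uncaught `Not_found`. This is a minimal sketch, assuming OCaml 4.05 or later for `Hashtbl.find_opt` (which did not exist at the time of this thread); `lookup_or` is a hypothetical helper.

```ocaml
(* Return the binding for [key], or [default] when there is none, so a
   missing key is dealt with here rather than escaping as Not_found. *)
let lookup_or (tbl : ('k, 'v) Hashtbl.t) (key : 'k) ~(default : 'v) : 'v =
  match Hashtbl.find_opt tbl key with
  | Some v -> v
  | None -> default
```

This is the 'a option side of the debate wrapped up so that the check is written once; it does nothing for the arithmetic example, where no sensible default exists.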
Another alternative is static exception handling. I tried that in Felix. It is very good for a wide class of problems. Static EH implies block structured code with the handler visible from the throw point... it's really a structured goto.

This solves many cases where we really wanted 'alternate control flow constructs' rather than error handling, but not all. And it doesn't work where a more dynamic transmission of error notifications is required.

What else? Continuations? Can monads be used? We really do need a mechanism with better locality (easily controlled scope).
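On the question of whether monads can be used: a result type with a bind operator keeps a formula such as `a + b/c ^ d` close to its mathematical shape while still making failure explicit. The following is a sketch only, assuming OCaml 4.08 or later for binding operators (`let*`), `Result.bind` and `Float.is_integer`; `safe_div`, `safe_pow`, `formula` and the error strings are invented for illustration, and reading the formula as a + b / (c ^ d) is an assumption about the intended precedence.

```ocaml
let ( let* ) = Result.bind

(* Division that reports a zero divisor instead of raising. *)
let safe_div (x : float) (y : float) : (float, string) result =
  if y = 0.0 then Error "division by zero" else Ok (x /. y)

(* Power that rejects a negative base with a fractional exponent. *)
let safe_pow (x : float) (y : float) : (float, string) result =
  if x < 0.0 && not (Float.is_integer y)
  then Error "negative base with fractional exponent"
  else Ok (x ** y)

(* a + b / (c ^ d): the first failing step short-circuits the rest. *)
let formula a b c d : (float, string) result =
  let* p = safe_pow c d in
  let* q = safe_div b p in
  Ok (a +. q)
```

The error checks live in the primitives, not in `formula`, so the detection is neither written out at every call nor left to an arbitrary outer handler; the scope of a failure is exactly the `let*` chain.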
* Re: [Caml-list] GC and file descriptors 2003-11-18 15:12 ` skaller @ 2003-11-18 16:49 ` Martin Berger 2003-11-18 17:46 ` skaller 2003-11-18 18:26 ` Benjamin Geer 2003-11-18 22:23 ` Brian Hurt 1 sibling, 2 replies; 92+ messages in thread From: Martin Berger @ 2003-11-18 16:49 UTC (permalink / raw) To: skaller; +Cc: Caml Mailing List

> What alternatives are there?
>
> One is to have exception specifications on functions, but that is known not to work very well. The first difficulty is that once you have them, they must become part of the type system. They then 'snowball' through the whole program (in the same way 'const' does to C programs).

but isn't this snowballing exactly what you want? you can think of exceptions and normal function returns as well-behaved value-passing gotos. but nobody wants to ignore intermediate types in function chaining. so why should only the functional but not the exception behaviour be constrained by types? the only difference between exceptions and function types is that

* for exceptions, the normal case is that we ignore the exception, i.e., all we do is pass it on, without touching it.

* for functions, the normal case is to take the returned value and do something with it.

i always wonder if the problem would simply disappear with more expressive typing systems that allow concise specification of the normal case for exceptions -- where a piece of code is just a conduit for exceptions -- and appropriate grouping of exceptions, for example by subtyping.

martin
* Re: [Caml-list] GC and file descriptors 2003-11-18 16:49 ` Martin Berger @ 2003-11-18 17:46 ` skaller 2003-11-19 1:33 ` Martin Berger 2003-11-18 18:26 ` Benjamin Geer 1 sibling, 1 reply; 92+ messages in thread From: skaller @ 2003-11-18 17:46 UTC (permalink / raw) To: Martin Berger; +Cc: Caml Mailing List

On Wed, 2003-11-19 at 03:49, Martin Berger wrote:

> but isn't this snowballing exactly what you want?

No. The problem is it breaks abstraction. In C++, the problem was for a templated higher order function .. which is common if you remember classes have methods .. there is no way to tell what exceptions a user defined function passed as an argument might throw. You cannot simply use a type constraint because it amounts to a restriction on the implementation.

For example consider

    map f list

which currently has signature

    ('a -> 'b) -> 'a list -> 'b list

Well, how do you account for a function such as:

    let f x = 1 / x

which might throw division by zero? The only real solution is to not use exception specifications at all in higher order functions... which makes them pretty useless in a language like C++ or Ocaml.

> you can think of exceptions and normal function returns as well-behaved value-passing gotos. but nobody wants to ignore intermediate types in function chaining. so why should only the functional but not the exception behaviour be constrained by types? the only difference between exceptions and function types is that
>
> * for exceptions, the normal case is that we ignore the exception, i.e., all we do is pass it on, without touching it.
>
> * for functions, the normal case is to take the returned value and do something with it.

The problem is that exceptions thrown are typically implementation details, so it would often be an error to include the exception type in the function signature.

In my own code (Felix) I am systematically changing which exceptions report errors -- initially I just called Failure.
Now I call exceptions that report source code locations... but this isn't really a change in the type of the function throwing such an error.

> i always wonder if problem would simply disappear with more expressive typing systems that allow concise specification of the normal case for exceptions -- where an piece of code is just a conduit for exceptions -- and appropriate grouping of exceptions, for example by subtyping.

Well, exceptions are 'really' wrong: they're 'really' a constraint on the type of the argument, for example

    divide: float -> float not zero -> float

but expressed negatively (throws divide by zero). We can actually do this now, sort of, using classes:

    class float_not_zero x = if x <> 0 then v := x else raise Invalid_argument

sort of thing. However, it is expensive (the best way to test if a matrix is singular is to invert it .. so what constraint can the inversion function have?) and it is generally TOO restrictive to want to transmit through the type system algebraically.

Normally, if you think it is important enough, you'd use a class to create a new type.
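The `float not zero` idea can be made concrete without classes. This is a minimal sketch that swaps in an abstract type with a checking constructor in place of the class suggested above; the module `Nonzero` and its functions are hypothetical names.

```ocaml
module Nonzero : sig
  type t                        (* abstract: only [make] can build one *)
  val make : float -> t option  (* None when the argument is zero *)
  val divide : float -> t -> float
end = struct
  type t = float
  let make x = if x = 0.0 then None else Some x
  (* No check needed here: the type guarantees a non-zero divisor. *)
  let divide x d = x /. d
end
```

The zero test is paid once, in `make`, and every later `divide` is unchecked. It also shows the cost the message points out: each such constraint needs its own type and its own constructor.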
* Re: [Caml-list] GC and file descriptors 2003-11-18 17:46 ` skaller @ 2003-11-19 1:33 ` Martin Berger 2003-11-19 3:19 ` Design by Contract, was " Brian Hurt ` (2 more replies) 0 siblings, 3 replies; 92+ messages in thread From: Martin Berger @ 2003-11-19 1:33 UTC (permalink / raw) To: skaller; +Cc: Caml Mailing List

> No. The problem is it breaks abstraction.

i disagree. the exceptions thrown are part of the specification the function tries to meet.

> For example consider
>
>     map f list
>
> which currently has signature
>
>     ('a -> 'b) -> 'a list -> 'b list

i don't see a problem here. simply use exception specification polymorphism: map has type

    forall A B E: (A -> B throws E) -> list[A] -> list[B] throws E.

while i have not thought about this in detail, i don't think there's a type theoretic problem with this.

> The problem is that exceptions thrown are typically implementation details so it would often be an error to include the exception type in the function signature.

i do not think that the exception thrown is an implementation detail.

> Well, exceptions are 'really' wrong: they're 'really' a constraint on the type of the argument, for example
>
>     divide: float -> float not zero -> float
>
> but expressed negatively (throws divide by zero).

that's one way of looking at it. another would be to say we have dependent types ... unfortunately neither rich specifications nor type dependencies lead to decidable type inference, so we need to be less precise.

martin
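OCaml has no `throws` clause, but the shape of the polymorphic signature above can be approximated by moving the error into the return type. This is a minimal sketch using the result-type encoding, not the effect typing proposed in this message; `map_result` is a hypothetical name.

```ocaml
(* Inferred type:
     ('a -> ('b, 'e) result) -> 'a list -> ('b list, 'e) result
   The error type 'e of [f] flows through to the type of the whole map,
   much as E does in the signature above. *)
let rec map_result f = function
  | [] -> Ok []
  | x :: rest ->
    (match f x with
     | Error e -> Error e
     | Ok y ->
       (match map_result f rest with
        | Error e -> Error e
        | Ok ys -> Ok (y :: ys)))
```

Type inference fills in `'e` without annotations, so for this encoding at least the polymorphism does not have to be written by hand at each use.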
* Design by Contract, was Re: [Caml-list] GC and file descriptors 2003-11-19 1:33 ` Martin Berger @ 2003-11-19 3:19 ` Brian Hurt 2003-11-19 2:57 ` Jacques Carette 2003-11-19 13:27 ` skaller 2003-11-19 16:54 ` Richard Jones 2 siblings, 1 reply; 92+ messages in thread From: Brian Hurt @ 2003-11-19 3:19 UTC (permalink / raw) To: Martin Berger; +Cc: skaller, Caml Mailing List

On Wed, 19 Nov 2003, Martin Berger wrote:

> > Well, exceptions are 'really' wrong: they're 'really' a constraint on the type of the argument, for example
> >
> >     divide: float -> float not zero -> float
> >
> > but expressed negatively (throws divide by zero).
>
> that's one way of looking at it. another would be to say we have dependent types ... unfortunately neither rich specifications nor type dependencies lead to decidable type inference so we need to be less precise.
>
> martin
>

This actually brings to mind another way to improve Ocaml: contracts, a la Eiffel. The problem in the above example is that the constraint that the second argument not be zero is a contract.

A classic example of a contract is Array.get, which requires the index to be >= 0 and < the length of the array. Being able to hoist this check out of Array.get can lead to non-trivial optimization opportunities. For example, consider the following code:

    for i = 0 to n do
        a.(i) <- 0
    done

This gets compiled like:

    for i = 0 to n do
        if i < Array.length a then
            a.(i) <- 0
        else
            raise (Invalid_argument "Array.get")
    done

Strength reduction can then be applied to eliminate the redundant checks:

    let limit = min n ((Array.length a) - 1) in
    for i = 0 to limit do
        a.(i) <- 0 (* no check needed! *)
    done;
    if n >= (Array.length a) then
        raise (Invalid_argument "Array.get")
    else
        ()

With arrays, you could simply declare them part of the language that the compiler knows about. But I'd like a more general approach.
--
"Usenet is like a herd of performing elephants with diarrhea -- massive, difficult to redirect, awe-inspiring, entertaining, and a source of mind-boggling amounts of excrement when you least expect it." - Gene Spafford

Brian
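The hoisting transformation in the message above can be written out by hand and compared against the checked loop. This is a minimal sketch done manually with `Array.unsafe_set`; it makes no claim about what the OCaml compiler actually does, and both function names are invented for illustration.

```ocaml
(* Checked version: every store goes through the bounds check. *)
let zero_prefix_checked (a : int array) (n : int) : unit =
  for i = 0 to n do
    a.(i) <- 0
  done

(* Hoisted version: clamp the loop bound once, store without checks,
   then raise if the original loop would have run off the end. *)
let zero_prefix_hoisted (a : int array) (n : int) : unit =
  let limit = min n (Array.length a - 1) in
  for i = 0 to limit do
    Array.unsafe_set a i 0
  done;
  if n >= Array.length a then invalid_arg "index out of bounds"
```

Both versions leave the array in the same state and raise `Invalid_argument` in the same situations, which is the observable behaviour a contract-aware optimiser would have to preserve.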
* Re: Design by Contract, was Re: [Caml-list] GC and file descriptors 2003-11-19 3:19 ` Design by Contract, was " Brian Hurt @ 2003-11-19 2:57 ` Jacques Carette 0 siblings, 0 replies; 92+ messages in thread From: Jacques Carette @ 2003-11-19 2:57 UTC (permalink / raw) To: Brian Hurt, Martin Berger; +Cc: skaller, Caml Mailing List

About contracts and dependent types: of course full dependent types are undecidable, but those are rarely needed? On the other hand, linear constraints over the integers are often needed, AND are not only fully decidable, there are both nice complexity results as well as practical algorithms [not both together :-( yet].

What puzzles me is that some decidable subset of dependent types is not part of any 'real' programming language (like Ocaml or Haskell). Certainly the Array code just posted is a nice example where having dependent types with only linear constraints over the integers as the 'extra' power is enough to resolve everything. In fact, it seems that almost all examples of dependent types that I have seen in the type theory literature are of this kind; counter-examples typically use polynomial constraints, which are undecidable in general.

Or am I missing something obvious that makes this much harder than it seems?

Jacques
* Re: [Caml-list] GC and file descriptors 2003-11-19 1:33 ` Martin Berger 2003-11-19 3:19 ` Design by Contract, was " Brian Hurt @ 2003-11-19 13:27 ` skaller 2003-11-19 14:41 ` Martin Berger 2003-11-19 16:54 ` Richard Jones 2 siblings, 1 reply; 92+ messages in thread From: skaller @ 2003-11-19 13:27 UTC (permalink / raw) To: Martin Berger; +Cc: Caml Mailing List

On Wed, 2003-11-19 at 12:33, Martin Berger wrote:

> i dont see a problem here. simply use exception specification polymorphism: map has type
>
>     forall A B E: (A -> B throws E) -> list[A] -> list[B] throws E.

OK, I'll have to think about that, I've not seen it before.

> i do not think that the exception thrown is an implementation detail

The problem is that in practice it very often is. In Felix, for example, I throw

    ClientError of location * string

a lot. The choice is arbitrary: it's used for a type error, a constraint violation, and even ill-formed syntax. I also throw

    ClientError2 of location * location * string

sometimes when I want to indicate two places in the source (such as when there is a duplicate definition).

I think that because exceptions are basically a generative kind, they really are an implementation detail, even if the constraint being violated is not.

To put this another way, exceptions do NOT reflect the error, they reflect the style of reporting it.
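The two exceptions named in this message can be written out to show why they describe a reporting style and not a kind of error. This is a sketch only: the `location` record is a hypothetical stand-in for whatever Felix actually uses, and `describe` is an invented helper.

```ocaml
(* [location] is a hypothetical stand-in for a compiler's source span. *)
type location = { file : string; line : int }

exception ClientError of location * string
exception ClientError2 of location * location * string

(* Both exceptions say how an error is reported, not which rule was
   broken, so a single handler can render either of them. *)
let describe (e : exn) : string =
  let show l = Printf.sprintf "%s:%d" l.file l.line in
  match e with
  | ClientError (l, msg) -> Printf.sprintf "%s: %s" (show l) msg
  | ClientError2 (l1, l2, msg) ->
    Printf.sprintf "%s and %s: %s" (show l1) (show l2) msg
  | other -> Printexc.to_string other
```

A type error and a syntax error would both arrive as `ClientError`, so a signature listing `ClientError` would tell a caller nothing about which constraint a function can violate.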
* Re: [Caml-list] GC and file descriptors 2003-11-19 13:27 ` skaller @ 2003-11-19 14:41 ` Martin Berger 0 siblings, 0 replies; 92+ messages in thread From: Martin Berger @ 2003-11-19 14:41 UTC (permalink / raw) To: skaller; +Cc: Caml Mailing List

skaller wrote:

> To put this another way, exceptions do NOT reflect the error, they reflect the style of reporting it.

yes, but does the style of reporting have to be exposed at the type level? if yes, then it probably is not just an implementation issue.

in my experience, getting error reporting/logging/program-self-monitoring right for non-toy programs is a hard problem and has serious ramifications throughout the whole design of the program, much like concurrency and memory management. the fact that we have a complicated mechanism (exceptions) to deal with this suggests that it should never be an afterthought and relegated to mere implementation details.

martin
* Re: [Caml-list] GC and file descriptors 2003-11-19 1:33 ` Martin Berger 2003-11-19 3:19 ` Design by Contract, was " Brian Hurt 2003-11-19 13:27 ` skaller @ 2003-11-19 16:54 ` Richard Jones 2003-11-19 17:18 ` Damien Doligez 2003-11-19 18:03 ` Martin Berger 2 siblings, 2 replies; 92+ messages in thread From: Richard Jones @ 2003-11-19 16:54 UTC (permalink / raw) Cc: Caml Mailing List

On Wed, Nov 19, 2003 at 01:33:22AM +0000, Martin Berger wrote:
> forall A B E: (A -> B throws E) -> list[A] -> list[B] throws E.
>
> while i have not thought about this in detail, i dont think there's a type theorectic problem with this.

Yes, all well and good, but I *do not* want to have to go and change all .mli files to support checked exceptions, and then go and change them all again when I decide to put a SQL database behind some persistence library deep in the code.

This is the problem with checked exceptions in Java: the set of exceptions that can be thrown is an implementation detail which is exposed unnecessarily through the API.

Rich.

--
Richard Jones. http://www.annexia.org/ http://freshmeat.net/users/rwmj
Merjis Ltd. http://www.merjis.com/ - improving website return on investment
'There is a joke about American engineers and French engineers. The American team brings a prototype to the French team. The French team's response is: "Well, it works fine in practice; but how will it hold up in theory?"'
* Re: [Caml-list] GC and file descriptors 2003-11-19 16:54 ` Richard Jones @ 2003-11-19 17:18 ` Damien Doligez 2003-11-19 21:45 ` Richard Jones 2003-11-19 18:03 ` Martin Berger 1 sibling, 1 reply; 92+ messages in thread From: Damien Doligez @ 2003-11-19 17:18 UTC (permalink / raw) To: Caml Mailing List

On Wednesday, November 19, 2003, at 05:54 PM, Richard Jones wrote:

> Yes, all well and good, but I *do not* want have to go and change all .mli files to support checked exceptions,

Big change in the language -> big changes in the programs. Fair enough.

> and then go and change them all again when I decide to put a SQL database behind some persistence library deep in the code.

Why would you need to do that? Your new implementation of the persistence library should use a few try...with constructs instead of changing the interface of the functions in an incompatible way.

> This is the problem with checked exceptions in Java: the set of exceptions that can be thrown is an implementation detail which is exposed unnecessarily through the API.

IMO it is part of the interface, just like the return type of the functions.

-- Damien
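The suggestion to 'use a few try...with constructs' can be sketched as follows. This is a minimal illustration with invented names throughout: `Sql_error`, `Persistence_failure`, `sql_load` and `load` are hypothetical, and `sql_load` merely simulates a database backend.

```ocaml
(* Hypothetical exception raised by a SQL backend deep in the code. *)
exception Sql_error of string

(* The one exception the persistence library documents in its interface. *)
exception Persistence_failure of string

(* Stand-in for the new database-backed implementation. *)
let sql_load (key : string) : string =
  if key = "" then raise (Sql_error "empty key") else "value-of-" ^ key

(* Public entry point: backend exceptions are translated here, so callers
   and their .mli files never see Sql_error. *)
let load (key : string) : string =
  try sql_load key with
  | Sql_error msg -> raise (Persistence_failure ("load: " ^ msg))
```

Swapping the backend changes only the body of `load`; the exception that callers handle, and any signature that mentions it, stays the same.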
* Re: [Caml-list] GC and file descriptors 2003-11-19 17:18 ` Damien Doligez @ 2003-11-19 21:45 ` Richard Jones 2003-11-19 23:09 ` Benjamin Geer 0 siblings, 1 reply; 92+ messages in thread
From: Richard Jones @ 2003-11-19 21:45 UTC (permalink / raw)
Cc: Caml Mailing List

> >This is the problem with checked exceptions in Java: the set of
> >exceptions that can be thrown is an implementation detail which is
> >exposed unnecessarily through the API.
>
> IMO it is part of the interface, just like the return type of the
> functions.

I think in academia you can say these things. But on sprawling real projects, badly managed and written by poorly skilled programmers, checked exceptions are a really bad idea. (Trust me on this one, I've worked on several such projects.)

http://www.mindview.net/Etc/Discussions/CheckedExceptions

I don't agree with his point of view on strong typing ... but then he's coming from a Java background, so what do you expect?

HOWEVER, if I don't have to write .mli files (i.e. if I don't have to tediously define what all my functions throw), then guess what: I think checked exceptions, inferred automatically by the compiler, could actually be a really GOOD idea.

But it looks like this would require a major change to the language, i.e. getting rid of .mli files altogether and adding the 'public' / 'abstract' keywords to the .ml files as described by, I think, Brian Hurt in another thread.

Rich.

--
Richard Jones. http://www.annexia.org/ http://freshmeat.net/users/rwmj
Merjis Ltd. http://www.merjis.com/ - improving website return on investment
"I wish more software used text based configuration files!" -- A Windows NT user, quoted on Slashdot.
* Re: [Caml-list] GC and file descriptors 2003-11-19 21:45 ` Richard Jones @ 2003-11-19 23:09 ` Benjamin Geer 2003-11-20 0:50 ` Nicolas Cannasse 0 siblings, 1 reply; 92+ messages in thread From: Benjamin Geer @ 2003-11-19 23:09 UTC (permalink / raw) To: Caml Mailing List; +Cc: rich Richard Jones wrote: >>>This is the problem with checked exceptions in Java: the set of >>>exceptions that can be thrown is an implementation detail which is >>>exposed unnecessarily through the API. >> >>IMO it is part of the interface, just like the return type of the >>functions. > > I think in academia you can say these things. But on the sprawling > real projects, badly managed, written by poorly skilled programmers, > checked exceptions are a really bad idea. (Trust me on this one, I've > worked on several such projects). I design moderately large Java projects for a living, and I think checked exceptions are the only thing that forces poorly skilled and indifferent programmers to do any error handling at all. When a designer establishes a sensible policy regarding exceptions, checked exceptions make it possible to be reasonably confident that (nearly) all error conditions are handled in a sane manner. On a large project, this is a precious advantage. There's nothing worse than using a library without having any idea what exceptions it might throw. It's like playing Russian roulette. Ben ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [Caml-list] GC and file descriptors 2003-11-19 23:09 ` Benjamin Geer @ 2003-11-20 0:50 ` Nicolas Cannasse 2003-11-20 9:42 ` Benjamin Geer 0 siblings, 1 reply; 92+ messages in thread From: Nicolas Cannasse @ 2003-11-20 0:50 UTC (permalink / raw) To: Benjamin Geer, Caml Mailing List; +Cc: rich > >>>This is the problem with checked exceptions in Java: the set of > >>>exceptions that can be thrown is an implementation detail which is > >>>exposed unnecessarily through the API. > >> > >>IMO it is part of the interface, just like the return type of the > >>functions. > > > > I think in academia you can say these things. But on the sprawling > > real projects, badly managed, written by poorly skilled programmers, > > checked exceptions are a really bad idea. (Trust me on this one, I've > > worked on several such projects). > > I design moderately large Java projects for a living, and I think > checked exceptions are the only thing that forces poorly skilled and > indifferent programmers to do any error handling at all. When a > designer establishes a sensible policy regarding exceptions, checked > exceptions make it possible to be reasonably confident that (nearly) all > error conditions are handled in a sane manner. On a large project, this > is a precious advantage. > > There's nothing worse than using a library without having any idea what > exceptions it might throw. It's like playing Russian roulette. It can work in an opposite way. I've seen developpers (professional) that couldn't break the specification by adding an throws statement so they were simply catching .... and ignoring the exceptions ! There's nothing worse than using a library without having any idea what error happened inside it . 
It's like playing Seattle roulette :-)

Nicolas Cannasse
* Re: [Caml-list] GC and file descriptors 2003-11-20 0:50 ` Nicolas Cannasse @ 2003-11-20 9:42 ` Benjamin Geer 0 siblings, 0 replies; 92+ messages in thread From: Benjamin Geer @ 2003-11-20 9:42 UTC (permalink / raw) To: Nicolas Cannasse; +Cc: Caml Mailing List, rich Nicolas Cannasse wrote: >>When a >>designer establishes a sensible policy regarding exceptions, checked >>exceptions make it possible to be reasonably confident that (nearly) all >>error conditions are handled in a sane manner. > > It can work in an opposite way. > I've seen developpers (professional) that couldn't break the specification > by adding an throws statement so they were simply catching .... and ignoring > the exceptions ! Note what I said above about the designer establishing 'a sensible policy'. In my team, catching and ignoring exceptions is punishable by the death penalty. :) If an interface doesn't allow you to handle an exception properly, it needs to be changed. Ben ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [Caml-list] GC and file descriptors 2003-11-19 16:54 ` Richard Jones 2003-11-19 17:18 ` Damien Doligez @ 2003-11-19 18:03 ` Martin Berger 1 sibling, 0 replies; 92+ messages in thread
From: Martin Berger @ 2003-11-19 18:03 UTC (permalink / raw)
To: Caml Mailing List

Richard Jones wrote:

>> forall A B E: (A -> B throws E) -> list[A] -> list[B] throws E.

> Yes, all well and good, but I *do not* want have to go and change all
> .mli files to support checked exceptions, and then go and change them
> all again when I decide to put a SQL database behind some persistence
> library deep in the code.

for a start, the universal quantification above (for a map function) deals with this just fine. it can handle whatever you throw.

i do think however, that there is a problem with declaring the exceptions a function can throw. i think what the type should really declare is something like the exceptions the syntax of the function body *adds*. eg a function

    let f n =
      let g = send_to_network n in
      let h = write_to_DB g in
      if n > 17 then throw Micky_Mouse else 666

should be typable as

    f : nat -> nat adds Micky_Mouse

and not *just* as

    f : nat -> nat throws E, Micky_Mouse

where E is whatever send_to_network and write_to_DB throw. one compromise may be to add an explicit exception spec grabbing operator so we can write

    f : nat -> nat throws ex(send_to_network), ex(write_to_DB), Micky_Mouse

or

    f : nat -> nat adds ex(send_to_network), ex(write_to_DB), Micky_Mouse

where in the latter case ex( f ) gives all the exceptions f adds while in the former case it returns all exceptions f throws. note that at link time, when we put all modules together, the accumulated "adds" information determines the usual "throws" information. of course "adds" and "throws" can live harmoniously together.

> This is the problem with checked exceptions in Java: the set of
> exceptions that can be thrown is an implementation detail which is
> exposed unnecessarily through the API.

again, i don't think this is an implementation detail. if you are a library vendor, i as a customer want to know if your library throws SQL exceptions or not.

martin
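A rough analogue of the "adds" idea can be sketched in present-day OCaml, which postdates this discussion, by returning errors in the result type with polymorphic variant tags instead of raising exceptions. This is a swapped-in technique, not the typing discipline Martin proposes. The stub bodies and the tags Network_down and Db_full are invented for illustration; only Micky_Mouse is written in f itself, and the remaining tags are accumulated by inference.

```ocaml
(* Stubs standing in for the functions named in the example above;
   their bodies and error tags are invented. *)
let send_to_network n = if n < 0 then Error `Network_down else Ok (n + 1)
let write_to_db g = if g > 100 then Error `Db_full else Ok (g * 2)

(* f itself only "adds" `Micky_Mouse; errors from the callees are
   passed through, and their tags end up in f's inferred type. *)
let f n =
  match send_to_network n with
  | Error e -> Error e
  | Ok g ->
    (match write_to_db g with
     | Error e -> Error e
     | Ok h -> if n > 17 then Error `Micky_Mouse else Ok h)
```

With these definitions the compiler should infer a type for f along the lines of int -> (int, [> `Db_full | `Micky_Mouse | `Network_down ]) result, without any of the three tags being declared anywhere.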
* Re: [Caml-list] GC and file descriptors 2003-11-18 16:49 ` Martin Berger 2003-11-18 17:46 ` skaller @ 2003-11-18 18:26 ` Benjamin Geer 2003-11-18 19:24 ` Xavier Leroy 2003-11-19 1:33 ` Martin Berger 1 sibling, 2 replies; 92+ messages in thread From: Benjamin Geer @ 2003-11-18 18:26 UTC (permalink / raw) To: Caml Mailing List Martin Berger wrote: >> What alternatives are there? >> One is to have exception specifications on functions, >> but that is known not to work very well. [...] > > but isn't this snowballing exactly what you want? I think it is. It's very reassuring to know that the compiler can tell me whether I've left any exceptions unhandled, just as it can tell me whether I've neglected to provide a suitable return value for a function. From experience working on fairly large programs in Java, I can say (at the risk of being pelted with stones on this list) that I think the way Java handles this works pretty well. You can avoid having any methods specify more than two or three exceptions by using hierarchies of exception subtypes (e.g. IOException has subtypes FileNotFoundException, SocketException and so on) and by using nested exception objects (e.g. a FooSubsystemException can contain an instance of any other exception, and can thus be handled by a method that only specifies FooSubsystemException). Nested exceptions have the useful property that when you get a stack trace from an exception (e.g. in order to log it), it recursively includes the stack traces of any nested exceptions. In Caml, as in C++, I'm left with a lingering anxiety about what exceptions might be thrown (particularly by libraries, including the standard libraries) but not handled except by a catch-all 'unhandled exception handler', at which point it's too late to do anything useful with them. (And Caml exceptions lack stack traces.) 
Annoying problems arise in Java with unchecked exceptions; things like IndexOutOfBoundsException (which can be thrown by any array access) or ArithmeticException (e.g. division by zero) don't have to be declared in exception specifications, and therefore never are. Bugs often result in programs crashing with an unhandled NullPointerException (which of course can't happen in Caml). Ideally, the number of possible unchecked exceptions should be kept to an absolute minimum; I think there are too many in Java. I wish I knew what the ideal solution was, but I think Caml could do worse than to implement a Java-like approach. It seems to me that this would be more consistent with Caml's overall focus on type safety than its current C++-like approach. > i always wonder if problem would simply disappear with more > expressive typing systems that allow concise specification > of the normal case for exceptions -- where an piece of code is > just a conduit for exceptions -- and appropriate grouping of > exceptions, for example by subtyping. If the type of a function included its exception specification, could Caml infer exception specifications? If so, perhaps exception specifications could be added to the language without breaking backwards compatibility. If I wrote this: let divide x y = x / y ;; let do_work x y = divide x y ;; the type of both functions would be inferred as having an exception specification containing Division_by_zero. Now suppose I wrote the following (meaning that the function do_work explicitly specifies the exception Sys_error): let do_work x [ Sys_error ] = let z = (* ... *) in divide x z ;; I would get a compile error, because I should have written: let do_work x [ Sys_error; Division_by_zero ] = let z = (* ... 
*) in divide x z ;; When using libraries that were written before the introduction of exception specifications, I could verify that all library exceptions were handled, by calling a library function in the following way: let do_work x [] = (* Call some library functions that don't have explicit exception specifications *) ;; The compiler would then tell me which exceptions I'd failed to handle. Does this seem feasible? Ben ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners ^ permalink raw reply [flat|nested] 92+ messages in thread
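For comparison, in OCaml as it stands the divide example type-checks with no mention of Division_by_zero in any type, and it is up to the caller to know that a handler is needed. A small sketch; safe_divide is a hypothetical helper, not something proposed in the thread.

```ocaml
(* Both functions are inferred as int -> int -> int; the possible
   Division_by_zero is invisible in the types. *)
let divide x y = x / y
let do_work x y = divide x y

(* One way to surface the failure case in the type today: catch the
   exception and return an option instead. *)
let safe_divide x y =
  try Some (divide x y) with Division_by_zero -> None
```

Here the option type plays the role that an inferred exception specification would play in the proposal: a caller of safe_divide cannot ignore the None case without the compiler warning about an inexhaustive match.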
* Re: [Caml-list] GC and file descriptors 2003-11-18 18:26 ` Benjamin Geer @ 2003-11-18 19:24 ` Xavier Leroy 2003-11-18 23:49 ` Benjamin Geer 2003-11-19 1:36 ` Martin Berger 2003-11-19 1:33 ` Martin Berger 1 sibling, 2 replies; 92+ messages in thread From: Xavier Leroy @ 2003-11-18 19:24 UTC (permalink / raw) To: Benjamin Geer; +Cc: Caml Mailing List > If the type of a function included its exception specification, > could Caml infer exception specifications? Yes, with the proviso that you need a fairly sophisticated exception analysis to get enough precision in practice. See for instance the PhD work of my former student, François Pessaux: François Pessaux and Xavier Leroy. Type-based analysis of uncaught exceptions. ACM Transactions on Programming Languages and Systems, 22(2):340-377, 2000. http://pauillac.inria.fr/~xleroy/publi/exceptions-toplas.ps.gz - Xavier Leroy ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [Caml-list] GC and file descriptors 2003-11-18 19:24 ` Xavier Leroy @ 2003-11-18 23:49 ` Benjamin Geer 2003-11-19 1:36 ` Martin Berger 1 sibling, 0 replies; 92+ messages in thread From: Benjamin Geer @ 2003-11-18 23:49 UTC (permalink / raw) To: Xavier Leroy; +Cc: Caml Mailing List Xavier Leroy wrote: >>If the type of a function included its exception specification, >>could Caml infer exception specifications? > > Yes, with the proviso that you need a fairly sophisticated exception > analysis to get enough precision in practice. See for instance the > PhD work of my former student, François Pessaux: > > François Pessaux and Xavier Leroy. Type-based analysis of uncaught > exceptions. ACM Transactions on Programming Languages and Systems, > 22(2):340-377, 2000. > http://pauillac.inria.fr/~xleroy/publi/exceptions-toplas.ps.gz I've just read this paper, and it looks like very promising work. I wholeheartedly agree with the presentation of the issues in the Introduction, which makes two very important points: (1) 'Our experience with large ML applications is that uncaught exceptions are the most frequent mode of failure.' (2) 'Declaring escaping exceptions in functions and method signatures works well in first-order, monomorphic programs, but is not adequate for the kind of higher-order, polymorphic programming that ML promotes.' (As the article points out, this problem comes up in Java as well; in an implementation of the Command pattern, it's difficult not to define the execute() method of the Command interface as being able to throw any exception.) The paper makes a convincing case for inference as a better approach. Has the work described in this paper been continued? Are there any plans to integrate it, or something like it, into Caml? 
Ben
* Re: [Caml-list] GC and file descriptors 2003-11-18 19:24 ` Xavier Leroy 2003-11-18 23:49 ` Benjamin Geer @ 2003-11-19 1:36 ` Martin Berger 2003-11-19 2:28 ` Nicolas Cannasse ` (2 more replies) 1 sibling, 3 replies; 92+ messages in thread
From: Martin Berger @ 2003-11-19 1:36 UTC (permalink / raw)
To: Caml Mailing List

> François Pessaux and Xavier Leroy. Type-based analysis of uncaught
> exceptions. ACM Transactions on Programming Languages and Systems,
> 22(2):340-377, 2000.
> http://pauillac.inria.fr/~xleroy/publi/exceptions-toplas.ps.gz

i haven't had time to give that paper the proper read it deserves, but the following caught my eye:

"to deal properly with higher-order functions, a very rich language for exception declarations is required, including at least exception polymorphism (variables ranging over sets of exceptions) and unions of exception sets [...]. we believe that such a complex language for declaring escaping exceptions is beyond what programmers are willing to tolerate."

i'd be happy to have such a language available. anyway, union of exception sets is already present in java, the only novelty would be to abstract over exception sets. but that isn't really formally very different from normal existential or universal quantification.

in addition, i assume the programmer does not have to specify exception sets because they can be inferred (in most cases?). in that, exception sets should not be different from the other typing information. in practice, the programmer would use explicit exception specifications only at module boundaries.

one of the key problems with exception specifications is of course that a single change somewhere in a program may trigger heaps of other code becoming untypable. i can imagine that a simple compiler switch for turning off exception specification checking during development would take away much of the pain here.

martin
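The "variables ranging over sets of exceptions" in the quoted passage have a loose counterpart in present-day OCaml if errors are carried in the result type rather than raised. The sketch below is an assumption-laden illustration, not the system from the paper: map_result and parse are hypothetical names, and the type variable 'e stands in for the exception-set variable E of a map-like function.

```ocaml
(* Inferred type:
   ('a -> ('b, 'e) result) -> 'a list -> ('b list, 'e) result
   The error type 'e is whatever the argument function can produce. *)
let rec map_result f = function
  | [] -> Ok []
  | x :: rest ->
    (match f x with
     | Error e -> Error e
     | Ok y ->
       (match map_result f rest with
        | Error e -> Error e
        | Ok ys -> Ok (y :: ys)))

(* Example argument function whose error type is string. *)
let parse s =
  match int_of_string_opt s with
  | Some n -> Ok n
  | None -> Error ("not a number: " ^ s)
```

Nothing in map_result mentions a concrete error; it is a pure conduit, and the programmer never writes the quantification down because it is inferred.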
* Re: [Caml-list] GC and file descriptors 2003-11-19 1:36 ` Martin Berger @ 2003-11-19 2:28 ` Nicolas Cannasse 2003-11-19 3:26 ` Brian Hurt 2003-11-19 13:33 ` skaller 2 siblings, 0 replies; 92+ messages in thread From: Nicolas Cannasse @ 2003-11-19 2:28 UTC (permalink / raw) To: Martin Berger, Caml Mailing List > > François Pessaux and Xavier Leroy. Type-based analysis of uncaught > > exceptions. ACM Transactions on Programming Languages and Systems, > > 22(2):340-377, 2000. > > http://pauillac.inria.fr/~xleroy/publi/exceptions-toplas.ps.gz > > i havn't had time to give that paper the proper read it deserves, > but the following caught my eye: [...] I haven't read the paper either, but I was wondering that maybe static exceptions inference would also permit exception polymorphism... Just my two cents... Nicolas Cannasse ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [Caml-list] GC and file descriptors 2003-11-19 1:36 ` Martin Berger 2003-11-19 2:28 ` Nicolas Cannasse @ 2003-11-19 3:26 ` Brian Hurt 2003-11-19 11:44 ` Martin Berger 2003-11-19 13:33 ` skaller 2 siblings, 1 reply; 92+ messages in thread From: Brian Hurt @ 2003-11-19 3:26 UTC (permalink / raw) To: Martin Berger; +Cc: Caml Mailing List On Wed, 19 Nov 2003, Martin Berger wrote: > one of the key problems with exceptions specifications is of course > that a single change somewhere in a program may trigger heaps of > other code becoming untypable. i can imagine that a simple compiler > switch for turning off exception specification checking during > development would take away much of the pain here. The single change the programmer would have to make in this case is to add a new error case that is not being handled. In which case the compiler is being nice and telling you all the places where you need to think about how to handle this new error case. The biggest problems I see is that there are a number of places in Ocaml where the programmer still has to spell out the types of things- .mli files being the obvious example. -- "Usenet is like a herd of performing elephants with diarrhea -- massive, difficult to redirect, awe-inspiring, entertaining, and a source of mind-boggling amounts of excrement when you least expect it." - Gene Spafford Brian ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [Caml-list] GC and file descriptors 2003-11-19 3:26 ` Brian Hurt @ 2003-11-19 11:44 ` Martin Berger 2003-11-19 17:29 ` Brian Hurt 0 siblings, 1 reply; 92+ messages in thread
From: Martin Berger @ 2003-11-19 11:44 UTC (permalink / raw)
To: Brian Hurt; +Cc: Caml Mailing List

Brian Hurt wrote:

> The single change the programmer would have to make in this case is to add
> a new error case that is not being handled. In which case the compiler is
> being nice and telling you all the places where you need to think about
> how to handle this new error case.

this can be immensely useful, but also very infuriating, depending on where you are in the software development cycle. imagine having 100000 lines of code, mostly mature, and you are trying to track down a little bug. for that you want to see with what arguments the function

    let f m n =
        body;;

is called. assume that function has the type

    f : int -> ( int -> A throws E ) throws E

so for debugging you modify f to

    let f m n =
        print_debug "calling f with arguments " m n;
        body

if print_debug may throw something not in E and if f is used all over your code, you will have to make an enormous number of changes (and later revert them) just to get a silly little debugging mechanism going. i would hate having to do this. being able to switch off exception checking would be a great help in this situation.

being able to switch on or off exception checking is just an instance of a more general phenomenon where you run different checks on your software independently of each other. i expect future compilers to be more flexible in this regard, maybe offering plug-in typing systems from untyped to fully fledged dependent types and proof annotations.

martin
* Re: [Caml-list] GC and file descriptors 2003-11-19 11:44 ` Martin Berger @ 2003-11-19 17:29 ` Brian Hurt 2003-11-20 5:17 ` skaller 0 siblings, 1 reply; 92+ messages in thread
From: Brian Hurt @ 2003-11-19 17:29 UTC (permalink / raw)
To: Martin Berger; +Cc: Caml Mailing List

On Wed, 19 Nov 2003, Martin Berger wrote:

> imagine having 100000 lines of code, mostly mature, and you are trying to
> track down a little bug. for that you want to see with what arguments the
> function
>
> let f m n =
> body;;
>
> is called. assume that function has the type
>
> f : int -> ( int -> A throws E ) throws E
>
> so for debugging you modify f to
>
> let f m n =
> print_debug "calling f with arguments " m n;
> body
>
> if print_debug may throw something not in E and if f is used all over
> your code, you will have make an enourmous of changes (and later revert
> them) just to get a silly little debugging mechanism going. i would hate
> having to do this. being able to switch off exception would be a great
> help in this situation.

If calling print_debug adds an error condition (i.e. can throw an exception), then you have two choices:

1) Fix print_debug so it doesn't throw an exception,
2) Do the following instead:

    let f m n =
        (try
            print_debug "calling f with arguments " m n
        with
            _ -> ());
        body

I'd recommend #1 myself. Debugging code should not have any effect on the program (otherwise, you are opening yourself up to heisenbugs, where the program works correctly with debugging turned on, and fails with debugging turned off).

>
> being able to switch on or off exception checking is just an instance of
> a more general phenomenon where you run different checks on your software
> independently of each other. i expect future compilers to be more flexible
> in this regard, maybe offering plug-in typing systems from untyped to
> fully fledged dependent types and proof annotations.
>

The problem with this is that then everyone immediately turns exception checking off and the value of the feature is greatly reduced (at best).

--
"Usenet is like a herd of performing elephants with diarrhea -- massive, difficult to redirect, awe-inspiring, entertaining, and a source of mind-boggling amounts of excrement when you least expect it." - Gene Spafford

Brian
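A note on option 2, the try...with wrapper around print_debug. In OCaml a handler arm extends as far to the right as it can, so in a wrapper of the shape try print_debug ... with _ -> (); body the trailing "; body" would be read as part of the handler and would run only when an exception was raised. Parenthesising the try avoids that. A minimal sketch, assuming a hypothetical print_debug that can fail and a trivial stand-in for body:

```ocaml
(* Hypothetical debug printer that can fail: it raises on a negative
   first argument, otherwise prints to stderr. *)
let print_debug msg m n =
  if m < 0 then failwith "print_debug: negative argument"
  else prerr_endline (Printf.sprintf "%s %d %d" msg m n)

let f m n =
  (* The parentheses end the handler before the rest of the function. *)
  (try print_debug "calling f with arguments" m n with _ -> ());
  m + n   (* stands in for the real body *)
```

A catch-all handler also swallows exceptions one would usually prefer to see, such as Out_of_memory or Stack_overflow, which is one more argument for option 1.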
* Re: [Caml-list] GC and file descriptors 2003-11-19 17:29 ` Brian Hurt @ 2003-11-20 5:17 ` skaller 2003-11-20 16:13 ` Brian Hurt 0 siblings, 1 reply; 92+ messages in thread From: skaller @ 2003-11-20 5:17 UTC (permalink / raw) To: Brian Hurt; +Cc: Martin Berger, Caml Mailing List On Thu, 2003-11-20 at 04:29, Brian Hurt wrote: > On Wed, 19 Nov 2003, Martin Berger wrote: > you are opening yourself up to heisenbugs, where the > program works correctly with debugging turned on, and fails with debugging > turned off). OH! Where did that wonderful name heisenbugs come from? ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [Caml-list] GC and file descriptors 2003-11-20 5:17 ` skaller @ 2003-11-20 16:13 ` Brian Hurt 0 siblings, 0 replies; 92+ messages in thread From: Brian Hurt @ 2003-11-20 16:13 UTC (permalink / raw) To: skaller; +Cc: Martin Berger, Caml Mailing List On 20 Nov 2003, skaller wrote: > On Thu, 2003-11-20 at 04:29, Brian Hurt wrote: > > On Wed, 19 Nov 2003, Martin Berger wrote: > > > you are opening yourself up to heisenbugs, where the > > program works correctly with debugging turned on, and fails with debugging > > turned off). > > OH! Where did that wonderful name heisenbugs come from? > I got it from the Jargon file. -- "Usenet is like a herd of performing elephants with diarrhea -- massive, difficult to redirect, awe-inspiring, entertaining, and a source of mind-boggling amounts of excrement when you least expect it." - Gene Spafford Brian ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [Caml-list] GC and file descriptors 2003-11-19 1:36 ` Martin Berger 2003-11-19 2:28 ` Nicolas Cannasse 2003-11-19 3:26 ` Brian Hurt @ 2003-11-19 13:33 ` skaller 2003-11-19 17:01 ` Richard Jones 2003-11-19 17:43 ` [Caml-list] GC and file descriptors Brian Hurt 2 siblings, 2 replies; 92+ messages in thread
From: skaller @ 2003-11-19 13:33 UTC (permalink / raw)
To: Martin Berger; +Cc: Caml Mailing List

On Wed, 2003-11-19 at 12:36, Martin Berger wrote:
> > François Pessaux and Xavier Leroy. Type-based analysis of uncaught
> > exceptions. ACM Transactions on Programming Languages and Systems,
> > 22(2):340-377, 2000.
> > http://pauillac.inria.fr/~xleroy/publi/exceptions-toplas.ps.gz
>
> i havn't had time to give that paper the proper read it deserves,
> but the following caught my eye:
>
> "to deal properly with higher-order functions, a very rich language
> for exception declarations is required, including at least exception
> polymorphism (variables ranging over sets of exceptions) and unions
> of exception sets [...]. we believe that such a complex language for
> declaring escaping exceptions is beyond what programmers are willing
> to tolerate."
>
> i'd be happy to have such a language available.

That is too weak. In a language like Ocaml with separate interfaces there is no choice. You can't infer the exceptions a function throws from its interface, only from its body.

So either you give up explicit interfaces (mli files, signatures, etc) or you have to have an extension to the type system terms which allows you to state the exception spec.

A compromise might be: give up declared interfaces. However, allow constraints to be stated on the inferred ones. I have no idea what that would look like though.
* Re: [Caml-list] GC and file descriptors 2003-11-19 13:33 ` skaller @ 2003-11-19 17:01 ` Richard Jones 2003-11-22 2:39 ` [Caml-list] AutoMLI (Was: GC and file descriptors) Jim 2003-11-19 17:43 ` [Caml-list] GC and file descriptors Brian Hurt 1 sibling, 1 reply; 92+ messages in thread From: Richard Jones @ 2003-11-19 17:01 UTC (permalink / raw) Cc: Caml Mailing List On Thu, Nov 20, 2003 at 12:33:34AM +1100, skaller wrote: > A compromise might be: give up declared interfaces. Couldn't .mli files be done away with mostly with a few extra keywords in the language, eg: public let my_function_which_i_want_exported = ... abstract type t = { hidden implementation } It's a bit of a flaw in the language that you have all this lovely type inference working for you, but then you have to go and declare your types anyway in the .mli files. Rich. -- Richard Jones. http://www.annexia.org/ http://freshmeat.net/users/rwmj Merjis Ltd. http://www.merjis.com/ - improving website return on investment "My karma ran over your dogma" ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners ^ permalink raw reply [flat|nested] 92+ messages in thread
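For reference, the effect of the proposed 'public' and 'abstract' keywords is what a signature provides in OCaml as it is, and a signature can be written inline next to the implementation instead of in a separate .mli file. A sketch with a hypothetical Counter module; the names are invented for illustration.

```ocaml
module Counter : sig
  type t                     (* abstract: representation hidden *)
  val create : unit -> t     (* exported *)
  val incr : t -> t          (* exported *)
  val value : t -> int       (* exported *)
end = struct
  type t = { count : int }
  let create () = { count = 0 }
  let bump n = n + 1         (* not in the signature, so private *)
  let incr c = { count = bump c.count }
  let value c = c.count
end
```

The cost Richard points at remains: the types in the signature still have to be spelled out by hand even though the compiler infers them for the struct.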
* [Caml-list] AutoMLI (Was: GC and file descriptors) 2003-11-19 17:01 ` Richard Jones @ 2003-11-22 2:39 ` Jim 0 siblings, 0 replies; 92+ messages in thread From: Jim @ 2003-11-22 2:39 UTC (permalink / raw) To: Caml Mailing List

On Wed, Nov 19, 2003 at 05:01:39PM +0000, Richard Jones wrote:
> It's a bit of a flaw in the language that you have all this lovely
> type inference working for you, but then you have to go and declare
> your types anyway in the .mli files.

It occurred to me that this should be fairly easy to fix using some kind of preprocessor, so this afternoon I had a go at throwing one together.

http://draco.dyndns.org/~jim/files/automli-0.1.tar.gz

Using this program, you write foo.ml, then create a foo.mla file containing something like:

  export none ;
  export val foo ;
  export type bar ;
  export type qux ;
  export typedef qux ;

Then running

  automli foo >foo.mli

will generate an interface containing the definition for function foo, the abstract type bar, and the concrete type qux, all taken automagically from foo.ml.

If you prefer to say which things you DON'T want exported, instead of those you do, you can write a foo.mla file like this:

  export all ;
  hide typedef bar ;

It should handle values, types, classes, class types, and exceptions. It also works for modules and module types, except there is no way to specify what from a module declaration should be exported (i.e., the whole thing gets exported).

It also works for the revised syntax. Use automlir instead of automli.

Note, this program is in the "very dodgy hack" category, and I haven't tested it in any real-life situations. I knew nothing about camlp4 before today (nor do I know much about it now ;) ), so I have no idea if what I have done is sane, except that it seems to work for me on my simple test data, using O'Caml 3.07. There are many bugs, I'm sure. For example, it doesn't do any kind of error or sanity checking.

I'm going to play with this program in my own project, as I am often annoyed by the need to keep my mli files in sync with my ml files, and if it turns out to be useful I may develop it further. (Unless anyone else feels inspired to do it for me. :) )

Feedback welcome. What can be exported in an mli file that I have missed? There are definitely more declarations in MLast, but as it isn't documented, I don't know what they are.

Can anyone tell me how I can get pr_r.cmo to send output to a file, instead of stdout?

Regards,
Jim
* Re: [Caml-list] GC and file descriptors 2003-11-19 13:33 ` skaller 2003-11-19 17:01 ` Richard Jones @ 2003-11-19 17:43 ` Brian Hurt 2003-11-20 5:05 ` skaller 1 sibling, 1 reply; 92+ messages in thread From: Brian Hurt @ 2003-11-19 17:43 UTC (permalink / raw) To: skaller; +Cc: Martin Berger, Caml Mailing List

On 20 Nov 2003, skaller wrote:
> So either you give up explicit interfaces (mli files,
> signatures, etc) or you have to have an extension
> to the type system terms which allows you to state the
> exception spec.

I'd consider this a bonus :-). There are two obvious ways you could do this:

1) introduce a new keyword 'public'. Only types and values declared public are exported, everything else is considered internal to the file. So you could have:

  type public foo = This | That ;; (* exported *)
  type bar = Which | What ;; (* not exported *)
  let quux = ... ;; (* not exported *)
  let public baz = ... ;; (* exported *)

2) introduce a new keyword 'private'. Everything except those types and values marked private is exported.

Note that we're breaking working code here.

--
"Usenet is like a herd of performing elephants with diarrhea -- massive, difficult to redirect, awe-inspiring, entertaining, and a source of mind-boggling amounts of excrement when you least expect it."
 - Gene Spafford
Brian
* Re: [Caml-list] GC and file descriptors 2003-11-19 17:43 ` [Caml-list] GC and file descriptors Brian Hurt @ 2003-11-20 5:05 ` skaller 0 siblings, 0 replies; 92+ messages in thread From: skaller @ 2003-11-20 5:05 UTC (permalink / raw) To: Brian Hurt; +Cc: Martin Berger, Caml Mailing List

On Thu, 2003-11-20 at 04:43, Brian Hurt wrote:
> 2) introduce a new keyword 'private'. Everything except those types and
> values marked private is exported.
>
> Note that we're breaking working code here.

I do not see how, except that the keyword private is already used :-)

I suggested previously that a third keyword would be useful, 'abstract', which is used like:

  abstract type x = y;

which generates the interface

  type x;

I think this is called "limited private" in Ada?

This does not break anything (apart from stealing some identifiers as keywords). The assumption is that there is only an .ml file, no .mli file: if you don't use the private or abstract keyword, the result is the same as it is now. If you do, you're using an extension which is not well formed syntactically at present, and so no existing code can be broken.

The only real difficulty here is: "what if there is an .mli file that disagrees with the .ml file?"

This can already happen: a constraint on a function type, or a function that isn't in the interface at all. But what if a function is marked private in the ml file but exists in the .mli file, implying public?

I guess the best answer is: it's an error. [Same if a type is marked abstract]

As a matter of interest, it may be that the following is useful in an interface:

  abstract type x = int;

and what that means is: the typing is

  type x;

BUT the compiler may use the 'secret' knowledge that it's actually an int for optimisation.

Something similar to this already happens for classes: you have to give implementation details (data members of a class) in mli files: they're ignored by the typing I think, but needed for code generation.

The mechanism being discussed here may make it possible to write a large class of programs or libraries without bothering with .mli files, yet still specify the desired interface, avoiding undesirable decoupling.

[NOTE: a decoupled interface is still possible and sometimes it is desirable .. just not always]

I am trying this out in Felix. I've implemented 'private' and it seems to work as expected... and was immediately very useful (about 20% of all my library functions turned out to be private, and I also make synthesised names private because it might help catch some compiler bugs).
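For comparison, here is roughly what the proposed 'abstract' and 'private' markers would have to generate, written with an explicit signature in ordinary OCaml. This is only a sketch: the module name `Counter` and its functions are hypothetical, and the signature plays the role of the separate .mli file.

```ocaml
module Counter : sig
  type t                    (* abstract: clients cannot see that it is an int *)
  val zero : t
  val succ : t -> t
  val to_int : t -> int
end = struct
  type t = int              (* the 'secret' representation *)
  let zero = 0
  let helper n = n + 1      (* absent from the signature, so effectively private *)
  let succ = helper
  let to_int n = n
end

let () = assert (Counter.(to_int (succ (succ zero))) = 2)
```

Writing `Counter.helper 0` or `Counter.zero + 1` outside the module is rejected by the type checker, which is the effect the keywords are meant to achieve without a second file.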
* Re: [Caml-list] GC and file descriptors 2003-11-18 18:26 ` Benjamin Geer 2003-11-18 19:24 ` Xavier Leroy @ 2003-11-19 1:33 ` Martin Berger 2003-11-19 2:47 ` Benjamin Geer 1 sibling, 1 reply; 92+ messages in thread From: Martin Berger @ 2003-11-19 1:33 UTC (permalink / raw) To: Benjamin Geer; +Cc: Caml Mailing List

> From experience working on fairly large programs in Java, I can say (at
> the risk of being pelted with stones on this list) that I think the way
> Java handles this works pretty well. You can avoid having any methods
> specify more than two or three exceptions by using hierarchies of
> exception subtypes (e.g. IOException has subtypes FileNotFoundException,
> SocketException and so on) and by using nested exception objects (e.g. a
> FooSubsystemException can contain an instance of any other exception,
> and can thus be handled by a method that only specifies
> FooSubsystemException).

my (limited) experience with java suggests that in large projects one of the following happens:

* all exception specs are written out in detail (ie no grouping using subtyping etc). in this case, way too much code is nothing but exception specs.

* the subtyping approach is used. in this case exception specifications are too imprecise.

* something that seems like what you refer to as nested exceptions, where you catch exceptions at every layer and convert them into some other exception. in this case you litter the code with catch statements that seem superfluous.

in summary, i do not recommend the java approach (for other reasons too, like unchecked exceptions). i think exception specification polymorphism is crucial. maybe some other ideas are also needed.

martin
* Re: [Caml-list] GC and file descriptors 2003-11-19 1:33 ` Martin Berger @ 2003-11-19 2:47 ` Benjamin Geer 0 siblings, 0 replies; 92+ messages in thread From: Benjamin Geer @ 2003-11-19 2:47 UTC (permalink / raw) To: Martin Berger; +Cc: Caml Mailing List

Martin Berger wrote:
> my (limited) experience with java suggests that in large projects one of
> the following happens:
>
> * all exception specs are written out in detail (ie no grouping using
> subtyping etc). in this case, way too much code is nothing but exception
> specs.
>
> * the subtyping approach is used. in this case exception specifications
> are too imprecise.
>
> * something that seems like what you refer to as nested exceptions, where
> you catch exceptions at every layer and convert them into some other
> exception. in this case you litter the code with catch statements
> that seem superfluous.

The second and third approaches can indeed be taken to an extreme in order to render exceptions completely meaningless. However, I think there are reasonable alternatives.

In general, I think the design of exception types should be guided by the need to handle different errors differently. When certain errors occur, a program can usefully try to recover, perhaps by waiting a little while and trying again. Other kinds of errors need to be handled by giving up and letting a person sort it out. Depending on what went wrong, that person might be an ordinary user or a system administrator, and the sort of information they'll want to be given will vary accordingly.

Exceptions propagate upwards from low-level subsystems into higher-level ones; at some point, the program must take some action in response to the exception. Often this is most reasonably done at the point where a 'unit of work' (something that would be meaningful to the user) was initiated. At that point, the program doesn't care what the precise reason for the exception was; it only needs to know which sort of action to take. Retry? Pop up a dialog box to tell the user that the input was bad and needs to be corrected? Log the error as a bug with a full stack trace, and send an email to the system administrator? With some programs, the user will want to be able to configure which errors are handled in which ways.

This suggests that each type of problem that needs (or may need) distinct error-handling behaviour also needs its own exception type. For some programs, it is enough to define three general exception types: (1) error caused by bad user input, (2) error caused by the failure of some external resource, such as a network or a database, and (3) error caused by a bug in the program (assertion failure). (A subtype of (2), indicating that it may be worth retrying, can be useful.)

When a more specific exception (such as an I/O error) is caught by a low-level subsystem, it can be wrapped in an exception of one of these three general types, and allowed to propagate upwards until it reaches the function that initiated the unit of work. That function can pass the exception to a subsystem that knows about the user's error-handling configuration, in order to determine whether to retry or just report the error. The error-handling subsystem can also take care of reporting the error in the appropriate way, according to its type. Since the original low-level exception is still there, wrapped in the more general exception, reporting can be as detailed as needed.

For other programs, these simple categories will be insufficient, but you get the idea.

Ben
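The three-category scheme described above translates fairly directly into OCaml exceptions. The following is a minimal sketch under stated assumptions: every name in it (`Bad_input`, `Resource_failure`, `Bug`, `read_config`, `run_unit_of_work`, the `outcome` type) is hypothetical, the retry policy is a fixed count rather than user configuration, and inline records in exception declarations need OCaml 4.03 or later.

```ocaml
(* The three general categories; the wrapped exception is kept as a payload. *)
exception Bad_input of string                                    (* (1) *)
exception Resource_failure of { retryable : bool; cause : exn }  (* (2) *)
exception Bug of exn                                             (* (3) *)

type outcome = Done of string | Ask_user of string | Gave_up of string

(* Low-level layer: wrap specific exceptions in a general category. *)
let read_config load =
  try load () with
  | Sys_error _ as e -> raise (Resource_failure { retryable = true; cause = e })
  | Not_found as e -> raise (Bug e)

(* Unit-of-work boundary: choose an action from the category alone. *)
let rec run_unit_of_work ?(retries = 2) work =
  try Done (work ()) with
  | Bad_input msg -> Ask_user msg
  | Resource_failure { retryable = true; _ } when retries > 0 ->
      run_unit_of_work ~retries:(retries - 1) work
  | Resource_failure { cause; _ } -> Gave_up (Printexc.to_string cause)
  | Bug e -> Gave_up ("bug: " ^ Printexc.to_string e)
```

Because the original exception travels inside the wrapper, the reporting code at the boundary can still print the low-level detail even though it dispatches only on the category.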
* Re: [Caml-list] GC and file descriptors 2003-11-18 15:12 ` skaller 2003-11-18 16:49 ` Martin Berger @ 2003-11-18 22:23 ` Brian Hurt 2003-11-19 13:00 ` skaller 1 sibling, 1 reply; 92+ messages in thread From: Brian Hurt @ 2003-11-18 22:23 UTC (permalink / raw) To: skaller; +Cc: Caml Mailing List

On 19 Nov 2003, skaller wrote:
> On Tue, 2003-11-18 at 08:20, Brian Hurt wrote:
> > On 18 Nov 2003, skaller wrote:
> > >
> > > But it is irrelevant, because programmers do not intend to write
> > > arbitrary code. They intend to write comprehensible code, whose
> > > correctness is therefore necessarily easy to prove.
> >
> > What they intend to do and what they do do are often completely different.
> > It is a rare programmer who *intentionally* adds bugs to his code. Yet
> > bugs still happen.
>
> Of course. But my point is that a lot of the arguments
> I have seen that 'we cannot do that because the general
> problem is NP-complete' are irrelevant.

The problem isn't NP-complete, it's unsolvable in the general case.

The fact that it's insolvable means that the language cannot issue blanket guarantees, just "best effort" requirements. Which work fine right up until the point that the best effort fails.

For an example of this problem in action, consider tail recursion optimization. But even here it fails: for most inputs, functions that aren't properly tail recursive still work correctly. And when it doesn't work correctly, it fails in an obvious (and easy to debug) way: a stack overflow.

Precise destruction of objects, as you yourself point out, encourages the programmer to depend upon it. When the program hits some point of complexity where best effort fails (and since the problem is unsolvable in the general case, there will be a point where complexity passes this threshold), suddenly the program is struck by subtle, hard-to-debug bugs (lines being written to files in the wrong order, etc).
>
> > But my point here is that determining the *exact* time that an object
> > becomes garbage can be arbitrarily complex. Most of the time, I agree, it
> > won't be.
>
> The point is though: when it matters, the onus is on
> the programmer to make sure the design is able to
> determine an 'exact enough' time, using whatever
> tools the language may provide.

What may be obvious to the programmer may not be obvious to the compiler. To me, the code:

  let rec append src dst =
    match src with
    | [] -> dst
    | h :: t -> h :: (append t dst)
  ;;

should be tail recursion optimizable. The OCaml compiler, however, doesn't optimize it.

If the programmer can specify exactly when a resource should be released, then there should be a function he can call at that point to release the resource. At which point the destructor doesn't need to do anything.

> An obvious candidate is a pure stack protocol
> (automatic store) and another is a hierarchical
> system (ref count).

A stack based store would be tricky with a functional language, but *might* be doable. It's completely unworkable with an object oriented language. The whole idea of OO programming is that the object interface hides the internal details of the object, including its size.

Reference counting has an unacceptable performance penalty compared to mark and sweep, let alone more advanced garbage collection algorithms. Plus it has problems with circular data structures (which is why most reference counting implementations have a backup mark and sweep). Which is why reference counting went out of vogue in the 1970s.

Note that if all else fails, you *can* implement reference counting GC on top of OCaml (you probably need to do some Obj.magic hacks).

> When there are circularities .. due to the complexity
> it would be unwise to count on synchronous finalisation,
> since it isn't clear when that is (by definition it
> is rather hard to determine ..)

My point *exactly*.
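A short sketch of the `append` point, for readers who want a version that does run in constant stack space. The name `append_tr` is made up here; it relies only on `List.rev` and `List.rev_append`, both of which are tail recursive in the standard library. The price is one extra traversal of `src`.

```ocaml
(* The version from the message: the cons is performed after the
   recursive call returns, so each element of [src] needs a stack frame. *)
let rec append src dst =
  match src with
  | [] -> dst
  | h :: t -> h :: append t dst

(* Tail-recursive variant: reverse [src], then push its elements onto
   [dst] one by one, which restores the original order. *)
let append_tr src dst = List.rev_append (List.rev src) dst

let () =
  assert (append [ 1; 2 ] [ 3; 4 ] = [ 1; 2; 3; 4 ]);
  assert (append_tr [ 1; 2 ] [ 3; 4 ] = [ 1; 2; 3; 4 ])
```

The two functions agree on every input; they differ only in stack usage for very long `src` lists.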
>
> > Once you allow for the fact that finalizers/destructors may not happen in
> > a defined order or at defined times, why not go whole-hog?
>
> Hard question. One job a language is often required to do
> is emulate the behaviour of a program written in another
> language .. that's not always easy to do when the object,
> type, allocation or execution models are different.

Which is why languages should be careful to constrain their implementations as little as possible. Is the problem OCaml's, for not having the type of GC required by Interscript, or Interscript's, for too tightly defining what type of GC they have? Or the programmers', for programming by coincidence ("It worked on my machine...")?

This also allows the language to improve its implementations, for example by replacing a reference counting GC with a mark and sweep GC.

> > In the most intractable cases, neither the compiler nor the programmer may
> > be able to determine when the last reference is released. But even in the
> > simple cases, the programmer may have the intent to release the resources
> > and simply doesn't. It's called a bug.
>
> I agree. Bugs happen. Perhaps I can give an example:
> in Java, variables must be initialised. Unlike Ocaml though,
> they do not have to be initialised at the point of
> declaration. Java requires instead 'manifest initialisation':
> it's a compromise between initialisation at the point
> of declaration like Ocaml, or arbitrary initialisation
> (onus on programmer) as in C++.
>
> The language picked a constraint that is easy to check
> for both the compiler and for humans which is more
> flexible than the Ocaml model, but still assures
> freedom from uninitialised variable errors. It still
> can't handle complex cases, which can be done in C++,
> but they're rarely needed because if they seem
> to be needed there is a good chance the design
> is flawed.

So long as I can declare variables when I first need them, I've never had a problem with initialization as part of allocation. Using an uninitialized variable is *always* wrong. I will occasionally not initialize variables in C, because I have to declare all my variables at the beginning of the function, and I don't know the correct initial value at that point. In OCaml, and in Java, I simply move the variable declaration to the point where I do know its initial value.

Note that Java is even more imprecise about when objects get freed than OCaml is. In OCaml, the GC runs inside the same thread as the main program. In Java, it's specified that the GC runs in a separate thread. Meaning that destructors execute asynchronously from the main program. If the destructor for object X accesses object Y which is not garbage, and the main thread is also accessing object Y, you have a possible race condition. You cannot write non-multithreaded programs in Java.

Were I writing the spec for OCaml, I'd not only allow for the possibility of GC running in its own thread, I'd also allow for the possibility of multiple threads of GC running in parallel.

> Normally I wouldn't. But here there was no choice.
> You have to understand first that Interscript is a library
> with a driver program that looks like this:
>
> read a line
> if at end of file, delete user space else
> if the line starts with @ execute it
> otherwise if we're in tangle mode
> write to current source file
> otherwise we have to be in document mode
> write to current document
> repeat
>
> Now, the user code looks like this:
>
> @h = tangler('src/x.ml')
> @select(h)
> print_endline "I'm an ocaml program"
> <EOF>
>
> Do you see the problem? The open file lives
> in USER space. The language syntax does not
> require the user close open files, because
> that would leave a dangling reference.

So when does @h fall out of scope? Other than at eof, obviously.
>
> But the engine has no idea which things
> are files that need closing, and which
> are documents that need tables of contents.
>
> The ONLY way to finalise
> each object correctly is in the destructor
> method (__del__ method). I could add a
> 'finalise' method to each object and call it
> but that would not help -- the __del__
> method is precisely that anyhow,
> and it gets invoked automatically so it
> relieves me of the task of finding
> all the objects.
>
> You can see that the problem arises
> from a design decision not to require
> the USER to close files.

And from the users doing programming by coincidence. Looks like you're going to have to implement reference counting.

> > If it matters when the buffers are flushed, manually flush the buffers.
>
> See above. It isn't always possible: the program doesn't
> always know which files are open. This is quite typical
> in object oriented programs -- the program doesn't know
> what kinds of objects exist, there is no master
> that can call the 'finalise' method.
>
> In C++ code, particularly GUI code, callbacks often
> lead to suicide 'delete this' simply because there
> is no one around to delete an object.

Yep. IMHO, OO pretty much demands GC.

But if the program doesn't know what is going on, and when it's releasing the last reference to an object, how is the compiler supposed to know? Welcome to the general case.

> > > One thing is for sure .. I *hate* seeing
> > > "program terminated with uncaught "Not_found" exception"
> > > because I do hundreds of Hashtbl.find calls....
> >
> > There is a religious debate between returning 'a and throwing an exception
> > if the item isn't found, or returning 'a option.
>
> I don't think it is religious. There is a genuine problem here.
> I do not know how to state it exactly, but I'll try.
>
> Checking error codes and popping the stack is not only
> tedious and obscures the normal control flow,
> it does so to such an extent in certain cases
> as to be totally and unequivocally out of the question.
> For example in math calculations you simply cannot
> afford to check every single function call either
> before or after invocation:
>
> x = (a + b/c ^ d)
>
> for example would turn into many lines of spaghetti.
> I will say:
>
> "the error detection is too localised"
>
> for this kind of code. On the other hand the
> uncaught exception of dynamic EH clearly
> shows that that mechanism leads to
>
> "the error detection is too global"
>
> What alternatives are there?
>
> One is to have exception specifications on functions,
> but that is known not to work very well. The first
> difficulty is that once you have them, they must
> become part of the type system. They then 'snowball'
> through the whole program (in the same way 'const'
> does to C programs).

Type inference is your friend here, as it relieves the programmer of a lot of the burden of handling more complex types. But the fundamental problem is that we want the programmer to think about and handle error cases, and many programmers don't want to, as that's extra work.

> It isn't possible to deduce what exceptions can
> be thrown when functions are passed as arguments
> to functions, so this pollution of the type system
> would also be manifest in code in the form
> of annotations .. basically higher order functions
> would be screwed completely by this.

No. If a function is defined to take an argument of type "unit -> int", and you try to pass it a function of type "unit -> int throws Not_found" (or whatever the syntax is), this is a type error. The other direction should be ok, however.

Some generality would be needed. You'd want to be able to express a function type like:

  val foo: (unit -> int throws 'a) -> int throws 'a

> So exception specifications are out for
> 'engineering' reasons.
>
> Another alternative is static exception handling.
> I tried that in Felix. It is very good for a
> wide class of problems. Static EH implies
> block structured code with the handler visible
> from the throw point... it's really a structured goto.

I'd have to take a look at this.

> This solves many cases where we really wanted
> 'alternate control flow constructs' rather
> than error handling, but not all. And it doesn't
> work where a more dynamic transmission of
> error notifications is required.
>
> What else? Continuations? Can monads be used?
> We really do need a mechanism with better
> locality (easily control scope).

--
"Usenet is like a herd of performing elephants with diarrhea -- massive, difficult to redirect, awe-inspiring, entertaining, and a source of mind-boggling amounts of excrement when you least expect it."
 - Gene Spafford
Brian
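The two sides of the Hashtbl.find debate, and the `throws 'a` signature proposed in this message, can be approximated in plain OCaml by putting the failure in the return type. This is a sketch, not a proposal from the thread: the table `ages` and the functions `age_exn`, `age_opt` and `add_one` are invented names, `Hashtbl.find_opt` assumes OCaml 4.05 or later, and `result` stands in for an exception annotation.

```ocaml
let ages : (string, int) Hashtbl.t = Hashtbl.create 16
let () = Hashtbl.replace ages "ada" 36

(* Exception style: the type string -> int says nothing about Not_found. *)
let age_exn name = Hashtbl.find ages name

(* Option style: absence is part of the type and has to be matched on. *)
let age_opt name = Hashtbl.find_opt ages name

(* Result style: the error type 'e flows from the argument to the result,
   much like "(unit -> int throws 'a) -> int throws 'a". *)
let add_one (f : unit -> (int, 'e) result) : (int, 'e) result =
  match f () with
  | Ok n -> Ok (n + 1)
  | Error _ as e -> e
```

With the result style, inference propagates the error type through higher-order functions without annotations at the call sites, although every intermediate function must pass the `Error` case along explicitly.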
* Re: [Caml-list] GC and file descriptors 2003-11-18 22:23 ` Brian Hurt @ 2003-11-19 13:00 ` skaller 0 siblings, 0 replies; 92+ messages in thread From: skaller @ 2003-11-19 13:00 UTC (permalink / raw) To: Brian Hurt; +Cc: Caml Mailing List

On Wed, 2003-11-19 at 09:23, Brian Hurt wrote:
> On 19 Nov 2003, skaller wrote:
> The fact that it's insolvable means that the language cannot issue
> blanket guarantees, just "best effort" requirements. Which work fine
> right up until the point that the best effort fails.

That isn't the way I'm looking at it. Let me try to explain better. Consider you have a general case and some subclass of the problem which is easily solvable.

What you do is restrict the language to the subclass: the general case is not useful precisely because it is incomprehensible, meaning you cannot easily form judgements about it.

I want to give a real example of this. We all know that proving an arbitrary program terminates is impossible, where the programming system is Turing complete.

My point is: who cares?? We clearly do NOT need Turing complete programming systems, because the behaviour is unpredictable. Programs are meant to have predictable behaviour.

So you just restrict the language so that you CAN prove with some algorithm that every program terminates (or not).

Oh yes, it has been done, and the language is very powerful (though not Turing complete). See Charity: all Charity programs terminate.

True, some things you can write in Ocaml cannot be expressed in Charity. Also true: some things you can write in Python cannot be expressed in Ocaml (because the type system is not expressive enough whereas Python doesn't care).

Yet, Ocaml is still useful for a large class of problems .. and one really has to push hard to find an application where the full power of dynamic typing is really needed.

So I think really the issue here is trying to find a compromise between the ability to reason about something and the ability to express it: if the constraints are too tight, reasoning is easy but irrelevant because you can't solve any problems, and if it is too expressive you can't reason and also cannot solve problems [in the sense that whilst the program appears to work you don't have any assurances from the system that it will].

Of course that balance is found by research, and new results allow the frontier to be pushed. In the case of Ocaml there is a strong emphasis on the type system as a means of constraining code [the aim is not to be sure the code works, but to gain *some* reasonable assurance -- in practice Ocaml strong typing is extremely good at predicting when my programs will work -- if it compiles it either works immediately or has only a few bugs in it].
* OCaml popularity [was: Re: [Caml-list] GC and file...] 2003-11-17 18:15 ` skaller 2003-11-17 19:26 ` Aleksey Nogin 2003-11-17 21:20 ` Brian Hurt @ 2003-11-17 22:37 ` John J Lee 2003-11-18 1:02 ` [Caml-list] Re: GC and file descriptors Jed Davis 3 siblings, 0 replies; 92+ messages in thread From: John J Lee @ 2003-11-17 22:37 UTC (permalink / raw) To: Caml Mailing List

On Mon, 18 Nov 2003, skaller wrote:
[...]
> I think Ocaml is very close to mainstream now.
[...]

Well... depends how you define mainstream I guess. Plenty of professional programmers have never heard of functional programming, to say nothing of O'Caml. And in terms of raw popularity, it certainly seems very far from mainstream. If Google is any judge (it's certainly not perfect, but for order-of-magnitude it'll do), O'Caml isn't even in the "top 30": even if you lump in ML and SML and include all the various spellings, Caml doesn't get even a fifth as many hits for "X programming language" as Scheme. Scheme itself is hardly considered mainstream by J. Random Hacker, being roughly a factor of twenty off Java's Google-measured popularity.

http://www.google.com/groups?hl=en&lr=&ie=UTF-8&threadm=MjQpb.92077%24e5.3389981%40news1.tin.it&rnum=1&prev=/groups%3Fq%3Dgroup:comp.lang.python%2BMartelli%2BMathematica%26hl%3Den%26lr%3D%26ie%3DUTF-8%26selm%3DMjQpb.92077%2524e5.3389981%2540news1.tin.it%26rnum%3D1

http://tinyurl.com/veuo

John
* [Caml-list] Re: GC and file descriptors 2003-11-17 18:15 ` skaller ` (2 preceding siblings ...) 2003-11-17 22:37 ` OCaml popularity [was: Re: [Caml-list] GC and file...] John J Lee @ 2003-11-18 1:02 ` Jed Davis 3 siblings, 0 replies; 92+ messages in thread From: Jed Davis @ 2003-11-18 1:02 UTC (permalink / raw) To: caml-list

skaller <skaller@ozemail.com.au> writes:

> On Mon, 2003-11-17 at 06:19, Brian Hurt wrote:
>> Of
>> course, Java's type system is state of the art- for 1968.
>
> Err.. since when is downcasting everything from Object
> a type system??

There is a type system. It just doesn't have anything more advanced than nominal[*] subtyping, which is why all the dynamically checked downcasting is needed.

[*] As opposed to structural.

--
Jed Davis <jldavis@cs.oberlin.edu> Selling of self: http://panix.com/~jdev/rs/
<jdev@panix.com> PGP<-finger A098:903E:9B9A:DEF4:168F:AA09:BF07:807E:F336:59F9
\ "But life wasn't yes-no, on-off. Life was shades of gray, and rainbows
/\ not in the order of the spectrum." -- L. E. Modesitt, Jr., _Adiamante_
* Re: [Caml-list] GC and file descriptors 2003-11-13 0:50 [Caml-list] GC and file descriptors Dustin Sallings 2003-11-13 1:18 ` David Fox @ 2003-11-13 1:19 ` Nicolas George [not found] ` <87smkstkhg.fsf@igloo.phubuh.org> 2 siblings, 0 replies; 92+ messages in thread From: Nicolas George @ 2003-11-13 1:19 UTC (permalink / raw) To: Dustin Sallings, Caml mailing list

On Tridi 23 Brumaire, year CCXII, Dustin Sallings wrote:
> let r = Unix.open_process_in ("tput " ^ x)
> and buf = String.create 8 in
> let len = input r buf 0 8 in
> close_in r;

BTW, you should use Unix.close_process_in and not close_in, or you will leave zombies.
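A sketch of how that advice might look in current OCaml. The helper name `with_process_in` is an assumption of this example, not something from the thread; it relies on `Fun.protect` (OCaml 4.08 or later), uses `Bytes` because `String.create` is no longer available, and needs the unix library to be linked.

```ocaml
(* Requires the unix library (for instance: ocamlfind ocamlopt -package unix). *)

(* Run [cmd], hand its stdout channel to [f], and always reap the child
   with Unix.close_process_in, even if [f] raises. *)
let with_process_in cmd f =
  let ic = Unix.open_process_in cmd in
  Fun.protect
    ~finally:(fun () -> ignore (Unix.close_process_in ic))
    (fun () -> f ic)

(* Read up to 8 bytes of the command's output, as in the original tput. *)
let tput x =
  with_process_in ("tput " ^ x) (fun ic ->
      let buf = Bytes.create 8 in
      let len = input ic buf 0 8 in
      Bytes.sub_string buf 0 len)
```

`Unix.close_process_in` waits for the child and returns its exit status, which is what prevents the zombie; this sketch discards the status, so a caller that cares about failures of the command would want to inspect it instead.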
[parent not found: <87smkstkhg.fsf@igloo.phubuh.org>]
[parent not found: <347A7A46-1612-11D8-8F93-000393CFE6B8@spy.net>]
* Re: [Caml-list] GC and file descriptors [not found] ` <347A7A46-1612-11D8-8F93-000393CFE6B8@spy.net> @ 2003-11-13 20:18 ` Mikael Brockman 0 siblings, 0 replies; 92+ messages in thread From: Mikael Brockman @ 2003-11-13 20:18 UTC (permalink / raw) To: caml-list

On Thu, 2003-11-13 at 20:47, Dustin Sallings wrote:
> On Nov 13, 2003, at 6:24, Mikael Brockman wrote:
>
> > If the in_channel is heap allocated, you can do
> >
> >   let close o = ignore (Unix.close_process_in o)
> >
> >   let open_process_in str =
> >     let r = Unix.open_process_in str in
> >     Gc.finalise close r;
> >     r
> >
> >   let tput x =
> >     let buf = String.create 8 in
> >     String.sub buf 0 (input (open_process_in ("tput " ^ x)) buf 0 8)
> >
> > If it is not, you could probably create a wrapper type that has to be
> > heap allocated.
>
> That's very interesting. You answered some questions, but brought up
> some new ones. Are there things that are not heap allocated, and how
> will I recognize these?

Yes, there are values that are not heap allocated. The Gc manual has this to say:

> Some examples of values that are not heap-allocated are integers,
> constant constructors, booleans, the empty array, the empty list, the
> unit value. The exact list of what is heap-allocated or not is
> implementation-dependent. Some constant values can be heap-allocated
> but never deallocated during the lifetime of the program, for example
> a list of integer constants; this is also implementation-dependent.
> You should also be aware that compiler optimisations may duplicate
> some immutable values, for example floating-point numbers when stored
> into arrays, so they can be finalised and collected while another copy
> is still in use by the program.

I think it is pretty safe to assume that objects are heap allocated, so if the function passing idiom suggested earlier is insufficient, you can wrap the Unix process in an object that sets a finalizer in initialization.

  let close_unix_process o =
    prerr_endline "closing unix process";
    ignore (o#close ())

  class unix_process cmd = object (self)
    val stream = Unix.open_process_in cmd
    initializer Gc.finalise close_unix_process self
    method input = input stream
    method close () = Unix.close_process_in stream
  end

  let read_some proc =
    let buf = String.create 2048 in
    String.sub buf 0 (proc#input buf 0 2048)

> # read_some (new unix_process "ls /");;
> - : string =
> "bin\nboot\ndev\netc\nhome\nlib\nmnt\nproc\nroot\nsbin\ntmp\nusr\nvar\n"
> # Gc.full_major ();;
> closing unix process
> - : unit = ()
> #

Also, I don't think the default signal handler for SIGKILL forces a major cycle, but that's easily fixed with Sys.signal or Sys.set_signal.

--
Mikael Brockman
<phubuh@phubuh.org>
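For comparison, the function passing idiom from earlier in the thread can be made exception-safe without relying on the GC at all. This is a minimal sketch, assuming OCaml 4.08 or later (for `Fun.protect`) and the unix library; the name `with_process_in` is hypothetical and not taken from any message here.

```ocaml
(* Sketch only; needs the unix library and OCaml >= 4.08. *)

(* Hypothetical helper: run [cmd], hand its output channel to [f], and
   always close the channel and reap the child, whether [f] returns
   normally or raises. *)
let with_process_in (cmd : string) (f : in_channel -> 'a) : 'a =
  let ic = Unix.open_process_in cmd in
  Fun.protect
    ~finally:(fun () -> ignore (Unix.close_process_in ic))
    (fun () -> f ic)

(* Usage: read the first line of a command's output. *)
let first_line (cmd : string) : string =
  with_process_in cmd input_line
```

The cleanup here happens at a known point, when `f` finishes. A finaliser registered with Gc.finalise runs only at some later collection and may not run at all before the program exits. Signal handlers do not close that gap completely either, since SIGKILL cannot be caught; handlers installed with Sys.signal or Sys.set_signal only help for catchable signals such as SIGINT or SIGTERM.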
end of thread, other threads:[~2003-11-22 2:42 UTC | newest] Thread overview: 92+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2003-11-13 0:50 [Caml-list] GC and file descriptors Dustin Sallings 2003-11-13 1:18 ` David Fox 2003-11-13 4:09 ` Dustin Sallings 2003-11-14 13:42 ` Damien Doligez 2003-11-14 14:57 ` Christophe Raffalli 2003-11-14 20:24 ` Dmitry Bely 2003-11-14 20:54 ` Eric Dahlman 2003-11-14 22:21 ` Brian Hurt 2003-11-14 21:36 ` John J Lee 2003-11-14 21:48 ` Brian Hurt 2003-11-15 1:47 ` Dmitry Bely 2003-11-15 2:25 ` Max Kirillov 2003-11-15 2:49 ` Mike Furr 2003-11-16 4:09 ` [Caml-list] Bugs from ignoring errors from close (was Re: GC and file..) Tim Freeman 2003-11-15 2:58 ` [Caml-list] GC and file descriptors David Brown 2003-11-17 14:19 ` Damien Doligez 2003-11-17 18:18 ` skaller 2003-11-14 18:35 ` Dustin Sallings 2003-11-15 14:16 ` skaller 2003-11-15 15:56 ` Ville-Pertti Keinonen 2003-11-15 17:30 ` skaller 2003-11-15 20:31 ` Martin Berger 2003-11-16 19:19 ` Brian Hurt 2003-11-17 18:15 ` skaller 2003-11-17 19:26 ` Aleksey Nogin 2003-11-18 13:49 ` skaller 2003-11-18 17:51 ` Dustin Sallings 2003-11-18 20:17 ` Aleksey Nogin 2003-11-20 7:36 ` Florian Hars 2003-11-17 21:20 ` Brian Hurt 2003-11-17 23:02 ` John J Lee 2003-11-18 12:05 ` Ville-Pertti Keinonen 2003-11-18 15:19 ` skaller 2003-11-18 18:10 ` John J Lee 2003-11-18 17:55 ` skaller 2003-11-18 20:02 ` Ville-Pertti Keinonen 2003-11-18 21:20 ` John J Lee 2003-11-19 12:25 ` skaller 2003-11-19 13:55 ` Ville-Pertti Keinonen 2003-11-19 14:26 ` Samuel Lacas 2003-11-19 14:47 ` skaller 2003-11-18 15:28 ` skaller 2003-11-18 18:00 ` John J Lee 2003-11-18 22:28 ` Brian Hurt 2003-11-18 23:07 ` John J Lee 2003-11-18 23:22 ` Benjamin Geer 2003-11-19 1:49 ` Martin Berger 2003-11-19 3:57 ` Dustin Sallings 2003-11-19 13:35 ` skaller 2003-11-19 13:00 ` skaller 2003-11-19 13:02 ` skaller 2003-11-19 17:36 ` Brian Hurt 2003-11-20 5:14 ` skaller 2003-11-20 7:37 ` David 
Brown 2003-11-18 15:12 ` skaller 2003-11-18 16:49 ` Martin Berger 2003-11-18 17:46 ` skaller 2003-11-19 1:33 ` Martin Berger 2003-11-19 3:19 ` Design by Contract, was " Brian Hurt 2003-11-19 2:57 ` Jacques Carette 2003-11-19 13:27 ` skaller 2003-11-19 14:41 ` Martin Berger 2003-11-19 16:54 ` Richard Jones 2003-11-19 17:18 ` Damien Doligez 2003-11-19 21:45 ` Richard Jones 2003-11-19 23:09 ` Benjamin Geer 2003-11-20 0:50 ` Nicolas Cannasse 2003-11-20 9:42 ` Benjamin Geer 2003-11-19 18:03 ` Martin Berger 2003-11-18 18:26 ` Benjamin Geer 2003-11-18 19:24 ` Xavier Leroy 2003-11-18 23:49 ` Benjamin Geer 2003-11-19 1:36 ` Martin Berger 2003-11-19 2:28 ` Nicolas Cannasse 2003-11-19 3:26 ` Brian Hurt 2003-11-19 11:44 ` Martin Berger 2003-11-19 17:29 ` Brian Hurt 2003-11-20 5:17 ` skaller 2003-11-20 16:13 ` Brian Hurt 2003-11-19 13:33 ` skaller 2003-11-19 17:01 ` Richard Jones 2003-11-22 2:39 ` [Caml-list] AutoMLI (Was: GC and file descriptors) Jim 2003-11-19 17:43 ` [Caml-list] GC and file descriptors Brian Hurt 2003-11-20 5:05 ` skaller 2003-11-19 1:33 ` Martin Berger 2003-11-19 2:47 ` Benjamin Geer 2003-11-18 22:23 ` Brian Hurt 2003-11-19 13:00 ` skaller 2003-11-17 22:37 ` OCaml popularity [was: Re: [Caml-list] GC and file...] John J Lee 2003-11-18 1:02 ` [Caml-list] Re: GC and file descriptors Jed Davis 2003-11-13 1:19 ` [Caml-list] " Nicolas George [not found] ` <87smkstkhg.fsf@igloo.phubuh.org> [not found] ` <347A7A46-1612-11D8-8F93-000393CFE6B8@spy.net> 2003-11-13 20:18 ` Mikael Brockman