* Re: GC with finalisation? [not found] <3.0.6.32.19990830144523.00964e20@mail.triode.net.au> @ 1999-08-30 12:31 ` Markus Mottl 0 siblings, 0 replies; 7+ messages in thread From: Markus Mottl @ 1999-08-30 12:31 UTC (permalink / raw) To: John Skaller; +Cc: OCAML > >No, you need not care about references, because this will be done by the > >GC of OCaml. > > We must be firing past each other. I have to 'model' > all the references one python object has to other python > objects using the Syms. No python object will be collected > directly by ocaml, since they're all registered in a strong > array, just to prevent this! It's not the objects which are collected - it's their "Syms". The objects may exist anywhere - but must exist in the finalizing closures of the strong array. > Instead, every binding/unbinding of references > in the python objects has to be modelled using the Syms, > so that inspecting the weak array of Syms, which ARE > collected by ocaml, will allow me to invoke the finaliser, > and THEN remove the final strong reference, to allow the > now finalised object to be collected by ocaml. > > I can't just make a weak array of Syms, without > the Syms refering to each other, because they'd > all get collected immediately! Where is the problem when several Syms get collected at once? As long as objects (better: their handlers) get registered in the right order, they will be finalized in the right order: simply, because the objects contain a "Sym" to other objects (via the closure!) and thus, not even the handlers of the other objects will be reclaimed before the containing object! This can, of course, lead to cycles which will never be reclaimed by the OCaml-GC when the objects in the closures reference themselves mutually. But this can be easily resolved by a "meta-GC": You finalize all objects of reclaimed handlers. Then you put all objects that are still available into a set. Now you traverse all reachable objects from the root and fold'n'filter them out of the set. The remaining set contains all objects which should be reclaimed but reference themselves mutually -> these can now be removed from the weak array and be finalized... This last and a bit more time consuming pass (finding of cyclic structures) needs not be conducted every time you do "normal" finalization. You could use a "countdown" until this "cycle collection" is launched. > Right. And the 'Syms' will depend only on other Syms: > > type Sym = Sym list ref;; I have rather thought about something like: type 'a sym = Sym of 'a class some_python_class = object method finalize = print_endline "Finalized!" end let weak_ar = Weak.create 100 let strong_ar = Array.create 100 (Obj.magic 0) let ar_size = ref (-1) let register () = let my_obj = new some_python_class in let obj_handler = Sym my_obj in Weak.set weak_ar 0 (Some obj_handler); Array.set strong_ar 0 (fun _ -> my_obj#finalize); incr ar_size; obj_handler let finalize () = for i = 0 to !ar_size do if not (Weak.check weak_ar i) then strong_ar.(i) () done let _ = ignore (register ()); (* Discards object_handler *) Gc.full_major (); finalize () This is a fast hack - eg. "finalize" doesn't remove the closure after finalization, etc... But the main idea should be clear. If you "register" the objects in the order of their creation (naturally), they will also be finalized in the right order if you traverse the weak array. Note: if you remove the "ignore" in the main function and replace it by some binding, the object will not get finalized! - This proves that it is really due to the garbage collection that finalization happens. The "full major collection" is called to force the object handler (Sym) to be reclaimed. You need not do this during normal execution, but you will have to before the program exits as it immediately does in this small example. Ah, and you could even catch exceptions in function "finalize"! > >This guarantees that reclaimed elements of the weak array can be finalized > >at once when traversing the array. > > The dependencies change dynamically, so it is not > possible to order the Sym array like this: the ordering is not > invariant. The dependencies exist through object handlers - as long as the "real object" is not reclaimed (it won't be as long as there is a closure containing it), it will not "release" depending objects (better: their "Sym"-handle). Thus, it is guaranteed that finalization respects this order. Time of object creation is "only" the second "finalization order criterion"... Regards, Markus Mottl -- Markus Mottl, mottl@miss.wu-wien.ac.at, http://miss.wu-wien.ac.at/~mottl ^ permalink raw reply [flat|nested] 7+ messages in thread
* GC with finalisation? @ 1999-08-24 3:36 John Skaller 1999-08-26 21:09 ` Markus Mottl 1999-08-27 7:13 ` STARYNKEVITCH Basile 0 siblings, 2 replies; 7+ messages in thread From: John Skaller @ 1999-08-24 3:36 UTC (permalink / raw) To: caml-list I could use some advice on this. I'm implementing a Python interpreter/compiler in Ocaml. Python class instances support __del__ methods -- that is, destructors. CPython uses reference counting for collection and finalisation. JPython leverages Java's collector and finalisation semantics. My current ocaml implementation leverages the ocaml collector, which doesn't support finalisation (as far as I know). I seem to have several choices. 1) implement reference counting for finalisation * disadvantage: same problem as CPython: doesn't resolve circular references properly * disadvantage: overhead and coding complexity * advantage: synchronous finalisation 2) Implement a full resource collector * disadvantage: overhead and complexity * disadvantage: probably won't support synchronous finalisation 3) provide BOTH reference counting and a collector (as Perl does) * disadvantage: overhead and complexity * advantage: finalises everything * advantage: synchonous finalisation in all cases that CPython provides it I am interested in any comments on how to solve this problem. Order of finalisation can be important in Python. I have a major literate programming tool (Interscript) written in Python, the compiler/interpreter is being built to make it go faster. I use finalisation semantics extensively (tables of information are generated at the end of document processing, etc). While I might be able to rewrite it to avoid automatic finalisation, it would be nice to support the wide class of other Python programs that rely on it. ------------------------------------------------------- John Skaller email: skaller@maxtal.com.au http://www.maxtal.com.au/~skaller phone: 61-2-96600850 snail: 10/1 Toxteth Rd, Glebe NSW 2037, Australia ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: GC with finalisation? 1999-08-24 3:36 John Skaller @ 1999-08-26 21:09 ` Markus Mottl 1999-08-27 5:10 ` John Skaller 1999-08-27 7:13 ` STARYNKEVITCH Basile 1 sibling, 1 reply; 7+ messages in thread From: Markus Mottl @ 1999-08-26 21:09 UTC (permalink / raw) To: John Skaller; +Cc: OCAML > I seem to have several choices. > > 1) implement reference counting for finalisation > > * disadvantage: same problem as CPython: doesn't resolve > circular references properly > > * disadvantage: overhead and coding complexity > > * advantage: synchronous finalisation The main problem with reference counting is speed. It is possible to resolve circular references in linear time, but the constant overhead and the associated memory requirements are simply too large. Anyway, there is a nice fortune cookie pointing out potential problems with recursive datastructures when doing reference counting: One day a student came to Moon and said, "I understand how to make a better garbage collector. We must keep a reference count of the pointers to each cons." Moon patiently told the student the following story -- "One day a student came to Moon and said, "I understand how to make a better garbage collector..." [snip GC approaches] I would leave all the work to the OCaml-GC: use module "Weak" to register values (eg.: objects) to be finalized. Then check these arrays from time to time and call appropriate finalization functions for objects that have been reclaimed. Regards, Markus -- Markus Mottl, mottl@miss.wu-wien.ac.at, http://miss.wu-wien.ac.at/~mottl ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: GC with finalisation? 1999-08-26 21:09 ` Markus Mottl @ 1999-08-27 5:10 ` John Skaller 0 siblings, 0 replies; 7+ messages in thread From: John Skaller @ 1999-08-27 5:10 UTC (permalink / raw) To: Markus Mottl; +Cc: OCAML At 22:09 26/08/99 +0100, Markus Mottl wrote: >> I seem to have several choices. >> 1) implement reference counting for finalisation >The main problem with reference counting is speed. I was hoping to avoid it .. >It is possible to >resolve circular references in linear time, but the constant overhead >and the associated memory requirements are simply too large. I think this is less of a problem than actually writing the collector, and then changing all my code to use it. >I would leave all the work to the OCaml-GC: use module "Weak" to register >values (eg.: objects) to be finalized. Then check these arrays from >time to time and call appropriate finalization functions for objects >that have been reclaimed. That is an interesting idea -- but it won't work, because the destructor must be called _before_ the object is destroyed by the system. For example, one of my objects maintains the table of contents for a document, and when the object becomes unreachable, it is printed to a file as HTML. Obviously, the object still has to exist at this point, and retain all attributes. However, your idea of using weak arrays may be useful after all, if I can create a 'symbol' for the object, and put that into the weak array: the symbols contains lists of symbols, representing objects the object depends on. The collector will now collect symbols, and by checking the weak array I can execute destructors, and then release it for collection (by removing it from a corresponding 'strong' array). The problem is: it will be slower than reference counting, and not be entirely synchronous. The advantage is it will finalise unreachable objects with circular references. Also, reference counting can be added anyhow, to make non-circular finalisation synchronous. Hmm. Comments? ------------------------------------------------------- John Skaller email: skaller@maxtal.com.au http://www.maxtal.com.au/~skaller phone: 61-2-96600850 snail: 10/1 Toxteth Rd, Glebe NSW 2037, Australia ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: GC with finalisation? 1999-08-24 3:36 John Skaller 1999-08-26 21:09 ` Markus Mottl @ 1999-08-27 7:13 ` STARYNKEVITCH Basile 1999-08-28 4:29 ` John Skaller 1999-08-28 21:49 ` Markus Mottl 1 sibling, 2 replies; 7+ messages in thread From: STARYNKEVITCH Basile @ 1999-08-27 7:13 UTC (permalink / raw) To: John Skaller; +Cc: caml-list [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #1: Type: text/plain; charset=us-ascii, Size: 1919 bytes --] >>>>> "John" == John Skaller <skaller@maxtal.com.au> writes: John> I could use some advice on this. I'm implementing a Python John> interpreter/compiler in Ocaml. Python class instances John> support __del__ methods -- that is, destructors. 3 choices are then proposed. I suggest a fourth one: 4) patch the existing ocaml 2.02 garbage collector to allow finalized objects containing (for example) a single Ocaml value: this is not very difficult; you'll have to create a new tag, i.e. add #define Finalwithvalue_tag 249 to mlvalues.h then grep for Final_tag in the source, and add the case for Finalwithvalue_tag (which should handle as value its second word after the header, the first being the finalizer). Notice that ocaml finalizer (including those alreday provided thru Final_tag) should *not* allocate any value in the Ocaml heap (in contrast with Java finalizers). Your python finalizer could, for example, add the just finalized Python object to a list -which is malloc-ed and free-d in C- (or perhaps send it on a pipe to yourself) which should be scanned at appropriate moments. On the other hand, I consider that finalized objects supported by the language (such as provided by Java & Python) are a mistake. In my opinion, finalized objects should not have a user-provided finalizer, but should only deal with system resources (file descriptors, X11 windows, coded only in C) N.B. Any opinions expressed here are only mine, and not of my organization. N.B. Les opinions exprimees ici me sont personnelles et n engagent pas le CEA. --------------------------------------------------------------------- Basile STARYNKEVITCH ---- Commissariat à l Energie Atomique DTA/LETI/DEIN/SLA * CEA/Saclay b.528 (p111f) * 91191 GIF/YVETTE CEDEX * France phone: 1,69.08.60.55; fax: 1.69.08.83.95 home: 1,46.65.45.53 email: Basile point Starynkevitch at cea point fr ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: GC with finalisation? 1999-08-27 7:13 ` STARYNKEVITCH Basile @ 1999-08-28 4:29 ` John Skaller 1999-08-28 21:49 ` Markus Mottl 1 sibling, 0 replies; 7+ messages in thread From: John Skaller @ 1999-08-28 4:29 UTC (permalink / raw) To: STARYNKEVITCH Basile; +Cc: caml-list At 09:13 27/08/99 +0200, STARYNKEVITCH Basile wrote: >>>>>> "John" == John Skaller <skaller@maxtal.com.au> writes: > > John> I could use some advice on this. I'm implementing a Python > John> interpreter/compiler in Ocaml. Python class instances > John> support __del__ methods -- that is, destructors. > >3 choices are then proposed. I suggest a fourth one: > > 4) patch the existing ocaml 2.02 garbage collector to allow finalized > objects containing (for example) a single Ocaml value: > >this is not very difficult; I had thought of that, but I didn't include it in my list for two reasons: 1) I do not have the experience to undertake this task without losing confidence in the resulting system. I have that confidence now because I know the existing collector was written by experts and has been tested by many users. 2) It will mean my clients cannot build my product, without installing and compiling a patched version of Ocaml: bytecode versions won't work either. >you'll have to create a new tag, i.e. add > > #define Finalwithvalue_tag 249 > >to mlvalues.h > > then grep for Final_tag in the source, and add the case for > Finalwithvalue_tag (which should handle as value its second word > after the header, the first being the finalizer). > >Notice that ocaml finalizer (including those alreday provided thru >Final_tag) should *not* allocate any value in the Ocaml heap (in >contrast with Java finalizers). I see. This could be a problem in the first instance, since the __del__ methods of Python class instances can execute arbitrary code, invoking my ocaml Python interpreter: this will certainly cause heap allocations. [This does not rule out a solution using indirection] >On the other hand, I consider that finalized objects supported by the >language (such as provided by Java & Python) are a mistake. I have a strong tendancy to agree with you; but I have a problem implementing a language which does support automatic finalisation, which I myself have used heavily. So I need some kind of solution anyhow. I have a tendancy _not_ to patch ocaml for just this reason: however, I do wonder how other people handle finalisation issues; including (but not limited to) releasing system resources. >In my opinion, finalized objects should not have a user-provided finalizer, >but should only deal with system resources (file descriptors, X11 >windows, coded only in C) That has the disadvantage that it only handles system resources for which you have a suitable implementation. It makes sense to create 'user defined' resources, and manage them in ocaml. It would seem that a module defining a 'generic' collector with finalisation could help solve this and other related problems; without compromising the ocaml memory collector. Such a collector would have to be 'layered ontop' of a system, instead of leveraging the ocaml collector, but that would probably be an advantage in many ways: collecting non-memory resources may well require different semantics than optimised memory collection. ------------------------------------------------------- John Skaller email: skaller@maxtal.com.au http://www.maxtal.com.au/~skaller phone: 61-2-96600850 snail: 10/1 Toxteth Rd, Glebe NSW 2037, Australia ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: GC with finalisation? 1999-08-27 7:13 ` STARYNKEVITCH Basile 1999-08-28 4:29 ` John Skaller @ 1999-08-28 21:49 ` Markus Mottl 1 sibling, 0 replies; 7+ messages in thread From: Markus Mottl @ 1999-08-28 21:49 UTC (permalink / raw) To: STARYNKEVITCH Basile; +Cc: OCAML > On the other hand, I consider that finalized objects supported by the > language (such as provided by Java & Python) are a mistake. In my > opinion, finalized objects should not have a user-provided finalizer, > but should only deal with system resources (file descriptors, X11 > windows, coded only in C) I once asked for finalizers (for objects) in this list, but I was pointed to problems which I hadn't thought about: for example, user defined finalizers might place the object to be reclaimed in a (global) reference again - what is the garbage collector supposed to do in such cases? Other kinds of side effects (eg. calling methods of other objects) could lead to similar problems. Another argument against finalization is the following: if your program enters a very time-consuming loop in which it doesn't allocate anything, the garbage collector will not be called and the finalization of values/objects might be delayed for a long time. Actually, this is even a problem in languages with stack allocated values: as long as the value is not out of scope, the destructor will not be called (unless the compiler manages to see that the object can be safely finalized and deallocated earlier - which is probably more than tough). Concurrent environments (eg. when doing network communication) may be adversly effected by such delays. Developers of libraries have no control about this, because this problem depends on the way the programmer makes use of values/objects. There are other means to "finalize" values: instead of using destructors/finalizers, you could provide for higher-order functions that do the job. Eg. you want that a network connection is opened, used in some way and closed again (but immediately!). This could look as follows: let do_connection f = let con = open_connection (...) in f con; close_connection con If you provide only such functions to the user, he doesn't have to care about closing connections or do some other finalization and he still has the guarantee that connections are closed immediately - which might allow the server on the other side of the connection to do something more useful than wait... Regards, Markus Mottl -- Markus Mottl, mottl@miss.wu-wien.ac.at, http://miss.wu-wien.ac.at/~mottl ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~1999-08-30 13:57 UTC | newest] Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- [not found] <3.0.6.32.19990830144523.00964e20@mail.triode.net.au> 1999-08-30 12:31 ` GC with finalisation? Markus Mottl 1999-08-24 3:36 John Skaller 1999-08-26 21:09 ` Markus Mottl 1999-08-27 5:10 ` John Skaller 1999-08-27 7:13 ` STARYNKEVITCH Basile 1999-08-28 4:29 ` John Skaller 1999-08-28 21:49 ` Markus Mottl
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox