* [Caml-list] Dynlink plugin reevaluates modules of main program
From: Frédéric Fort @ 2018-11-08 10:05 UTC
To: caml-list

Hello,

I have an existing program and would like to make its functionality extensible with plugins.
Simplified, my code structure looks as follows:
- a.ml : "main module" of the program
- b.ml : additional definitions used in a.ml
- c.ml : interface for plugins (a collection of function refs)
- d.ml : plugin I would like to load

Now, d.ml uses values defined in b.ml. Some of them are of type string ref,
and it seems that the code of b.ml is reevaluated when I call
Dynlink.loadfile "/path/to/d.cmxs", which resets them to the empty string.

Is there a way to prevent this from happening?
Using allow_only and prohibit is not an option, since multiple plugins would each reevaluate C
and undo each other's modifications.

Yours sincerely,
Frédéric Fort

P.S.: Here follows a minimal working example.
I compiled it with

  ocamlbuild -use-ocamlfind -lib dynlink a.native
  ocamlbuild -use-ocamlfind d.cmxs

a.ml:

  open Format

  let _ =
    B.str := "abc";
    printf "%s\n" !B.str;
    begin
      try
        Dynlink.loadfile "./_build/d.cmxs"
      with Dynlink.Error err ->
        failwith (Dynlink.error_message err)
    end;
    printf "%s\n" !B.str;
    match !C.f with
    | Some f -> printf "%s\n" (f 0)
    | None -> ()

b.ml:

  let str = ref ""

c.ml:

  let f : (int -> string) option ref = ref None

d.ml:

  let _ =
    C.f := Some (fun x -> !B.str ^ string_of_int x)
* Re: [Caml-list] Dynlink plugin reevaluates modules of main program
From: Nicolás Ojeda Bär @ 2018-11-08 10:23 UTC
To: frederic.fort; +Cc: OCaml Mailing List

Dear Frédéric,

The reason for this behaviour is that `ocamlbuild` is linking b.cmx (and c.cmx for that matter) into your plugin, when in fact you want it to use the copy of B (and C) linked into your main program.

You can see the right behaviour if you build your plugin by hand, e.g. by doing

  ocamlopt -shared d.cmx -o d.cmxs

and then running a.native.

If I remember correctly, you need to use a .mldylib or .mllib file to tell `ocamlbuild` explicitly which modules to link into your plugin (in your case, just D). Someone more knowledgeable about `ocamlbuild` may want to chime in.

Hope it helps!

Best wishes,
Nicolás
* Re: [Caml-list] Dynlink plugin reevaluates modules of main program
From: Gabriel Scherer @ 2018-11-08 10:32 UTC
To: frederic.fort; +Cc: caml users

Ocamlbuild knows of two ways to build .cmxs files:
- from a .cmxa file, repackaged for dynamic loading
- from a .mldylib, listing the modules to be included (like .mllib for .cm{x,}a)

(See "ocamlbuild -documentation | grep cmxs" to check this.)

Your build uses the first approach, and this results in d *and its dependencies* being packaged in d.cmxa and then d.cmxs. (See the log, or use -classic-display to see the commands.)

The behavior you expect requires the second approach. You can obtain it by adding d.mldylib:

  D
* Re: [Caml-list] Dynlink plugin reevaluates modules of main program
From: Ivan Gotovchits @ 2018-11-08 13:02 UTC
To: frederic.fort; +Cc: caml-list

Hi Frederic,

You're catching a glimpse of undefined behavior that occurs when the OCaml runtime reloads a compilation unit that is already loaded. It is a well-known bug (MPR#4208, MPR#4229, MPR#4839, MPR#6462, MPR#6957, MPR#6950) which is not yet fixed [1]. In the luckier, general case it leads to a segmentation fault. But sometimes it may just flip some bits, turn true into false, and ... happy debugging. The crux of the problem is that the runtime not only reevaluates OCaml values, but also resets the roots table and other runtime data structures, which breaks GC invariants and sends the GC on a rampage through your data.

That is not to say that you can't load code dynamically in OCaml. You can, and we do this successfully in BAP, which uses plugins very extensively. It just means that you can't trust the runtime to take care of correctness and need to ensure it yourself. Basically, your loader must track which compilation units are already loaded, and your plugin must contain meta information that tells the loader which compilation units it requires and which it provides. This requires quite a bit of cooperation from all parts. In BAP we solved it in the following way:

1) Developed a `bapbuild` tool, which is ocamlbuild enhanced with a plugin [2] that knows how to build `*.plugin` files. A plugin is a zip file under the hood with a fixed layout (called a bundle in our parlance). It contains a MANIFEST file which includes the list of required libraries and a list of provided units, along with some meta information and, of course, the cmxs (and cma) for the code itself. Optionally, the bundle may include all the dependent libraries (to make the plugin loadable in environments where the required libraries are not provided). The `bapbuild` tool packages all the dependencies by default, and since some libraries in the OPAM universe do not provide `cmxs` at all, it also builds cmxs for them and packages them into the plugin.

2) Developed a `bap_plugins` runtime library [3] which loads plugins, fulfilling their dependencies and ensuring that no units are loaded twice.

3) The host program (which loads plugins) may (and will) also contain some compilation units, as it is linked from a set of compilation units that are either local to the project or come from external libraries. So we need some cooperation from the build system to tell us which units are already loaded (alternatively we could parse the ELF structures of the host binary, but this doesn't sound like a very portable or robust solution). We use the `findlib.dynload` library, which enables such cooperation by storing the list of libraries and packages that were used to build a binary in an internal data structure. We wrote a small ocamlbuild plugin [4] that enables this, and the rest is done by ocamlfind (which actually generates a file and links it into the host binary).

Everything is under the MIT license, so feel free to use it as you wish. Apart from the bap prefix, those tools are pretty independent and could be generalized with all the BAP-specific parts stripped away.
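The loader-side bookkeeping described above (track which units are loaded, and check each plugin's required and provided units before loading it) can be sketched roughly as follows. This is an illustration only, not the code in `bap_plugins`: the `manifest` record, the `outcome` type, and all function names are made up for the example, and the actual loading function (which would be `Dynlink.loadfile` in a real program) is passed in as a parameter so that the sketch runs without the dynlink library.

```ocaml
(* Hypothetical sketch, not BAP's actual code: a loader that refuses to
   load a compilation unit twice. *)
module Units = Set.Make (String)

(* Hypothetical manifest: what a plugin needs and what it brings. *)
type manifest = {
  path : string;           (* path to the .cmxs file *)
  requires : string list;  (* units that must already be loaded *)
  provides : string list;  (* units contained in the .cmxs *)
}

type outcome =
  | Loaded
  | Missing of string list    (* required units that are not loaded *)
  | Duplicate of string list  (* units that would be loaded twice *)

(* Units known to be loaded so far. *)
let loaded = ref Units.empty

let mark_loaded units =
  loaded := List.fold_left (fun set u -> Units.add u set) !loaded units

(* The host program must declare the units it was linked from. *)
let register_host_units = mark_loaded

(* [load ~loadfile m] checks [m] against [loaded] and only then calls
   [loadfile]. In a real program [loadfile] would be [Dynlink.loadfile]. *)
let load ~loadfile m =
  let missing = List.filter (fun u -> not (Units.mem u !loaded)) m.requires in
  let dup = List.filter (fun u -> Units.mem u !loaded) m.provides in
  match missing, dup with
  | _ :: _, _ -> Missing missing
  | [], _ :: _ -> Duplicate dup
  | [], [] ->
    loadfile m.path;
    mark_loaded m.provides;
    Loaded

let show = function
  | Loaded -> "loaded"
  | Missing us -> "missing: " ^ String.concat ", " us
  | Duplicate us -> "already loaded: " ^ String.concat ", " us

let () =
  (* Unit names taken from the example in this thread. *)
  register_host_units ["A"; "B"; "C"];
  let fake_loadfile path = Printf.printf "loading %s\n" path in
  let d = { path = "d.cmxs"; requires = ["B"; "C"]; provides = ["D"] } in
  print_endline (show (load ~loadfile:fake_loadfile d));
  (* A second attempt is rejected instead of reevaluating D. *)
  print_endline (show (load ~loadfile:fake_loadfile d))
```

With this kind of check, a plugin whose cmxs accidentally bundles B would be reported as a duplicate rather than loaded, because B is already registered as a host unit.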
Best wishes,
Ivan Gotovchits

[1]: https://github.com/ocaml/ocaml/pull/1063
[2]: https://github.com/BinaryAnalysisPlatform/bap/blob/master/lib/bap_build/bap_build.ml
[3]: https://github.com/BinaryAnalysisPlatform/bap/blob/master/lib/bap_plugins/bap_plugins.ml
[4]: https://github.com/BinaryAnalysisPlatform/bap/blob/master/myocamlbuild.ml.in#L41-L85
* Re: [Caml-list] Dynlink plugin reevaluates modules of main program
From: Nicolás Ojeda Bär @ 2018-11-15 16:35 UTC
To: ivg; +Cc: frederic.fort, OCaml Mailing List

Dear list,

As a follow-up to Ivan's informative email, I am happy to report that the PR mentioned in his email (https://github.com/ocaml/ocaml/pull/1063) has been merged, so the fix for this long-standing bug will be included in OCaml 4.08 when it is released.

Best wishes,
Nicolás