* [Caml-list] The verdict on "%identity"
From: Dario Teixeira @ 2012-11-19 17:49 UTC
To: OCaml mailing-list

Hi,

I've found conflicting information regarding the use of "%identity",
which I hope to see clarified.

Let's consider a typical example where a module defines an abstract
type t and provides (de)serialisation functions of_string/to_string.
Moreover, the actual implementation of t uses a string, and the
(de)serialisation functions are just identities:

  module Foo:
  sig
    type t

    val of_string: string -> t
    val to_string: t -> string
  end =
  struct
    type t = string

    let of_string x = x
    let to_string x = x
  end

In practice, it's not unusual for such code to be implemented using
the compiler's "%identity" builtin, all in the name of performance:

  module Foo:
  sig
    type t

    external of_string: string -> t = "%identity"
    external to_string: t -> string = "%identity"
  end =
  struct
    type t = string

    external of_string: string -> t = "%identity"
    external to_string: t -> string = "%identity"
  end

I realise that the use of "%identity" is dangerous. This is, after all,
how Obj.magic is defined. Moreover, it uglifies interface definitions
and makes a mockery of the abstraction. However, on the assumption that
ocamlopt won't otherwise optimise away the no-op across module boundaries,
the use of "%identity" may well be justified for performance reasons.

With all the above in mind, I have two questions:

1) Is the assumption correct that today's ocamlopt won't optimise no-ops
   across module boundaries? (I know that ocamlopt does not generally engage
   in MLton-style whole-program optimisation, but is this also true for
   low-hanging fruit such as the first example above?)

2) Consider the code below. For which modules can one expect of_string calls
   to be optimised across module boundaries?

  module type SIG1 = sig type t val of_string: string -> t end
  module type SIG2 = sig type t external of_string: string -> t = "%identity" end

  module Impl1 = struct type t = string let of_string x = x end
  module Impl2 = struct type t = string external of_string: string -> t = "%identity" end

  module A: SIG1 = Impl1
  module B: SIG1 = Impl2
  module C: SIG2 = Impl1
  module D: SIG2 = Impl2

Thank you in advance for your time!
Best regards,
Dario Teixeira
* Re: [Caml-list] The verdict on "%identity"
From: Török Edwin @ 2012-11-19 18:02 UTC
To: caml-list

On 11/19/2012 07:49 PM, Dario Teixeira wrote:
> In practice, it's not unusual for such code to be implemented using
> the compiler's "%identity" builtin, all in the name of performance:
>
>   module Foo:
>   sig
>     type t

Wouldn't 'type t = private string' help the compiler optimize this?

>
>     external of_string: string -> t = "%identity"
>     external to_string: t -> string = "%identity"
>   end =
>   struct
>     type t = string
>
>     external of_string: string -> t = "%identity"
>     external to_string: t -> string = "%identity"
>   end
* Re: [Caml-list] The verdict on "%identity"
From: Dario Teixeira @ 2012-11-19 18:18 UTC
To: Török Edwin, caml-list

Hi,

> Wouldn't 'type t = private string' help the compiler optimize this?

Possibly, though the semantics would change: what before was
an abstract type is now translucent (i.e., not quite transparent).

Regards,
Dario
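The semantic change mentioned here can be sketched as follows. This is only an illustration: the module names Abstract and Translucent are made up, and the two commented-out lines show what the type checker is expected to reject in each case.

```ocaml
(* Fully abstract: clients cannot see that t is a string. *)
module Abstract : sig
  type t
  val of_string : string -> t
  val to_string : t -> string
end = struct
  type t = string
  let of_string x = x
  let to_string x = x
end

(* Translucent: clients can read the representation but cannot forge values. *)
module Translucent : sig
  type t = private string
  val of_string : string -> t
end = struct
  type t = string
  let of_string x = x
end

let a = Abstract.of_string "foo"
let s1 = Abstract.to_string a      (* has to go through the function *)

let p = Translucent.of_string "bar"
let s2 = (p :> string)             (* explicit coercion, no function call *)

(* Both of these are rejected if uncommented:
   let bad1 : Translucent.t = "baz"    (a string is not a Translucent.t)
   let bad2 = String.length a          (an Abstract.t is not a string) *)
```

With the private version, reading a value back as a string needs no to_string at all, but every client can now see that the representation is a string.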
* Re: [Caml-list] The verdict on "%identity"
From: David House @ 2012-11-19 18:28 UTC
To: Dario Teixeira; +Cc: Török Edwin, caml-list

If you wanted to investigate this yourself, you could compile with -S
and look at the generated assembly. For such short functions, this is
generally not very hard.

On Mon, Nov 19, 2012 at 6:18 PM, Dario Teixeira <darioteixeira@yahoo.com> wrote:
> Hi,
>
>> Wouldn't 'type t = private string' help the compiler optimize this?
>
> Possibly, though the semantics would change: what before was
> an abstract type is now translucent (ie, not quite transparent).
>
> Regards,
> Dario
>
> --
> Caml-list mailing list. Subscription management and archives:
> https://sympa.inria.fr/sympa/arc/caml-list
> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
> Bug reports: http://caml.inria.fr/bin/caml-bugs
* Re: [Caml-list] The verdict on "%identity"
From: Gabriel Scherer @ 2012-11-20 9:53 UTC
To: David House; +Cc: Dario Teixeira, Török Edwin, caml-list

On Mon, Nov 19, 2012 at 7:28 PM, David House <dhouse@janestreet.com> wrote:
> If you wanted to investigate this yourself, you could compile with -S
> and look at the generated assembly. For such short functions, this is
> generally not very hard.

Indeed:

  cat test.ml

  module type SIG1 = sig type t val of_string: string -> t end
  module type SIG2 = sig type t external of_string: string -> t = "%identity" end

  module Impl1 = struct type t = string let of_string x = x end
  module Impl2 = struct type t = string external of_string: string -> t = "%identity" end

  module A: SIG1 = Impl1
  module B: SIG1 = Impl2
  (* module C: SIG2 = Impl1 *)
  module D: SIG2 = Impl2

  let testA = A.of_string "foo"
  let testB = B.of_string "bar"
  let testD = D.of_string "baz"

(I commented C out because it makes no sense to me, semantically, and
it's rejected by the compiler.)

  ocamlopt -c -S test.ml
  less test.s

The (relevant part of the) result on my machine, which corresponds to
the compilation of testA, testB and testD:

  camlTest__entry:
  [...]
          movl    $camlTest__3, %eax
          movl    %eax, camlTest + 20
          movl    $camlTest__2, %eax
          movl    %eax, camlTest + 24
          movl    $camlTest__1, %eax
          movl    %eax, camlTest + 28
  [...]

All compiled in the same way. There may be a difference for calls
across compilation units: in the absence of the .cmx, no inlining
would be performed. My guess would be that in the presence of the .cmx
we should get the same final result, but I must say I don't really
care about performance on this front.

Note that there is however an important difference with private
definitions: with private, the cast from t to string is not only
erased by the compiler, it is a *coercion* that can be lifted to casts
to larger datatypes. You can coerce a (t list) into a (string list)
and this is also a no-op in the dynamic semantics. That's much
stronger than what you get from %identity.

(This suggests that, with the explicit subtyping we have in OCaml,
there would be a case for inter-coercible types: a way to define a t
such that for example (t :> string) and (string :> t), but not
(t = string). This doesn't increase the type safety of arbitrary
programs, but it would allow programmers to force abstraction-breaking
to be explicit, with no performance cost in either direction.)
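The lifting of the coercion to lists can be sketched like this. It is a minimal illustration, reusing the name Foo from the start of the thread with a private type in the signature; the values xs and ys are made up for the example.

```ocaml
module Foo : sig
  type t = private string
  val of_string : string -> t
end = struct
  type t = string
  let of_string x = x
end

let xs : Foo.t list = List.map Foo.of_string ["a"; "b"; "c"]

(* The coercion is lifted through the covariant list constructor:
   the whole list is retyped at once and is not traversed at run time. *)
let ys = (xs :> string list)

(* With a fully abstract t and an identity to_string, the closest
   equivalent would be List.map Foo.to_string xs, which builds a new list. *)
```

The same trick does not apply to invariant containers such as arrays, where the type checker rejects the coercion.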
* Re: [Caml-list] The verdict on "%identity"
From: Pierre Chambart @ 2012-11-20 10:25 UTC
To: caml-list

To know what will be generated after inlining I prefer to use -dcmm,
which is closer to the original code than the assembly and still
shows the inlining result (and if you are using the svn trunk, you can
use -dclambda, which shows higher-level code, without allocations and
boxing).

Inlining of OCaml functions works the same way inside a module or
cross-module. It looks at the function size and, if it is smaller than
a certain threshold, the function will be inlined. To know whether this
will happen to your function, look at the result of ocamlobjinfo.

For instance:

module.mli:

  type t
  external id_prim : string -> t = "%identity"
  val id : string -> t
  val f : int -> int
  val g : int -> int

module.ml:

  type t = string
  external id_prim : 'a -> 'a = "%identity"
  let id x = x
  let f x = x + x + x + x
  let g x = x + x + x + x + x + x + x + x

ocamlobjinfo module.cmx:

  ...
  Approximation:
    (0: function camlIdentity__id_1010 arity 1 (closed) (inline) -> _;
     1: function camlIdentity__f_1012 arity 1 (closed) (inline) -> _;
     2: function camlIdentity__g_1014 arity 1 (closed) -> _)
  ...

The functions id and f will be inlined whatever the context of the call
is, but g won't be.

If you want a function to be inlined, you can use the -inline option of
ocamlopt to increase the maximum size of inlined functions in the
module. Note that recursive functions can't be inlined.

The usage of private types is different. When using generic
comparison/equality/hash/set in an array, the compiler generates
optimised code when the type is known to be one of the fast cases:

  module M1 : sig
    type t = private int
  end = struct type t = int end
  module M2 : sig
    type t
  end = struct type t = int end

  let a x y = x > y
  let b (x:int) y = x > y
  let c (x:M1.t) y = x > y
  let d (x:M2.t) y = x > y

The result of ocamlopt -dcmm:

  (function camlCompare__a_1014 (x/1015: addr y/1016: addr)
   (extcall "caml_greaterthan" x/1015 y/1016 addr))

  (function camlCompare__b_1017 (x/1018: addr y/1019: addr)
   (+ (<< (> x/1018 y/1019) 1) 1))

  (function camlCompare__c_1020 (x/1021: addr y/1022: addr)
   (+ (<< (> x/1021 y/1022) 1) 1))

  (function camlCompare__d_1023 (x/1024: addr y/1025: addr)
   (extcall "caml_greaterthan" x/1024 y/1025 addr))

Here b and c will be a lot faster than a and d.
Using a private type allows this information to be kept across modules.
--
Pierre
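As a small sketch of how the private-int pattern might be used in practice: the module Id, its constructor make and the functions newer and max_id below are hypothetical names, not taken from the thread. Whether the comparison is actually specialised depends on the compiler version and is best confirmed with -dcmm as described above.

```ocaml
module Id : sig
  type t = private int
  val make : int -> t   (* hypothetical smart constructor: rejects negatives *)
end = struct
  type t = int
  let make n = if n < 0 then invalid_arg "Id.make" else n
end

(* The annotations let the compiler see that an Id.t is an int underneath,
   which is the situation of function c in the message above. *)
let newer (x : Id.t) (y : Id.t) = x > y

let max_id (x : Id.t) (y : Id.t) : Id.t = if newer x y then x else y
```

The invariant enforced by make is preserved, since clients still cannot build an Id.t from an arbitrary int without going through the constructor.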
* Re: [Caml-list] The verdict on "%identity"
From: Gabriel Scherer @ 2012-11-20 16:19 UTC
To: Pierre Chambart; +Cc: caml-list

This is good advice in general, and using ocamlobjinfo to get inlining
information from the .cmx is indeed a very good idea.

Regarding -dcmm vs. -S, I generally use -dcmm myself (much more
readable), but it is not the right tool in this case. The cmm produced
for my test.ml code does make a difference between the three styles:

  (let testA/1046 (let x/1075 "camlTest__3" x/1075)
    (store (+a "camlTest" 20) testA/1046))
  (let testB/1047 (let prim/1076 "camlTest__2" prim/1076)
    (store (+a "camlTest" 24) testB/1047))
  (let testD/1048 "camlTest__1"
    (store (+a "camlTest" 28) testD/1048))

In fact, the removal of the trivial (let x = foo in x) does not happen
during the inlining passes in closure.ml, but much later, at the
register allocation phase, where there is indeed a strong preference
for e.g. testA/1046 and x/1075 to be given the same register, and the
useless move is erased. I make no claim about how robust this behavior
will be in a different case (e.g. with higher register pressure), but
I'm not sure I really care.

I'd rather have people study the behavior of the compiler on real
performance-critical applications and suggest potential style changes
in the program (or optimization changes in the compiler) in cases
where this really makes a performance difference. Writing code in a
certain way because "the generated code is nicer" is usually not worth
the trouble.
* Re: [Caml-list] The verdict on "%identity"
From: Vincent HUGOT @ 2012-11-20 19:03 UTC
To: caml-list

Is there some place where this fabulous -dcmm switch is documented?

ocamlopt's man and --help pages, as well as the manual, are utterly
uninformative (they either don't mention it or simply say
"undocumented").

V.
* Re: [Caml-list] The verdict on "%identity"
From: Dario Teixeira @ 2012-11-20 20:43 UTC
To: Gabriel Scherer, Pierre Chambart; +Cc: caml-list

Hi,

And thank you, Gabriel and Pierre, for your insights.

> I'd rather have people study the behavior of the compiler on real
> performance-critical applications and suggest potential style changes
> in the program (or optimization changes in the compiler) in cases
> where this really makes a performance difference. Writing code in a
> certain way because "the generated code is nicer" is usually not worth
> the trouble.

Mind you, I'm not especially fond of such low-level trickery myself,
particularly when the trick can cause a segfault if used carelessly,
as is the case with "%identity". On the other hand, it's always good
to have these tricks in the back of your mind -- you never know when
they might come in handy...

Best regards,
Dario Teixeira
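To illustrate why careless use is risky, here is a minimal sketch. The name unsafe_cast is made up; the declaration has the same shape as the Obj.magic definition referred to at the start of the thread. The dangerous case is left commented out on purpose.

```ocaml
(* An unconstrained "%identity" external bypasses the type checker entirely. *)
external unsafe_cast : 'a -> 'b = "%identity"

(* Harmless-looking misuse: a bool reinterpreted as an int.
   Both are immediate values, so nothing crashes here. *)
let n : int = unsafe_cast true

(* Dangerous misuse, left commented out: an int reinterpreted as a string.
   String.length would then try to read a block header from a value
   that is not a pointer, which can crash the program.
   let s : string = unsafe_cast 42
   let _ = String.length s *)
```

Constraining the external to a precise type such as string -> t, as in the Foo examples, removes most of this risk, provided the declared types really do share a representation.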