Hash clash in polymorphic variants

Mailing list for all users of the OCaml language and system.

* Hash clash in polymorphic variants
@ 2008-01-10 17:09 Jon Harrop
  2008-01-10 20:35 ` [Caml-list] " Eric Cooper
  2008-01-11  0:15 ` [Caml-list] " Jacques Garrigue
  0 siblings, 2 replies; 37+ messages in thread
From: Jon Harrop @ 2008-01-10 17:09 UTC (permalink / raw)
  To: caml-list


ISTR advice that constructors sharing the first few characters should be 
avoided in order to reduce the likelihood of clashing hash values for 
polymorphic variants. Is that right?

-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?e


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Caml-list] Hash clash in polymorphic variants
  2008-01-10 17:09 Hash clash in polymorphic variants Jon Harrop
@ 2008-01-10 20:35 ` Eric Cooper
  2008-01-10 21:24   ` Jon Harrop
  2008-01-11  0:15 ` [Caml-list] " Jacques Garrigue
  1 sibling, 1 reply; 37+ messages in thread
From: Eric Cooper @ 2008-01-10 20:35 UTC (permalink / raw)
  To: caml-list

On Thu, Jan 10, 2008 at 05:09:13PM +0000, Jon Harrop wrote:
> ISTR advice that constructors sharing the first few characters should be 
> avoided in order to reduce the likelihood of clashing hash values for 
> polymorphic variants. Is that right?

I don't think it's worth worrying about.

I wrote a program a while ago to look into this.  I never saw any
"human-sensible" collisions (between two identifiers that a person
might have chosen). And if you're producing gensyms in a program, you
can just check ahead of time.

To find a collision with a given identifier, consider each bignum N
that differs by a multiple of 2^31 from the identifier's hash value.
Compute the radix-223 representation of N.  If that forms a legal
OCaml identifier, then you've found a collision.

For example, Eric_Cooper collides with azdwbie, c7diagq, hlChrkt,
NSaServ, and SaupDOF, to pick just a few.
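
For illustration, the search described above can be sketched in OCaml.
This is not the program mentioned in this message. It assumes that the
tag hash accumulates the bytes of the name in radix 223 and keeps the
low 31 bits, and that native ints are 63 bits wide (a 64-bit system).
The names hash31, decode and collisions are made up for the sketch.

```ocaml
(* Assumed tag hash: radix-223 accumulation, reduced to 31 bits. *)
let hash31 s =
  let accu = ref 0 in
  String.iter (fun c -> accu := 223 * !accu + Char.code c) s;
  !accu land ((1 lsl 31) - 1)

(* Letters, digits and underscore; the first character must be a letter. *)
let is_letter c = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
let is_ident_char c = is_letter c || (c >= '0' && c <= '9') || c = '_'

(* Read n as a radix-223 number, most significant digit first, and
   return the string it spells if that string looks like an identifier. *)
let decode n =
  let rec digits n acc =
    if n = 0 then acc else digits (n / 223) (Char.chr (n mod 223) :: acc)
  in
  match digits n [] with
  | c :: _ as cs when is_letter c && List.for_all is_ident_char cs ->
      Some (String.of_seq (List.to_seq cs))
  | _ -> None

(* Collect identifiers colliding with [name] by trying the first
   [limit] values of the form hash + k * 2^31. *)
let collisions ?(limit = 200_000) name =
  let h = hash31 name in
  let rec go k acc =
    if k >= limit then List.rev acc
    else
      match decode (h + (k lsl 31)) with
      | Some s when s <> name -> go (k + 1) (s :: acc)
      | _ -> go (k + 1) acc
  in
  go 0 []

let () =
  (* One of the pairs quoted above. *)
  assert (hash31 "Eric_Cooper" = hash31 "azdwbie");
  List.iter print_endline (collisions "Eric_Cooper")
```

The sketch leaves out the sign adjustment that the compiler is believed
to apply to large hash values, which does not change whether two names
collide, and it does not filter out OCaml keywords.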

-- 
Eric (call me SaupDOF) Cooper             e c c @ c m u . e d u


* Re: [Caml-list] Hash clash in polymorphic variants
  2008-01-10 20:35 ` [Caml-list] " Eric Cooper
@ 2008-01-10 21:24   ` Jon Harrop
  2008-01-10 21:40     ` David Allsopp
  0 siblings, 1 reply; 37+ messages in thread
From: Jon Harrop @ 2008-01-10 21:24 UTC (permalink / raw)
  To: caml-list

On Thursday 10 January 2008 20:35:34 Eric Cooper wrote:
> On Thu, Jan 10, 2008 at 05:09:13PM +0000, Jon Harrop wrote:
> > ISTR advice that constructors sharing the first few characters should be
> > avoided in order to reduce the likelihood of clashing hash values for
> > polymorphic variants. Is that right?
>
> I don't think it's worth worrying about.
>
> I wrote a program a while ago to look into this.  I never saw any
> "human-sensible" collisions (between two identifiers that a person
> might have chosen). And if you're producing gensyms in a program, you
> can just check ahead of time.

I'm interested in automatically translating the GL_* enum from OpenGL into 
polymorphic variants. So although it is generated code I have little control 
over it, e.g. I cannot change the translation as OpenGL gets extended because 
code will already be using the existing names.

Still, maybe I'm over-reacting. ;-)

-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?e



* RE: [Caml-list] Hash clash in polymorphic variants
  2008-01-10 21:24   ` Jon Harrop
@ 2008-01-10 21:40     ` David Allsopp
  2008-01-11 13:30       ` Kuba Ober
  0 siblings, 1 reply; 37+ messages in thread
From: David Allsopp @ 2008-01-10 21:40 UTC (permalink / raw)
  To: caml-list

Jon Harrop wrote:
> On Thursday 10 January 2008 20:35:34 Eric Cooper wrote:
> > On Thu, Jan 10, 2008 at 05:09:13PM +0000, Jon Harrop wrote:
> > > ISTR advice that constructors sharing the first few characters should
> > > be avoided in order to reduce the likelihood of clashing hash values
> > > for polymorphic variants. Is that right?
> >
> > I don't think it's worth worrying about.
> >
> > I wrote a program a while ago to look into this.  I never saw any
> > "human-sensible" collisions (between two identifiers that a person
> > might have chosen). And if you're producing gensyms in a program, you
> > can just check ahead of time.
>
> I'm interested in automatically translating the GL_* enum from OpenGL into
> polymorphic variants. So although it is generated code I have little
> control over it, e.g. I cannot change the translation as OpenGL gets
> extended because code will already be using the existing names.
>
> Still, maybe I'm over-reacting. ;-)

I presume you're worried about the bindings clashing internally rather than
someone who uses the library happening to use a variant that clashes?

You can do something about it - when you're generating your bindings, you
can use the hash_variant() C function to detect the collisions yourself. If
you detect one, you can either issue *your own* warning while generating the
bindings allowing you to specify specific renaming for the program
generating your bindings or you could append digits to the names until the
collisions disappear (which is likely, though not guaranteed, to happen
quickly).
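
A minimal sketch of the second option, written in OCaml rather than
against the C function. It assumes the tag hash is the radix-223
accumulation reduced to 31 bits, that the generator sees each name only
once, and that a 64-bit system is used. The names hash_variant (here a
local OCaml function) and assign_tags are hypothetical.

```ocaml
(* Assumed tag hash: radix-223 accumulation, reduced to 31 bits. *)
let hash_variant s =
  let accu = ref 0 in
  String.iter (fun c -> accu := 223 * !accu + Char.code c) s;
  !accu land ((1 lsl 31) - 1)

(* Give each name a tag whose hash clashes with no earlier tag.  A
   clashing name gets "1", "2", ... appended until its hash is free,
   and a warning is printed so that a better name can be chosen. *)
let assign_tags names =
  let seen = Hashtbl.create 64 in  (* hash -> tag already using it *)
  let rec fresh name n =
    let tag = if n = 0 then name else name ^ string_of_int n in
    let h = hash_variant tag in
    match Hashtbl.find_opt seen h with
    | None ->
        Hashtbl.add seen h tag;
        tag
    | Some other ->
        Printf.eprintf "warning: `%s clashes with `%s\n" tag other;
        fresh name (n + 1)
  in
  List.map (fun name -> (name, fresh name 0)) names

let () =
  assign_tags [ "Eric_Cooper"; "azdwbie"; "Foo" ]
  |> List.iter (fun (name, tag) -> Printf.printf "%s -> `%s\n" name tag)
```

Because tags are assigned in input order, names already released keep
their tags as long as new names are only ever appended to the list.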

It's slightly ugly, but then the possibility of collisions in the first
place is IMHO ugly too!

David


* Re: [Caml-list] Hash clash in polymorphic variants
  2008-01-10 21:40     ` David Allsopp
@ 2008-01-11 13:30       ` Kuba Ober
  2008-01-11 13:48         ` Jon Harrop
  0 siblings, 1 reply; 37+ messages in thread
From: Kuba Ober @ 2008-01-11 13:30 UTC (permalink / raw)
  To: caml-list

> > > > ISTR advice that constructors sharing the first few characters should
> > > > be avoided in order to reduce the likelihood of clashing hash values
> > > > for polymorphic variants. Is that right?
> > >
> > > I don't think it's worth worrying about.
> >
> > I'm interested in automatically translating the GL_* enum from OpenGL
> > into polymorphic variants. So although it is generated code I have little
>
> I presume you're worried about the bindings clashing internally rather than
> someone who uses the library happening to use a variant that clashes?
>
> You can do something about it - when you're generating your bindings, you
> can use the hash_variant() C function to detect the collisions yourself. If
> you detect one, you can either issue *your own* warning while generating
> the bindings allowing you to specify specific renaming for the program
> generating your bindings or you could append digits to the names until the
> collisions disappear (which is likely, though not guaranteed, to happen
> quickly).
>
> It's slightly ugly, but then the possibility of collisions in the first
> place is IMHO ugly too!

Are those collisions of any real importance? I mean, do they break anything? 
If all they do is imply linearly searching a list of a few elements, for the 
colliding entry, then it's a non-issue?

Cheers, Kuba



* Re: [Caml-list] Hash clash in polymorphic variants
  2008-01-11 13:30       ` Kuba Ober
@ 2008-01-11 13:48         ` Jon Harrop
  2008-01-11 16:14           ` Kuba Ober
  0 siblings, 1 reply; 37+ messages in thread
From: Jon Harrop @ 2008-01-11 13:48 UTC (permalink / raw)
  To: caml-list

On Friday 11 January 2008 13:30:29 Kuba Ober wrote:
> Are those collisions of any real importance? I mean, do they break
> anything? If all they do is imply linearly searching a list of a few
> elements, for the colliding entry, then it's a non-issue?

It would prevent code from compiling so it would be a complete show-stopper.

In this case, there is a chance that a hash clash in names that I have no 
control over would break my OpenGL bindings at some point in the future.

A theoretical solution would be to grow the bindings and avoid clashes in 
identifiers included in later versions of OpenGL by adding random suffixes. 
Although this works in theory, in practice it places the burden of a linear 
search on the programmer who must then sift through the bindings to find out 
if the identifier they want to use happens to have had an internal clash in 
my bindings and, therefore, would require them to use a different identifier.

-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?e


* Re: [Caml-list] Hash clash in polymorphic variants
  2008-01-11 13:48         ` Jon Harrop
@ 2008-01-11 16:14           ` Kuba Ober
  2008-01-11 18:40             ` David Allsopp
  0 siblings, 1 reply; 37+ messages in thread
From: Kuba Ober @ 2008-01-11 16:14 UTC (permalink / raw)
  To: caml-list

On Friday 11 January 2008, Jon Harrop wrote:
> On Friday 11 January 2008 13:30:29 Kuba Ober wrote:
> > Are those collisions of any real importance? I mean, do they break
> > anything? If all they do is imply linearly searching a list of a few
> > elements, for the colliding entry, then it's a non-issue?
>
> It would prevent code from compiling so it would be a complete
> show-stopper.

So what you're saying is that the implementation uses the hash with bucket 
size of 1? That's kinda poor decision, methinks.

Maybe perfect hashes should be used, computed at link time (and at runtime
whenever a module is linked in). The perfect hashing function could probably
be implemented as some sort of table, so that no real code would need to be
generated, just a recomputed decision-tree table. Gperf code could be
adapted for that. The benefit is that there would be no collisions, the hashed
data structure would be very compact, and the cost to regenerate the hash is
amortized. Ideally, one would generate the actual perfect hashing function,
but this is currently only possible in bytecode, right? I mean, toplevel won't
run in native code? Or am I mistaken?

Kuba


* RE: [Caml-list] Hash clash in polymorphic variants
  2008-01-11 16:14           ` Kuba Ober
@ 2008-01-11 18:40             ` David Allsopp
  2008-01-14 12:20               ` Kuba Ober
  0 siblings, 1 reply; 37+ messages in thread
From: David Allsopp @ 2008-01-11 18:40 UTC (permalink / raw)
  To: caml-list

Kuba Ober wrote:
> On Friday 11 January 2008, Jon Harrop wrote:
> > On Friday 11 January 2008 13:30:29 Kuba Ober wrote:
> > > Are those collisions of any real importance? I mean, do they break
> > > anything? If all they do is imply linearly searching a list of a few
> > > elements, for the colliding entry, then it's a non-issue?
> >
> > It would prevent code from compiling so it would be a complete
> > show-stopper.
>
> So what you're saying is that the implementation uses the hash with bucket
> size of 1? That's kinda poor decision, methinks.

I think you're missing the context - there's no hash table. See 18.3.6 in
the manual - the hashed values (and resulting collisions) are to do with the
internal representation of polymorphic variants.

The compiler cannot process code that uses two polymorphic variants whose
tag names will have the same internal representation (and therefore be
incorrectly viewed as having the same value). The test is probably performed
somewhere in the type checker...

An alternative implementation might have been to lookup the tags (in a
perfect hash table) using a system similar to caml_named_value but I imagine
that the present method was preferred because it's simpler (and quite
possibly faster) and collisions are rare (as Eric pointed out) - although in
Jon's case the lack of a guarantee is unfortunate.

Incidentally, and off-the-subject here, using a hash table with a bucket
size of 1 is very important if you need performance guarantees on your hash
table and have some other way of coping with collisions.

David


* Re: [Caml-list] Hash clash in polymorphic variants
  2008-01-11 18:40             ` David Allsopp
@ 2008-01-14 12:20               ` Kuba Ober
  2008-01-14 14:44                 ` Stefan Monnier
  0 siblings, 1 reply; 37+ messages in thread
From: Kuba Ober @ 2008-01-14 12:20 UTC (permalink / raw)
  To: caml-list

On Friday 11 January 2008, David Allsopp wrote:
> Kuba Ober wrote:
> > On Friday 11 January 2008, Jon Harrop wrote:
> > > On Friday 11 January 2008 13:30:29 Kuba Ober wrote:
> > > > Are those collisions of any real importance? I mean, do they break
> > > > anything? If all they do is imply linearly searching a list of a few
> > > > elements, for the colliding entry, then it's a non-issue?
> > >
> > > It would prevent code from compiling so it would be a complete
> > > show-stopper.
> >
> > So what you're saying is that the implementation uses the hash with
> > bucket size of 1? That's kinda poor decision, methinks.
>
> I think you're missing the context - there's no hash table. See 18.3.6 in
> the manual - the hashed values (and resulting collisions) are to do with
> the internal representation of polymorphic variants.
>
> The compiler cannot process code that uses two polymorphic variants whose
> tag names will have the same internal representation (and therefore be
> incorrectly viewed as having the same value). The test is probably
> performed somewhere in the type checker...

Yeah, I sort of put the wagon ahead of the horse. Of course the hashing 
function doesn't imply a hash table.

What I meant was simply that instead of using some fixed hash function, one 
could use a perfect hashing function which is optimal for its known set of 
inputs, and won't ever generate a collision.

The tables that such a function uses to hash its input have to be generated at 
link-time, which means run-time too.

Cheers, Kuba



* Re: Hash clash in polymorphic variants
  2008-01-14 12:20               ` Kuba Ober
@ 2008-01-14 14:44                 ` Stefan Monnier
  2008-01-14 14:56                   ` [Caml-list] " Kuba Ober
  2008-01-14 17:14                   ` Jon Harrop
  0 siblings, 2 replies; 37+ messages in thread
From: Stefan Monnier @ 2008-01-14 14:44 UTC (permalink / raw)
  To: caml-list

> What I meant was simply that instead of using some fixed hash function, one 
> could use a perfect hashing function which is optimal for its known set of 
> inputs, and won't ever generate a collision.

The problem is that the set of inputs is not known at compile time, only
at link time.


        Stefan



* Re: [Caml-list] Re: Hash clash in polymorphic variants
  2008-01-14 14:44                 ` Stefan Monnier
@ 2008-01-14 14:56                   ` Kuba Ober
  2008-01-14 15:37                     ` David Allsopp
                                       ` (2 more replies)
  2008-01-14 17:14                   ` Jon Harrop
  1 sibling, 3 replies; 37+ messages in thread
From: Kuba Ober @ 2008-01-14 14:56 UTC (permalink / raw)
  To: caml-list

On Monday 14 January 2008, Stefan Monnier wrote:
> > What I meant was simply that instead of using some fixed hash function,
> > one could use a perfect hashing function which is optimal for its known
> > set of inputs, and won't ever generate a collision.
>
> The problem is that the set of inputs is not known at compile time, only
> at link time.

As I've said in the cited post, the perfect hash generator would have to be 
invoked at link time, which shouldn't be a big deal.

Cheers, Kuba



* RE: [Caml-list] Re: Hash clash in polymorphic variants
  2008-01-14 14:56                   ` [Caml-list] " Kuba Ober
@ 2008-01-14 15:37                     ` David Allsopp
  2008-01-14 15:44                       ` Kuba Ober
  2008-01-14 15:45                     ` Stefan Monnier
  2008-01-15  3:36                     ` [Caml-list] " Jacques Garrigue
  2 siblings, 1 reply; 37+ messages in thread
From: David Allsopp @ 2008-01-14 15:37 UTC (permalink / raw)
  To: caml-list

Kuba Ober wrote:
> On Monday 14 January 2008, Stefan Monnier wrote:
> > > What I meant was simply that instead of using some fixed hash
> > > function, one could use a perfect hashing function which is optimal
> > > for its known set of inputs, and won't ever generate a collision.
> >
> > The problem is that the set of inputs is not known at compile time, only
> > at link time.
>
> As I've said in the cited post, the perfect hash generator would have to
> be invoked at link time, which shouldn't be a big deal.

Assuming you're talking hypothetically and designing a new runtime then,
yes, it's not a big deal.

However, this scheme could not just be dropped into the present system - it
would not work with dynamic linking because once you've hashed a polymorphic
variant tag-name you drop the name so you can't re-hash when you update your
perfect hashing function... unless you can devise a perfect hashing scheme
that hashes all the old keys to their old values and new ones to
non-clashing new values ;o)

Internally, `Foo is indistinguishable from the int 3505894* - so if
caml_hash_variant("Foo") suddenly changes value mid-program then any
previous instances of `Foo in memory cease to be equal to it!

David

* Try:
# (Obj.magic `Foo : int);;
- : int = 3505894
# (Obj.magic 3505894) = `Foo;;
- : bool = true

I don't know whether caml_hash_variant varies between versions or even
platforms, so the actual number may be different on other systems.
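
The figure above can be reproduced by hand, assuming the tag hash is
the radix-223 accumulation mentioned earlier in the thread: with
'F' = 70 and 'o' = 111, (70 * 223 + 111) * 223 + 111 = 3505894. A small
check, where hash31 is a made-up name for an OCaml rendering of that
assumed hash:

```ocaml
(* Assumed tag hash: radix-223 accumulation, reduced to 31 bits. *)
let hash31 s =
  let accu = ref 0 in
  String.iter (fun c -> accu := 223 * !accu + Char.code c) s;
  !accu land ((1 lsl 31) - 1)

let () =
  (* The arithmetic written out for the three bytes of "Foo". *)
  assert ((70 * 223 + 111) * 223 + 111 = 3505894);
  assert (hash31 "Foo" = 3505894);
  (* Same comparison as in the toplevel session above. *)
  assert ((Obj.magic `Foo : int) = hash31 "Foo")
```

For a tag this short the sum stays below 2^31, so no reduction takes
place and the value should not depend on the word size.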


* Re: [Caml-list] Re: Hash clash in polymorphic variants
  2008-01-14 15:37                     ` David Allsopp
@ 2008-01-14 15:44                       ` Kuba Ober
  2008-01-14 16:03                         ` David Allsopp
  0 siblings, 1 reply; 37+ messages in thread
From: Kuba Ober @ 2008-01-14 15:44 UTC (permalink / raw)
  To: caml-list

On Monday 14 January 2008, David Allsopp wrote:
> Kuba Ober wrote:
> > On Monday 14 January 2008, Stefan Monnier wrote:
> > > > What I meant was simply that instead of using some fixed hash
> > > > function, one could use a perfect hashing function which is optimal
> > > > for its known set of inputs, and won't ever generate a collision.
> > >
> > > The problem is that the set of inputs is not known at compile time, only
> > > at link time.
> >
> > As I've said in the cited post, the perfect hash generator would have to
> > be invoked at link time, which shouldn't be a big deal.
>
> Assuming you're talking hypothetically and designing a new runtime then,
> yes, it's not a big deal.
>
> However, this scheme could not just be dropped into the present system - it
> would not work with dynamic linking because once you've hashed a
> polymorphic variant tag-name you drop the name so you can't re-hash when
> you update your perfect hashing function...

A trivial solution to that is to keep both, as obviously each time an 
equivalent of dlopen() is made, everything has to be rehashed. gperf 
is "slightly" memory-hungry, so surely it'd need to be something using a 
different algorithm. I'm talking hypothetically, but I also think it's a 
weird design decision to use those possibly-colliding hashes. String 
sorting/comparison isn't exactly a CPU killer, so couldn't the original names 
have been used instead? I admit to not knowing many details of the
current implementation, of course ;(

Cheers, Kuba



* RE: [Caml-list] Re: Hash clash in polymorphic variants
  2008-01-14 15:44                       ` Kuba Ober
@ 2008-01-14 16:03                         ` David Allsopp
  0 siblings, 0 replies; 37+ messages in thread
From: David Allsopp @ 2008-01-14 16:03 UTC (permalink / raw)
  To: caml-list

Kuba Ober wrote:
> A trivial solution to that is to keep both, as obviously each time an 
> equivalent of dlopen() is made, everything has to be rehashed. gperf 
> is "slightly" memory-hungry, so surely it'd need to be something using a 
> different algorithm. I'm talking hypothetically, but I also think it's a 
> weird design decision to use those possibly-colliding hashes.

I agree that it's a bit weird - but the clashes are very rare (and the
function was designed to keep them rare for "normal" usage).

>  String sorting/comparison isn't exactly a CPU killer, so couldn't the
>  original names have been used instead?

String comparison is much slower than integer comparison... we're talking
about one CPU instruction compared to a for loop! Jon would never use them
again :o) Not to mention the storage overhead of keeping the tag names in
memory - not great if you've got long lists of `YetAnotherTag.

David


* Re: Hash clash in polymorphic variants
  2008-01-14 14:56                   ` [Caml-list] " Kuba Ober
  2008-01-14 15:37                     ` David Allsopp
@ 2008-01-14 15:45                     ` Stefan Monnier
  2008-01-15  3:36                     ` [Caml-list] " Jacques Garrigue
  2 siblings, 0 replies; 37+ messages in thread
From: Stefan Monnier @ 2008-01-14 15:45 UTC (permalink / raw)
  To: caml-list

>> > What I meant was simply that instead of using some fixed hash function,
>> > one could use a perfect hashing function which is optimal for its known
>> > set of inputs, and won't ever generate a collision.
>> 
>> The problem is that the set of inputs is not known at compile time, only
>> at link time.

> As I've said in the cited post, the perfect hash generator would have to be 
> invoked at link time, which shouldn't be a big deal.

That would require postponing the execution of the hash-function to
link-time or run-time.  Run-time is clearly undesirable, and link-time
adds yet-more complexity to the linker.

It's not a bad idea, obviously, but AFAICT the linker currently is kept
very simple.


        Stefan



* Re: [Caml-list] Re: Hash clash in polymorphic variants
  2008-01-14 14:56                   ` [Caml-list] " Kuba Ober
  2008-01-14 15:37                     ` David Allsopp
  2008-01-14 15:45                     ` Stefan Monnier
@ 2008-01-15  3:36                     ` Jacques Garrigue
  2008-01-15  4:59                       ` Jon Harrop
  2 siblings, 1 reply; 37+ messages in thread
From: Jacques Garrigue @ 2008-01-15  3:36 UTC (permalink / raw)
  To: ober.14; +Cc: caml-list

From: Kuba Ober <ober.14@osu.edu>
> On Monday 14 January 2008, Stefan Monnier wrote:
> > > What I meant was simply that instead of using some fixed hash function,
> > > one could use a perfect hashing function which is optimal for its known
> > > set of inputs, and won't ever generate a collision.
> >
> > The problem is that the set of inputs is not known at compile time, only
> > at link time.
> 
> As I've said in the cited post, the perfect hash generator would have to be 
> invoked at link time, which shouldn't be a big deal.

Unfortunately, this would make marshalling between different programs
much more complicated...

Another advantage of knowing the hash function at compile time is
that you can generate efficient code for pattern matching. Since you
already know the ordering of tags, it is easy to generate a decision
tree. I haven't checked the efficiency of polymorphic variants
recently, but the depth of the decision tree is logarithmic in the
number of tags involved in the pattern matching, and if you can keep
it below 3 or 4 (about 10 tags) you can actually be faster than a
jump table.

Another comparison is with the old implementation for method calls.
Originally ocaml used your idea for methods: method hashes were
generated at initialization time. The scheme for dispatch was a two
level array, compressed by reusing buckets so that you don't use too
much memory. This meant actually 3 array accesses for a method call.
The current scheme reuses variant hashes, and implements a simple
dichotomic search, together with an index cache for each call site.
This doesn't look very efficient, but on small method tables, the
search is almost as fast as the old approach, and if the cache hits
this is much faster...
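
As a rough illustration of the dichotomic search, here is a sketch of
a method table sorted by hash and searched by bisection. It is not the
runtime's code: the per-call-site index cache is left out, the hash is
assumed to be the radix-223 accumulation reduced to 31 bits, and
make_table, lookup and the method names are hypothetical.

```ocaml
(* Assumed tag hash: radix-223 accumulation, reduced to 31 bits. *)
let hash_variant s =
  let accu = ref 0 in
  String.iter (fun c -> accu := 223 * !accu + Char.code c) s;
  !accu land ((1 lsl 31) - 1)

(* Build a table of (hash, implementation) pairs sorted by hash. *)
let make_table methods =
  methods
  |> List.map (fun (name, f) -> (hash_variant name, f))
  |> List.sort (fun (a, _) (b, _) -> compare a b)
  |> Array.of_list

(* Bisection on the sorted hashes: the number of probes grows
   logarithmically with the size of the table. *)
let lookup table h =
  let rec go lo hi =
    if lo >= hi then None
    else
      let mid = (lo + hi) / 2 in
      let k, f = table.(mid) in
      if h = k then Some f
      else if h < k then go lo mid
      else go (mid + 1) hi
  in
  go 0 (Array.length table)

let table =
  make_table
    [ ("area", (fun () -> "area called"));
      ("draw", (fun () -> "draw called"));
      ("move", (fun () -> "move called")) ]

let () =
  match lookup table (hash_variant "draw") with
  | Some f -> print_endline (f ())
  | None -> print_endline "no such method"
```

Because the hashes are fixed at compile time, the sorted order is the
same in every program, which is what makes such a table cheap to build.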

Now concerning the risks of name conflicts. The main point of
polymorphic variants is that there is only a conflict if the two tags
appear in the same type. And logically the type should stay small.
If you want to put all GLenum's inside the same type, then you may
well end up with conflicts. But what LablGL shows is that in practice
only a small number of tags are used together. So if you can partition
your set of tags so that each type has at most 64 tags, then you get
a conflict probability of less than 1 in a million for each type. This
seems safe enough. But if you have one type with 2000 tags, then the
probability is about 1 in a thousand. Not that much, but it can happen.
(p(n) is approximately n*n / 2**32)
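
The two figures can be checked numerically. The sketch below compares
the n*n / 2**32 estimate with the usual birthday-problem product,
under the assumption that tag hashes behave like uniformly random
31-bit values. The function names are made up.

```ocaml
(* The estimate quoted above: about n^2 / 2^32 for n tags in one type. *)
let p_estimate n = float_of_int (n * n) /. (2. ** 32.)

(* Birthday-problem probability that n uniformly random 31-bit hashes
   are not all distinct: 1 - prod_{i<n} (1 - i / 2^31). *)
let p_birthday n =
  let slots = 2. ** 31. in
  let all_distinct = ref 1. in
  for i = 0 to n - 1 do
    all_distinct := !all_distinct *. (1. -. float_of_int i /. slots)
  done;
  1. -. !all_distinct

let () =
  List.iter
    (fun n ->
      Printf.printf "n = %4d  estimate %.2e  birthday %.2e\n"
        n (p_estimate n) (p_birthday n))
    [ 64; 2000 ]
```

The estimate is the first-order term of the birthday bound, n(n-1)/2
pairs each colliding with probability 1/2^31, so the two agree closely
for the sizes discussed here.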

Jacques Garrigue


* Re: [Caml-list] Re: Hash clash in polymorphic variants
  2008-01-15  3:36                     ` [Caml-list] " Jacques Garrigue
@ 2008-01-15  4:59                       ` Jon Harrop
  2008-01-15  9:01                         ` Jacques Garrigue
  0 siblings, 1 reply; 37+ messages in thread
From: Jon Harrop @ 2008-01-15  4:59 UTC (permalink / raw)
  To: caml-list

On Tuesday 15 January 2008 03:36:21 Jacques Garrigue wrote:
> Unfortunately, this would make marshalling between different programs
> much more complicated...

Do people marshal polymorphic variants between different programs?

> Another advantage of knowing the hash function at compile time is
> that you can generate efficient code for pattern matching. Since you
> already know the ordering of tags, it is easy to generate a decision
> tree. I didn't check very recently about efficiency for polymorphic
> variants, but the depth of the decision tree is logarithmic in the
> number of tags involved in the pattern matching, and if you can keep
> it below 3 or 4 (about 10 tags) you can be actually faster than a
> jump table.

For 3-16 tags on AMD64, jump tables (ordinary variants) are 2x slower than 
decision trees (polymorphic variants) when branches are taken at random. 
However, jump tables are consistently up to 2x faster when a single branch is 
taken repeatedly. So caching jump tables is more effective at run-time 
optimizing pattern matches over ordinary variants than branch prediction is 
at optimizing decision trees for pattern matches over polymorphic variants.

So the advantage of a decision tree is probably insignificant on real code 
because it will lie between these two extremes.

> Now concerning the risks of name conflicts. The main point of
> polymorphic variants is that there is only a conflict if the two tags
> appear in the same type. And logically the type should stay small.
> If you want to put all GLenum's inside the same type, then you may
> well end up with conflicts. But what LablGL shows is that in practice
> only a small number of tags are used together.

Can LablGL's design support OpenGL extensions?

-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?e


* Re: [Caml-list] Re: Hash clash in polymorphic variants
  2008-01-15  4:59                       ` Jon Harrop
@ 2008-01-15  9:01                         ` Jacques Garrigue
  2008-01-15 18:17                           ` Jon Harrop
  0 siblings, 1 reply; 37+ messages in thread
From: Jacques Garrigue @ 2008-01-15  9:01 UTC (permalink / raw)
  To: jon; +Cc: caml-list

From: Jon Harrop <jon@ffconsultancy.com>
> On Tuesday 15 January 2008 03:36:21 Jacques Garrigue wrote:
> > Unfortunately, this would make marshalling between different programs
> > much more complicated...
> 
> Do people marshal polymorphic variants between different programs?

Do people marshal data between different programs (or different
versions of the same program)?

> For 3-16 tags on AMD64, jump tables (ordinary variants) are 2x slower than 
> decision trees (polymorphic variants) when branches are taken at random. 
> However, jump tables are consistently up to 2x faster when a single branch is 
> taken repeatedly. So caching jump tables is more effective at run-time 
> optimizing pattern matches over ordinary variants than branch prediction is 
> at optimizing decision trees for pattern matches over polymorphic variants.
> 
> So the advantage of a decision tree is probably insignificant on real code 
> because it will lie between these two extremes.

Since the goal was never to be faster than ordinary variants, but just
obtain comparable speed, this seems good :-)

> > Now concerning the risks of name conflicts. The main point of
> > polymorphic variants is that there is only a conflict if the two tags
> > appear in the same type. And logically the type should stay small.
> > If you want to put all GLenum's inside the same type, then you may
> > well end up with conflicts. But what LablGL shows is that in practice
> > only a small number of tags are used together.
> 
> Can LablGL's design support OpenGL extensions?

I'm not sure what this means.
Since LablGL was coded by hand, adding extensions would mean modifying
it.
One might want to add a way to detect whether an extension is
available or not, but making it static does not seem a good idea: one
wouldn't even be able to compile code using an extension that is not
available.
Also, one might want to make code generation automatic, particularly
for C wrappers, to allow adding cases to functions easily. This should
be doable, but there is no infrastructure for that currently
(using CPP macros was simpler to start with...)

Jacques Garrigue



* Re: [Caml-list] Re: Hash clash in polymorphic variants
  2008-01-15  9:01                         ` Jacques Garrigue
@ 2008-01-15 18:17                           ` Jon Harrop
  2008-01-15 19:20                             ` Gerd Stolpmann
                                               ` (2 more replies)
  0 siblings, 3 replies; 37+ messages in thread
From: Jon Harrop @ 2008-01-15 18:17 UTC (permalink / raw)
  To: caml-list

On Tuesday 15 January 2008 09:01:42 Jacques Garrigue wrote:
> From: Jon Harrop <jon@ffconsultancy.com>
> > On Tuesday 15 January 2008 03:36:21 Jacques Garrigue wrote:
> > > Unfortunately, this would make marshalling between different programs
> > > much more complicated...
> >
> > Do people marshal polymorphic variants between different programs?
>
> Do people marshal data between different programs (or different
> versions of the same program)?

I suspect OCaml's marshalling is used almost entirely between same versions of 
the same programs.

In particular, I was advised against marshalling data between different 
versions of the same program because this is unsafe (not just type safety but 
the format used by Marshal is not ossified).

> > So the advantage of a decision tree is probably insignificant on real
> > code because it will lie between these two extremes.
>
> Since the goal was never to be faster than ordinary variants, but just
> obtain comparable speed, this seems good :-)

Yes. This would probably also work ok if you used a symbol table to store 
exact identifier names rather than just a hash. The symbol's index in the 
table would serve the same purpose as the hash.
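
A minimal sketch of such a symbol table, with hypothetical names
(symtab, create, intern). Each distinct identifier gets the next free
index, so two different names can never share a representation:

```ocaml
(* Symbol table mapping each distinct name to a small integer index. *)
type symtab = {
  ids : (string, int) Hashtbl.t;  (* name -> index *)
  mutable next : int;             (* next unused index *)
}

let create () = { ids = Hashtbl.create 64; next = 0 }

(* Return the index of [name], allocating a fresh one on first use. *)
let intern t name =
  match Hashtbl.find_opt t.ids name with
  | Some i -> i
  | None ->
      let i = t.next in
      Hashtbl.add t.ids name i;
      t.next <- i + 1;
      i

let () =
  let t = create () in
  let a = intern t "Eric_Cooper" in
  let b = intern t "azdwbie" in
  (* These two names were reported earlier in the thread to share a
     hash; with a symbol table they get different indices. *)
  assert (a <> b);
  assert (intern t "Eric_Cooper" = a)
```

The indices depend on the order in which names are interned, so
separately compiled units would have to agree on one table, which is
the link-time and marshalling cost raised elsewhere in this thread.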

> > > Now concerning the risks of name conflicts. The main point of
> > > polymorphic variants is that there is only a conflict if the two tags
> > > appear in the same type. And logically the type should stay small.
> > > If you want to put all GLenum's inside the same type, then you may
> > > well end up with conflicts. But what LablGL shows is that in practice
> > > only a small number of tags are used together.
> >
> > Can LablGL's design support OpenGL extensions?
>
> I'm not sure what this means.

OpenGL has an extension mechanism that can be queried at run-time. If a given 
extension is available then you can do things that you could not do before, 
such as pass a GLenum to a function that might not have accepted it without 
the extension.

> Since LablGL was coded by hand, adding extensions would mean modifying
> it.

Exactly, that is a limitation of LablGL's design and, therefore, I think it 
was quite wrong of you to claim "what LablGL shows is that in practice only a 
small number of tags are used together" when LablGL's use of small, closed 
sum types is actually a design limitation that would not be there if it 
supported all of OpenGL, i.e. the extension mechanism.
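
To make the point concrete, here is a small sketch of how a wider tag type 
combined with a run-time check might look. All names here (core_target, 
ext_target, extension_available, describe_target) are hypothetical and are 
not LablGL's API; the extension string is used purely as an illustration:

```ocaml
(* Tags accepted by the core API (hypothetical names, not LablGL's). *)
type core_target = [ `Texture_1d | `Texture_2d ]

(* A superset that also admits a tag provided by an extension. *)
type ext_target = [ core_target | `Texture_3d ]

(* Stand-in for a run-time extension query; a real binding would ask the
   OpenGL implementation instead of consulting a list. *)
let extension_available (available : string list) (name : string) : bool =
  List.mem name available

(* Accept any tag of the wider type, but reject the extension tag at run
   time when the extension is missing. *)
let describe_target (available : string list) (t : ext_target)
    : (string, string) result =
  match t with
  | #core_target as c ->
      Ok (match c with `Texture_1d -> "1D" | `Texture_2d -> "2D")
  | `Texture_3d ->
      if extension_available available "GL_EXT_texture3D" then Ok "3D"
      else Error "GL_EXT_texture3D not available"
```

The wider the accepted type becomes, the more tags appear together in one 
type, which is where the question of hash clashes becomes relevant.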

Incidentally, Xavier made a statement based upon what appears to me to be a 
similar logical error in the CUFP notes from last year that I read recently:

  "On the other hand, certain features seem somewhat unsurprisingly to be 
unimportant to industrial users. GUI toolkits are not an issue, because GUIs 
tend to be built using more mainstream tools; it seems that different 
competencies are involved in Caml and GUI development and companies "don't 
want to squander their precious Caml expertise aligning pixels". Rich 
libraries don't seem to matter in general; presumably companies are happy to 
develop these in-house. And no-one wants yet another IDE; the applications of 
interest are usually built using a variety of languages and tools anyway, so 
consistency of development environment is a lost cause."
- http://cufp.galois.com/CUFP-2007-Report.pdf (page 3)

Xavier appears to have taken the biased sample of industrialists who already 
use OCaml despite its limitations and has drawn the conclusion that these 
limitations are not important to industrialists. I was really horrified to 
see this because, in my experience, companies are turning away from OCaml in 
droves because of exactly the limitations Xavier enumerated and I for one 
would dearly love to see them fixed.

OCaml will continue to go from strength to strength regardless but its uptake 
would be vastly faster if these problems were addressed. To take them point by 
point:

. GUIs are incredibly important (LablGTK is the world's favorite OCaml 
library!) and tens of thousands of OCaml programmers are crying out for 
proper LablGTK documentation as a first priority, many of whom are in 
industry.

. Rich libraries are incredibly important and OCaml has the potential to 
become a hugely successful commercial platform where people can buy and sell 
cross-platform libraries but OCaml needs support for shared run-time DLLs (or 
something equivalent) before this can happen.

. An easy-to-use IDE would be an excellent way to kick-start people learning 
OCaml even if an industrial-strength IDE is intractable.

> Also, one might want to make code generation automatic, particularly
> for C wrappers, to allow adding cases to functions easily. This should
> be doable, but there is no infrastructure for that currently
> (using CPP macros was simpler to start with...)

Yes. A better FFI could also be enormously beneficial. Improving upon OCaml's 
FFI is one of the most alluring aspects of a reimplementation on LLVM, IMHO.

-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?e


* Re: [Caml-list] Re: Hash clash in polymorphic variants
  2008-01-15 18:17                           ` Jon Harrop
@ 2008-01-15 19:20                             ` Gerd Stolpmann
  2008-01-15 22:04                               ` Jon Harrop
  2008-01-18  5:19                               ` Kuba Ober
  2008-01-16  3:26                             ` Jacques GARRIGUE
  2008-01-16 10:50                             ` Richard Jones
  2 siblings, 2 replies; 37+ messages in thread
From: Gerd Stolpmann @ 2008-01-15 19:20 UTC (permalink / raw)
  To: Jon Harrop; +Cc: caml-list

Jon Harrop wrote:
> Incidentally, Xavier made a statement based upon what appears to me to be a 
> similar logical error in the CUFP notes from last year that I read recently:
> 
>   "On the other hand, certain features seem somewhat unsurprisingly to be 
> unimportant to industrial users. GUI toolkits are not an issue, because GUIs 
> tend to be built using more mainstream tools; it seems that different 
> competencies are involved in Caml and GUI development and companies "don't 
> want to squander their precious Caml expertise aligning pixels". Rich 
> libraries don't seem to matter in general; presumably companies are happy to 
> develop these in-house. And no-one wants yet another IDE; the applications of 
> interest are usually built using a variety of languages and tools anyway, so 
> consistency of development environment is a lost cause."
> - http://cufp.galois.com/CUFP-2007-Report.pdf (page 3)

An interesting thesis, right? Although I wouldn't go that far, there is
some truth in it. The point, IMHO, is that OCaml will never replace
other languages in the sense that a company who uses language X for
years in product Y rewrites the code in OCaml. For what reason? The
company would run into big educational problems (learning a new
environment), would have high initial costs, and it is questionable
whether the result is better. Of course, for rewriting existing software
the company would profit from GUIs, from rich libraries etc. But I think
this does not happen.

What I see, however, is that OCaml is used where new software is
developed, in ambitious projects that start from scratch. It is simply a
fact that GUIs are not crucial in these areas (at least for the
companies I know). GUIs are seen as standard tools where nothing new
happens where OCaml could shine. If you need one, you develop it in one
of the mainstream languages.

IDEs aren't interesting right now because OCaml is mainly used by
(computer & related) scientists (and I include scientists working for
companies outside academia). IDEs are nice for beginners and for people
who do not want to know what's happening inside. They are not
interesting for companies that invent completely new types of products,
because they've hired experts that can live without (and want to live
without).

> Xavier appears to have taken the biased sample of industrialists who already 
> use OCaml despite its limitations and has drawn the conclusion that these 
> limitations are not important to industrialists. I was really horrified to 
> see this because, in my experience, companies are turning away from OCaml in 
> droves because of exactly the limitations Xavier enumerated and I for one 
> would dearly love to see them fixed.

Which companies?

I fully understand that OCaml is not well-suited for the average
company. But it is not because of missing GUIs and IDEs, but because the
language itself is too ambitious. Sorry to say that, but this is not the
mainstream and it will never be.

(I have a good friend who works for an average company, so I know what
I'm talking about. They program business apps for a commercial platform
from CA. A horrible language, but they can manage it. They are experts
for the models they use, and simply take a platform from industry.)

> OCaml will continue to go from strength to strength regardless but its uptake 
> would be vastly faster if these problems are addressed. To take them point by 
> point:
> 
> . GUIs are incredibly important (LablGTK is the world's favorite OCaml 
> library!) and tens of thousands of OCaml programmers are crying out for 
> proper LablGTK documentation as a first priority, many of whom are in 
> industry.

See this as opportunity for your next book :-)

GTK is already poorly documented, so this is not only the problem of the
LablGTK creators. Nevertheless, GTK is widely used. I don't think it's a
real problem.

> . Rich libraries are incredibly important and OCaml has the potential to 
> become a hugely successful commercial platform where people can buy and sell 
> cross-platform libraries but OCaml needs support for shared run-time DLLs (or 
> something equivalent) before this can happen.

Do you dream or what?

I don't think that selling libraries in binary form is that important...
It is difficult anyway to do that, and why do you expect you could be
successful in a niche language? As a customer I would demand to get the
source code - to lower the risks of the investment into a small
platform.

> . An easy-to-use IDE would be an excellent way to kick-start people learning 
> OCaml even if an industrial-strength IDE is intractable.
> 
> > Also, one might want to make code generation automatic, particularly
> > for C wrappers, to allow adding cases to functions easily. This should
> > be doable, but there is no infrastructure for that currently
> > (using CPP macros was simpler to start with...)
> 
> Yes. A better FFI could also be enormously beneficial. Improving upon OCaml's 
> FFI is one of the most alluring aspects of a reimplementation on LLVM, IMHO.

A general question to you: When you are complaining about so many
aspects of OCaml, why don't you invest time & money to fix them? We
would all be very thankful.

Gerd
-- 
------------------------------------------------------------
Gerd Stolpmann * Viktoriastr. 45 * 64293 Darmstadt * Germany 
gerd@gerd-stolpmann.de          http://www.gerd-stolpmann.de
Phone: +49-6151-153855                  Fax: +49-6151-997714
------------------------------------------------------------


* Re: [Caml-list] Re: Hash clash in polymorphic variants
  2008-01-15 19:20                             ` Gerd Stolpmann
@ 2008-01-15 22:04                               ` Jon Harrop
  2008-01-16 13:48                                 ` Kuba Ober
  2008-01-18  5:33                                 ` Kuba Ober
  2008-01-18  5:19                               ` Kuba Ober
  1 sibling, 2 replies; 37+ messages in thread
From: Jon Harrop @ 2008-01-15 22:04 UTC (permalink / raw)
  To: Gerd Stolpmann; +Cc: caml-list

On Tuesday 15 January 2008 19:20:09 Gerd Stolpmann wrote:
> Jon Harrop wrote:
> > Incidentally, Xavier made a statement based upon what appears to me to be
> > a similar logical error in the CUFP notes from last year that I read
> > recently:
> >
> >   "On the other hand, certain features seem somewhat unsurprisingly to be
> > unimportant to industrial users. GUI toolkits are not an issue, because
> > GUIs tend to be built using more mainstream tools; it seems that
> > different competencies are involved in Caml and GUI development and
> > companies "don't want to squander their precious Caml expertise aligning
> > pixels". Rich libraries don't seem to matter in general; presumably
> > companies are happy to develop these in-house. And no-one wants yet
> > another IDE; the applications of interest are usually built using a
> > variety of languages and tools anyway, so consistency of development
> > environment is a lost cause."
> > - http://cufp.galois.com/CUFP-2007-Report.pdf (page 3)
>
> An interesting thesis, right? Although I wouldn't get that far, there is
> some truth in it. The point, IMHO, is that OCaml will never replace
> other languages in the sense that a company who uses language X for
> years in product Y rewrites the code in OCaml. For what reason? The
> company would run into big educational problems (learning a new
> environment), would have high initial costs, and it is questionable
> whether the result is better. Of course, for rewriting existing software
> the company would profit from GUIs, from rich libraries etc. But I think
> this does not happen.

I believe many more companies would migrate to OCaml if it had well-documented 
GUI APIs and rich libraries. Indeed, Microsoft are gambling on people 
migrating to F# in exactly the same way.

> What I see, however, is that OCaml is used where new software is
> developed, in ambitious projects that start from scratch. It is simply a
> fact that GUIs are not crucial in these areas (at least for the
> companies I know).

But the companies you know were already self-selected to be the ones who do 
not care about OCaml's limitations, so it is a biased sample?

> GUIs are seen as standard tools where nothing new happens where OCaml could
> shine.

I have no doubt that OCaml would shine in GUIs just as it does elsewhere.

> If you need one, you develop it in one of the mainstream languages. 

Actually I would either choose F# on Windows or give up on any other OS.

> IDEs aren't interesting right now because OCaml is mainly used by
> (computer & related) scientists (and I include scientists working for
> companies outside academia).

Many of the world's most sophisticated IDEs are targeted solely at technical 
users. Look at Mathematica's notebook interface, for example. I believe that 
is a great example to aspire to.

> IDEs are nice for beginners and for people 
> who do not want to know what's happening inside. They are not
> interesting for companies that invent completely new types of products,
> because they've hired experts that can live without (and want to live
> without).

I couldn't disagree more. Pharmaceuticals are a trillion dollar industry where 
many scientists would benefit enormously from being able to use a tool like 
OCaml without knowing anything about how it works in order to create their 
next generation products (drugs). The same is true of most industries where 
scientists and engineers work; there are many such industries and they 
are extremely profitable.

> > Xavier appears to have taken the biased sample of industrialists who
> > already use OCaml despite its limitations and has drawn the conclusion
> > that these limitations are not important to industrialists. I was really
> > horrified to see this because, in my experience, companies are turning
> > away from OCaml in droves because of exactly the limitations Xavier
> > enumerated and I for one would dearly love to see them fixed.
>
> Which companies?

General Electric, Microsoft, Wolfram Research and various bioinformatics 
institutes for example.

Look at General Electric. They build some of the world's most sophisticated 
medical scanners and that large-scale embedded market is ideal for using 
languages like OCaml for its high-performance numerics because you have 
complete control over the environment. However, they desperately need GUI 
toolkits to provide a front-end for users.

I'd like to know what Alex Barretta makes of this, for example. His glass 
cutters must have the same characteristics in this respect...

> I fully understand that OCaml is not well-suited for the average
> company. But it is not because of missing GUIs and IDEs, but because the
> language itself is too ambitious. Sorry to say that, but this is not the
> mainstream and it will never be.

I still think OCaml has the best chance of any FPL to become a mainstream tool 
in technical computing.

Indeed, I recently tried to quantify how far OCaml has already come and I 
believe it is already as popular as C# among technical users, for example. 
That is quite an achievement!

> (I have a good friend who works for an average company, so I know what
> I'm talking of. They program business apps for a commercial platform
> from CA. A horrible language, but they can manage it. They are experts
> for the models they use, and simply take a platform from industry.)

Yes. I do not believe OCaml will make significant inroads into displacing 
COBOL and relatives but there are a lot of other big opportunities out there 
for such a language.

> > OCaml will continue to go from strength to strength regardless but its
> > uptake would be vastly faster if these problems are addressed. To take
> > them point by point:
> >
> > . GUIs are incredibly important (LablGTK is the world's favorite OCaml
> > library!) and tens of thousands of OCaml programmers are crying out for
> > proper LablGTK documentation as a first priority, many of whom are in
> > industry.
>
> See this as opportunity for your next book :-)

Indeed. Even after the announcement that Microsoft are productizing F#, OCaml 
for Scientists continues to be our biggest earning product. Consequently, I 
am very tempted to write a "sequel" that covers many of the important aspects 
of the language that I did not cover in the original, including GUI 
programming, XML, parallelism and so forth. If anyone has ideas for subjects 
they would like to see covered, please e-mail me!

> GTK is already poorly documented, so this is not only the problem of the
> LablGTK creators. Nevertheless, GTK is widely used. I don't think it's a
> real problem.

Yes. I'm really not sure what the best course of action would be here. Would 
Qt bindings be preferable? Is it worth the hassle? How long would it be 
before they reached the maturity of GTK?

I think we would really need more high-profile open source programs with 
hundreds of thousands of users testing the bindings (as GTK has had) before 
you could really gamble on it.

> > . Rich libraries are incredibly important and OCaml has the potential to
> > become a hugely successful commercial platform where people can buy and
> > sell cross-platform libraries but OCaml needs support for shared run-time
> > DLLs (or something equivalent) before this can happen.
>
> Do you dream or what?

One man's reality is another man's dream. :-)

> I don't think that selling libraries in binary form is that important...

If it were possible then it would be important to me because I could earn a 
living from it. I'm sure the same is true for many other people.

> It is difficult anyway to do that, and why do you expect you could be
> successful in a niche language?

Because I already am. :-)

> As customer I would demand to get the source code - to lower the risks of
> the investment into a small platform.

Nobody ever got fired for buying IBM.

Historically, we've made a lot more money from sales of binaries than from 
sales of source code. Consequently, I would be more than willing to gamble on 
selling shared run-time DLLs for OCaml users if it were possible.

> > Yes. A better FFI could also be enormously beneficial. Improving upon
> > OCaml's FFI is one of the most alluring aspects of a reimplementation on
> > LLVM, IMHO.
>
> A general question to you: When you are complaining about so many
> aspects of OCaml, why don't you invest time & money to fix them?

An excellent idea!

So I wrote to Xavier Leroy and asked about contributing to INRIA's OCaml 
distribution. Xavier explained that French copyright law makes it 
prohibitively difficult for him to include my code contributions so this will 
never be possible. The best I could think of was to suggest that they make it 
possible for users to pay to get certain bugs fixed or functionality 
implemented. I'm not sure that will happen though.

I wrote to Pierre Weis and asked what the likelihood of getting some tweaks 
into the language was. He said that it is unlikely I could even get a "try .. 
finally" construct put in.
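
For readers unfamiliar with the idiom, the effect of such a construct can be 
approximated by an ordinary function. This is only a sketch: try_finally is 
a made-up name, and it shows the usual workaround rather than any proposed 
language change:

```ocaml
(* Run [body], then always run [cleanup], whether [body] returned
   normally or raised. If [body] raised, the exception is re-raised
   after the cleanup. *)
let try_finally (body : unit -> 'a) (cleanup : unit -> unit) : 'a =
  let result =
    try body ()
    with e ->
      cleanup ();
      raise e
  in
  cleanup ();
  result

(* Example: the counter is restored even if the body fails. *)
let depth = ref 0

let with_depth (f : unit -> 'a) : 'a =
  incr depth;
  try_finally f (fun () -> decr depth)
```

Later releases of the standard library offer Fun.protect ~finally for the 
same purpose; the sketch above does not preserve the original backtrace 
when it re-raises.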

So there's no way I can improve INRIA's OCaml distribution. Next, I thought 
perhaps a complete fork of OCaml would be a viable alternative. This is 
complicated by OCaml's license which requires variants to be distributed with 
the core sources intact and everything else as patches to it. This is not an 
insurmountable problem, of course, you just distribute the core and a giant 
autogenerated patch instead. So I asked Sylvain about getting Debian to adopt 
the fork rather than INRIA's upstream. He said this will almost certainly not 
happen.

So I can't develop or contribute to INRIA's OCaml implementation and I can't 
fork it without starting with zero users. What about reimplementing it?

So I wondered what I could build upon that would make this as painless as 
possible. This led me to the Smoke VM, Mono, the JVM and LLVM. I enumerated 
each of these in turn and came to the conclusion that LLVM is preferable, not 
least because several other people had already drawn the same conclusion and 
started work on similar projects themselves.

That's when I wrote my 100LOC test program calling LLVM from OCaml. Since 
then, Gordon has been working hard on the OCaml bindings and example 
programs, which are now nothing short of incredible. Dozens of people have 
e-mailed me expressing their desire to contribute to such an effort.

This will take time, of course, but I believe it is the future of the OCaml 
language.

-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?e


* Re: [Caml-list] Re: Hash clash in polymorphic variants
  2008-01-15 22:04                               ` Jon Harrop
@ 2008-01-16 13:48                                 ` Kuba Ober
  2008-01-16 15:02                                   ` Dario Teixeira
  2008-01-18  5:33                                 ` Kuba Ober
  1 sibling, 1 reply; 37+ messages in thread
From: Kuba Ober @ 2008-01-16 13:48 UTC (permalink / raw)
  To: caml-list

On Tuesday 15 January 2008, Jon Harrop wrote:
> On Tuesday 15 January 2008 19:20:09 Gerd Stolpmann wrote:
> > Jon Harrop wrote:
> > > Incidentally, Xavier made a statement based upon what appears to me to
> > > be a similar logical error in the CUFP notes from last year that I read
> > > recently:
> > >
> > >   "On the other hand, certain features seem somewhat unsurprisingly to
> > > be unimportant to industrial users. GUI toolkits are not an issue,
> > > because GUIs tend to be built using more mainstream tools; it seems
> > > that different competencies are involved in Caml and GUI development
> > > and companies "don't want to squander their precious Caml expertise
> > > aligning pixels". Rich libraries don't seem to matter in general;
> > > presumably companies are happy to develop these in-house. And no-one
> > > wants yet another IDE; the applications of interest are usually built
> > > using a variety of languages and tools anyway, so consistency of
> > > development environment is a lost cause."
> > > - http://cufp.galois.com/CUFP-2007-Report.pdf (page 3)
> >
> > An interesting thesis, right? Although I wouldn't get that far, there is
> > some truth in it. The point, IMHO, is that OCaml will never replace
> > other languages in the sense that a company who uses language X for
> > years in product Y rewrites the code in OCaml. For what reason? The
> > company would run into big educational problems (learning a new
> > environment), would have high initial costs, and it is questionable
> > whether the result is better. Of course, for rewriting existing software
> > the company would profit from GUIs, from rich libraries etc. But I think
> > this does not happen.
>
> I believe many more companies would migrate to OCaml if it had
> well-documented GUI APIs and rich libraries. Indeed, Microsoft are gambling
> on people migrating to F# in exactly the same way.
>
> > What I see, however, is that OCaml is used where new software is
> > developed, in ambitious projects that start from scratch. It is simply a
> > fact that GUIs are not crucial in these areas (at least for the
> > companies I know).
>
> But the companies you know were already self-selected to be the ones who do
> not care about OCaml's limitations, so it is a biased sample?
>
> > GUIs are seen as standard tools where nothing new happens where OCaml
> > could shine.
>
> I have no doubt that OCaml would shine in GUIs just as it does elsewhere.

In fact, after some initial thinking and looking around it seems that the 
only "sane" GUI for OCaml, at this time, is Qt, but someone has to write a 
machine translator to port it from C++ to OCaml. Qt is reasonably well 
designed, and has the richest feature set of all GUI toolkits, even if you 
combined all the competition and treated it as one "other" toolkit.

Using Qt with some machine (or not!) generated bindings is just a huge 
waste -- it's a nice, clean design, which has recently been tweaked for 
performance (some Qt4 apps start in 50% of the time just by having been 
ported to Qt4 from Qt3).

Cheers, Kuba



* Re: [Caml-list] Re: Hash clash in polymorphic variants
  2008-01-16 13:48                                 ` Kuba Ober
@ 2008-01-16 15:02                                   ` Dario Teixeira
  2008-01-16 19:00                                     ` Jon Harrop
  2008-01-17 13:09                                     ` Kuba Ober
  0 siblings, 2 replies; 37+ messages in thread
From: Dario Teixeira @ 2008-01-16 15:02 UTC (permalink / raw)
  To: Kuba Ober, caml-list

Hi,

> In fact, after some initial thinking and looking around it seems that the 
> only "sane" GUI for OCaml, at this time, is Qt, but someone has to write a 
> machine translator to port it from C++ to OCaml. Qt is reasonably well 
> designed, and has the richest feature set of all GUI toolkits, even if you 
> combined all the competition and treated it as one "other" toolkit.
> 
> Using Qt with some machine (or not!) generated bindings is just a huge 
> waste -- it's a nice, clean design, which has recently been tweaked for 
> performance (some Qt4 apps start in 50% of the time just by having been 
> ported to Qt4 from Qt3).

I'm inclined to agree.  I would even go as far as saying that the lack of
Qt bindings is perhaps the biggest open sore as far as Ocaml library support
is concerned.

The guys at Trolltech, however, seem quite keen on having Qt on as many
platforms as possible (Qt-Jambi, which brings Qt to the JVM is one of their
products).  Couldn't this whole auto-generation of bindings be made easier
if they got involved?  I am sure they already have plenty of tools in
place to facilitate it.  Even if they were not to commit actual manpower
to the effort, they might still be able to help.

And incidentally, the aforementioned Qt-Jambi, together with the Ocamljava
project might provide a last-resort solution in the absence of native bindings.
Another possibility might be the Qyoto/Kimono project (which brings Qt/KDE
into .net) together with the OcamlIL project (if it's still alive).  You would
then use Mono to run Ocaml programmes.

cheers,
Dario



* Re: [Caml-list] Re: Hash clash in polymorphic variants
  2008-01-16 15:02                                   ` Dario Teixeira
@ 2008-01-16 19:00                                     ` Jon Harrop
  2008-01-17 13:09                                     ` Kuba Ober
  1 sibling, 0 replies; 37+ messages in thread
From: Jon Harrop @ 2008-01-16 19:00 UTC (permalink / raw)
  To: caml-list

On Wednesday 16 January 2008 15:02:54 Dario Teixeira wrote:
> I'm inclined to agree.  I would even go as far as saying that the lack of
> Qt bindings is perhaps the biggest open sore as far as Ocaml library
> support is concerned.

As I understand it, OCaml's FFI makes writing Qt bindings an enormous 
undertaking which is why we don't have any.

I'm happy with GTK for now and would rather see OpenGL 2 bindings instead.

> The guys at Trolltech, however, seem quite keen on having Qt on as many
> platforms as possible (Qt-Jambi, which brings Qt to the JVM is one of their
> products).  Couldn't this whole auto-generation of bindings be made easier
> if they got involved?  I am sure they already have plenty of tools in
> place to facilitate it.  Even if they were not to commit actual manpower
> to the effort, they might still be able to help.

I found Trolltech's customer support awful as a customer so I very much doubt 
they will go out of their way to help a really obscure virgin corner of the 
Qt market. That was a few years ago though.

> And incidentally, the aforementioned Qt-Jambi, together with the Ocamljava
> project might provide a last-resort solution in the absence of native
> bindings. Another possibility might be the Qyoto/Kimono project (which
> brings Qt/KDE into .net) together with the OcamlIL project (if it's still
> alive).  You would then use Mono to run Ocaml programmes.

I evaluated various such options recently and decided that Mono is truly awful 
(very poorly written, unreliable and slow) and LLVM is absolutely superb 
(extremely well-written C++ with complete native OCaml bindings!). Moreover, 
Mono appears to have no future in its current form whereas LLVM has serious 
backers and is improving at a tremendous rate.

Even if you don't want to implement a whole new language or backend, using 
LLVM's JIT compilation for code generation has great potential for OCaml, 
e.g. regexps. I highly recommend giving it a play!

-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?e


* Re: [Caml-list] Re: Hash clash in polymorphic variants
  2008-01-16 15:02                                   ` Dario Teixeira
  2008-01-16 19:00                                     ` Jon Harrop
@ 2008-01-17 13:09                                     ` Kuba Ober
  1 sibling, 0 replies; 37+ messages in thread
From: Kuba Ober @ 2008-01-17 13:09 UTC (permalink / raw)
  To: caml-list

> > Using Qt with some machine (or not!) generated bindings is just a huge
> > waste -- it's a nice, clean design, which has recently been tweaked for
> > performance (some Qt4 apps start in 50% of the time just by having been
> > ported to Qt4 from Qt3).
>
> I'm inclined to agree.  I would even go as far as saying that the lack of
> Qt bindings is perhaps the biggest open sore as far as Ocaml library
> support is concerned.
>
> The guys at Trolltech, however, seem quite keen on having Qt on as many
> platforms as possible (Qt-Jambi, which brings Qt to the JVM is one of their
> products).  Couldn't this whole auto-generation of bindings be made easier
> if they got involved?

At some point, in order to "naturally" use Qt and benefit from its 
performance, the machine translation will be easier than any bindings you 
could think of. IMHO, of course. Qt's code itself will become smaller in 
Ocaml - I've hacked at porting QObject, and so far I've got the line count to 
50% of Trolltech's. And I'm a total noob.

Cheers, Kuba



* Re: [Caml-list] Re: Hash clash in polymorphic variants
  2008-01-15 22:04                               ` Jon Harrop
  2008-01-16 13:48                                 ` Kuba Ober
@ 2008-01-18  5:33                                 ` Kuba Ober
  1 sibling, 0 replies; 37+ messages in thread
From: Kuba Ober @ 2008-01-18  5:33 UTC (permalink / raw)
  To: caml-list

On Tuesday 15 January 2008, Jon Harrop wrote:
> On Tuesday 15 January 2008 19:20:09 Gerd Stolpmann wrote:
> > Jon Harrop wrote:
> > > Incidentally, Xavier made a statement based upon what appears to me to
> > > be a similar logical error in the CUFP notes from last year that I read
> > > recently:
> > >
> > >   "On the other hand, certain features seem somewhat unsurprisingly to
> > > be unimportant to industrial users. GUI toolkits are not an issue,
> > > because GUIs tend to be built using more mainstream tools; it seems
> > > that different competencies are involved in Caml and GUI development
> > > and companies "don't want to squander their precious Caml expertise
> > > aligning pixels". Rich libraries don't seem to matter in general;
> > > presumably companies are happy to develop these in-house. And no-one
> > > wants yet another IDE; the applications of interest are usually built
> > > using a variety of languages and tools anyway, so consistency of
> > > development environment is a lost cause."
> > > - http://cufp.galois.com/CUFP-2007-Report.pdf (page 3)
> >
> > An interesting thesis, right? Although I wouldn't go that far, there is
> > some truth in it. The point, IMHO, is that OCaml will never replace
> > other languages in the sense that a company who uses language X for
> > years in product Y rewrites the code in OCaml. For what reason? The
> > company would run into big educational problems (learning a new
> > environment), would have high initial costs, and it is questionable
> > whether the result is better. Of course, for rewriting existing software
> > the company would profit from GUIs, from rich libraries etc. But I think
> > this does not happen.
>
> I believe many more companies would migrate to OCaml if it had
> well-documented GUI APIs and rich libraries. Indeed, Microsoft are gambling
> on people migrating to F# in exactly the same way.
>
> > What I see, however, is that OCaml is used where new software is
> > developed, in ambitious projects that start from scratch. It is simply a
> > fact that GUIs are not crucial in these areas (at least for the
> > companies I know).
>
> But the companies you know were already self-selected to be the ones who do
> not care about OCaml's limitations, so it is a biased sample?
>
> > GUIs are seen as standard tools where nothing new happens where OCaml
> > could shine.
>
> I have no doubt that OCaml would shine in GUIs just as it does elsewhere.
>
> > If you need one, you develop it in one of the mainstream languages.
>
> Actually I would either choose F# on Windows or give up on any other OS.
>
> > IDEs aren't interesting right now because OCaml is mainly used by
> > (computer & related) scientists (and I include scientists working for
> > companies outside academia).
>
> Many of the world's most sophisticated IDEs are targeted solely at
> technical users. Look at Mathematica's notebook interface, for example. I
> believe that is a great example to aspire to.
>
> > IDEs are nice for beginners and for people
> > who do not want to know what's happening inside. They are not
> > interesting for companies that invent completely new types of products,
> > because they've hired experts that can live without (and want to live
> > without).
>
> I couldn't disagree more. Pharmaceuticals are a trillion dollar industry
> where many scientists would benefit enormously from being able to use a
> tool like OCaml without knowing anything about how it works in order to
> create their next generation products (drugs). The same is true of most
> industries where scientists and engineers work and there are many such
> industries and they are extremely profitable.
>
> > > Xavier appears to have taken the biased sample of industrialists who
> > > already use OCaml despite its limitations and has drawn the conclusion
> > > that these limitations are not important to industrialists. I was
> > > really horrified to see this because, in my experience, companies are
> > > turning away from OCaml in droves because of exactly the limitations
> > > Xavier enumerated and I for one would dearly love to see them fixed.
> >
> > Which companies?
>
> General Electric, Microsoft, Wolfram Research and various bioinformatics
> institutes for example.
>
> Look at General Electric. They build some of the world's most sophisticated
> medical scanners and that large-scale embedded market is ideal for using
> languages like OCaml for its high-performance numerics because you have
> complete control over the environment. However, they desperately need GUI
> toolkits to provide a front-end for users.
>
> I'd like to know what Alex Barretta makes of this, for example. His glass
> cutters must have the same characteristics in this respect...
>
> > I fully understand that OCaml is not well-suited for the average
> > company. But it is not because of missing GUIs and IDEs, but because the
> > language itself is too ambitious. Sorry to say that, but this is not the
> > mainstream and it will never be.
>
> I still think OCaml has the best chance of any FPL to become a mainstream
> tool in technical computing.
>
> Indeed, I recently tried to quantify how far OCaml has already come and I
> believe it is already as popular as C# among technical users, for example.
> That is quite an achievement!
>
> > (I have a good friend who works for an average company, so I know what
> > I'm talking of. They program business apps for a commercial platform
> > from CA. A horrible language, but they can manage it. They are experts
> > for the models they use, and simply take a platform from industry.)
>
> Yes. I do not believe OCaml will make significant inroads into displacing
> COBOL and relatives but there are a lot of other big opportunities out
> there for such a language.
>
> > > OCaml will continue to go from strength to strength regardless but its
> > > uptake would be vastly faster if these problems are addressed. To take
> > > them point by point:
> > >
> > > . GUIs are incredibly important (LablGTK is the world's favorite OCaml
> > > library!) and tens of thousands of OCaml programmers are crying out for
> > > proper LablGTK documentation as a first priority, many of whom are in
> > > industry.
> >
> > See this as opportunity for your next book :-)
>
> Indeed. Even after the announcement that Microsoft are productizing F#,
> OCaml for Scientists continues to be our biggest earning product.
> Consequently, I am very tempted to write a "sequel" that covers many of the
> important aspects of the language that I did not cover in the original,
> including GUI programming, XML, parallelism and so forth. If anyone has
> ideas for subjects they would like to see covered, please e-mail me!
>
> > GTK is already poorly documented, so this is not only the problem of the
> > LablGTK creators. Nevertheless, GTK is widely used. I don't think it's a
> > real problem.
>
> Yes. I'm really not sure what the best course of action would be here.
> Would Qt bindings be preferable? Is it worth the hassle? How long would it
> be before they reached the maturity of GTK?

Making bindings for Qt is basically putting a beautiful architecture to waste.
Qt's architecture is good enough to be actually machine-translated into OCaml. 
This would be an involved project, but not impossible.

Using Qt from OCaml via a set of bindings can be a short-term stop-gap measure 
for trivial applications, but I would never deploy a Qt application written in 
OCaml if the application was any bigger on the GUI side than a couple of simple 
dialog boxes. There is a binding generator (forgot its name) which can 
generate OCaml bindings for Qt, but you have to give it a list of 
classes/methods/signals/slots to generate bindings for. So perfect for 
trivial applications, but not much else.

Qt, when you start to think of its API in how it may look in OCaml, becomes 
pretty cool, and I'm sure there are a few improvements to it you can make to 
leverage the power given to you by OCaml, once you lose the shackles of C++. 

Cheers, Kuba


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Caml-list] Re: Hash clash in polymorphic variants
  2008-01-15 19:20                             ` Gerd Stolpmann
  2008-01-15 22:04                               ` Jon Harrop
@ 2008-01-18  5:19                               ` Kuba Ober
  2008-01-18  5:39                                 ` Kuba Ober
  1 sibling, 1 reply; 37+ messages in thread
From: Kuba Ober @ 2008-01-18  5:19 UTC (permalink / raw)
  To: caml-list

On Tuesday 15 January 2008, Gerd Stolpmann wrote:
> Jon Harrop wrote:
> > Incidentally, Xavier made a statement based upon what appears to me to be
> > a similar logical error in the CUFP notes from last year that I read
> > recently:
> >
> >   "On the other hand, certain features seem somewhat unsurprisingly to be
> > unimportant to industrial users. GUI toolkits are not an issue, because
> > GUIs tend to be built using more mainstream tools; it seems that
> > different competencies are involved in Caml and GUI development and
> > companies "don't want to squander their precious Caml expertise aligning
> > pixels". Rich libraries don't seem to matter in general; presumably
> > companies are happy to develop these in-house. And no-one wants yet
> > another IDE; the applications of interest are usually built using a
> > variety of languages and tools anyway, so consistency of development
> > environment is a lost cause."
> > - http://cufp.galois.com/CUFP-2007-Report.pdf (page 3)
>
> An interesting thesis, right? Although I wouldn't go that far, there is
> some truth in it. The point, IMHO, is that OCaml will never replace
> other languages in the sense that a company who uses language X for
> years in product Y rewrites the code in OCaml. For what reason? The
> company would run into big educational problems (learning a new
> environment), would have high initial costs, and it is questionable
> whether the result is better. Of course, for rewriting existing software
> the company would profit from GUIs, from rich libraries etc. But I think
> this does not happen.
>
> What I see, however, is that OCaml is used where new software is
> developed, in ambitious projects that start from scratch. It is simply a
> fact that GUIs are not crucial in these areas (at least for the
> companies I know). GUIs are seen as standard tools where nothing new
> happens where OCaml could shine. If you need one, you develop it in one
> of the mainstream languages.
>
> IDEs aren't interesting right now because OCaml is mainly used by
> (computer & related) scientists (and I include scientists working for
> companies outside academia). IDEs are nice for beginners and for people
> who do not want to know what's happening inside. They are not
> interesting for companies that invent completely new types of products,
> because they've hired experts that can live without (and want to live
> without).
>
> > Xavier appears to have taken the biased sample of industrialists who
> > already use OCaml despite its limitations and has drawn the conclusion
> > that these limitations are not important to industrialists. I was really
> > horrified to see this because, in my experience, companies are turning
> > away from OCaml in droves because of exactly the limitations Xavier
> > enumerated and I for one would dearly love to see them fixed.
>
> Which companies?
>
> I fully understand that OCaml is not well-suited for the average
> company. But it is not because of missing GUIs and IDEs, but because the
> language itself is too ambitious. Sorry to say that, but this is not the
> mainstream and it will never be.
>
> (I have a good friend who works for an average company, so I know what
> I'm talking of. They program business apps for a commercial platform
> from CA. A horrible language, but they can manage it. They are experts
> for the models they use, and simply take a platform from industry.)
>
> > OCaml will continue to go from strength to strength regardless but its
> > uptake would be vastly faster if these problems are addressed. To take
> > them point by point:
> >
> > . GUIs are incredibly important (LablGTK is the world's favorite OCaml
> > library!) and tens of thousands of OCaml programmers are crying out for
> > proper LablGTK documentation as a first priority, many of whom are in
> > industry.
>
> See this as opportunity for your next book :-)
>
> GTK is already poorly documented, so this is not only the problem of the
> LablGTK creators. Nevertheless, GTK is widely used. I don't think it's a
> real problem.
>
> > . Rich libraries are incredibly important and OCaml has the potential to
> > become a hugely successful commercial platform where people can buy and
> > sell cross-platform libraries but OCaml needs support for shared run-time
> > DLLs (or something equivalent) before this can happen.
>
> Do you dream or what?
>
> I don't think that selling libraries in binary form is that important...
> It is difficult anyway to do that, and why do you expect you could be
> successful in a niche language? As customer I would demand to get the
> source code - to lower the risks of the investment into a small
> platform.

Yeah, I wouldn't be using Qt if there was no source code for it. Quite a few 
times over the years I had to tweak away at the implementation details.

In fact, I would never specify *any* mission-critical libraries or frameworks 
if they didn't come with full sources.

Cheers, Kuba


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Caml-list] Re: Hash clash in polymorphic variants
  2008-01-18  5:19                               ` Kuba Ober
@ 2008-01-18  5:39                                 ` Kuba Ober
  0 siblings, 0 replies; 37+ messages in thread
From: Kuba Ober @ 2008-01-18  5:39 UTC (permalink / raw)
  To: caml-list

> > > . Rich libraries are incredibly important and OCaml has the potential
> > > to become a hugely successful commercial platform where people can buy
> > > and sell cross-platform libraries but OCaml needs support for shared
> > > run-time DLLs (or something equivalent) before this can happen.
> >
> > Do you dream or what?
> >
> > I don't think that selling libraries in binary form is that important...
> > It is difficult anyway to do that, and why do you expect you could be
> > successful in a niche language? As customer I would demand to get the
> > source code - to lower the risks of the investment into a small
> > platform.
>
> Yeah, I wouldn't be using Qt if there was no source code for it. Quite a
> few times over the years I had to tweak away at the implementation details.
>
> In fact, I would never specify *any* mission-critical libraries or
> frameworks if they didn't come with full sources.

In other words, Jon: if you tried to sell me source-code-less libraries, I 
simply wouldn't buy, and no amount of persuading could change that. I'd still 
keep buying your books, though :)

Just look at what happened to scores of Delphi and OCX controls which became 
abandonware, and how much of this stuff eventually had to be simply 
reimplemented by the same people who originally bought the controls precisely 
so they would not have to implement them. I detest closed-source controls and 
libraries; I simply don't use them. The whole idea of "here's the OCX and a 
typelib, and a help file, take it or leave it" is preposterous. Well, maybe 
it's fine if you're being contracted for a one-off job where the payer has no 
clue, and your morals don't seem to interfere -- sure then you can reuse all 
the source-less crap you want. But as a part of a long term strategy? No way.

If there was one decision Trolls made right, it was to include the source 
code.

Cheers, Kuba

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Caml-list] Re: Hash clash in polymorphic variants
  2008-01-15 18:17                           ` Jon Harrop
  2008-01-15 19:20                             ` Gerd Stolpmann
@ 2008-01-16  3:26                             ` Jacques GARRIGUE
  2008-01-16  3:34                               ` Yaron Minsky
  2008-01-16  4:40                               ` Jon Harrop
  2008-01-16 10:50                             ` Richard Jones
  2 siblings, 2 replies; 37+ messages in thread
From: Jacques GARRIGUE @ 2008-01-16  3:26 UTC (permalink / raw)
  To: jon; +Cc: caml-list

> > From: Jon Harrop <jon@ffconsultancy.com>
> > > On Tuesday 15 January 2008 03:36:21 Jacques Garrigue wrote:
> > > > Unfortunately, this would make marshalling between different programs
> > > > much more complicated...
> > >
> > > Do people marshal polymorphic variants between different programs?
> >
> > Do people marshal data between different programs (or different
> > versions of the same program)?
> 
> I suspect OCaml's marshalling is used almost entirely between same
> versions of the same programs.

I'm not so sure. Actually, I do it all the time when recompiling
ocaml. Otherwise I would have to bootstrap after any modification in
the compiler. Fortunately, this is not the case, and one only needs to
bootstrap when the data structures are modified (or semantics changed).

> In particular, I was advised against marshalling data between different 
> versions of the same program because this is unsafe (not just type
> safety but the format used by Marshal is not ossified).

Marshalling data between different versions of the same program is ok,
but you're on your own concerning compatibility. You must be careful
concerning changes in ocaml versions, but I don't remember any change
in representation, and if one were to happen it would be amply
documented.
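
As a small illustration of the point about marshalling, here is a minimal
sketch of a round trip through Marshal for a list of polymorphic variants.
The type `shape` and the helpers `save` and `load` are made up for this
example. Since a tag is represented by a hash of its name, two programs
that agree on the tag names should agree on the marshalled form, but
Marshal itself checks nothing: the type annotation on `load` is the only
safeguard, which is the "you're on your own" part.

```ocaml
(* Hypothetical data type shared by a writer and a reader program. *)
type shape = [ `Circle of float | `Square of float ]

(* Serialize with the default flags. *)
let save (v : shape list) : string = Marshal.to_string v []

(* Marshal does not verify types; the annotation is a promise by the
   programmer that the bytes really encode a [shape list]. *)
let load (s : string) : shape list = Marshal.from_string s 0

let () =
  let v = [ `Circle 1.0; `Square 2.0 ] in
  assert (load (save v) = v)
```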

> > > So the advantage of a decision tree is probably insignificant on real
> > > code because it will lie between these two extremes.
> >
> > Since the goal was never to be faster than ordinary variants, but just
> > obtain comparable speed, this seems good :-)
> 
> Yes. This would probably also work ok if you used a symbol table to store 
> exact identifier names rather than just a hash. The symbol's index in the 
> table would serve the same purpose as the hash.

No, because in order to produce efficient code you have to know the
hash at compile time, and in your scheme you only know it at link time
or runtime. 
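
To make the compile-time argument concrete, here is a sketch of a tag hash
of the kind discussed earlier in the thread (radix 223, reduced to 31
bits). It is written from memory of the compiler sources, so treat the
exact form as an assumption and check typing/btype.ml before relying on
it. The point is only that the value depends on nothing but the tag's
name, so any compilation unit can compute it without seeing the others;
an index into a shared symbol table would not have that property.

```ocaml
(* Sketch of the tag hash; assumed form, not copied from the compiler.
   Intended for a 64-bit OCaml, where [1 lsl 31] does not overflow. *)
let hash_variant (s : string) : int =
  let accu = ref 0 in
  for i = 0 to String.length s - 1 do
    (* treat the name as a number written in radix 223 *)
    accu := 223 * !accu + Char.code s.[i]
  done;
  (* keep the low 31 bits *)
  accu := !accu land ((1 lsl 31) - 1);
  (* reinterpret as a signed 31-bit value *)
  if !accu > 0x3FFFFFFF then !accu - (1 lsl 31) else !accu

let () =
  List.iter
    (fun tag -> Printf.printf "`%s -> %d\n" tag (hash_variant tag))
    [ "a"; "ab"; "Circle"; "Square" ]
```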

> OpenGL has an extension mechanism that can be queried at
> run-time. If a given extension is available then you can do things
> that you could not do before, such as pass a GLenum to a function
> that might not have accepted it without the extension.
> 
> > Since LablGL was coded by hand, adding extensions would mean modifying
> > it.
> 
> Exactly, that is a limitation of LablGL's design and, therefore, I think it 
> was quite wrong of you to claim "LablGL shows is that in practice only a 
> small number of tags are used together" when LablGL's use of small, closed 
> sum types is actually a design limitation that would not be there if it 
> supported all of OpenGL, i.e. the extension mechanism.

I don't see your point. Even with the extension mechanism, extra
GLenum's are still only allowed for some specific functions. So you
can still define some subsets of GLenum that should be conflict free,
you don't need to prohibit all conflicts in GLenum. This is what I
mean by lablGL's design.
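
A minimal sketch of what "subsets of GLenum" can look like in OCaml. The
type and function names below are hypothetical and are not LablGL's
actual definitions. Each function accepts its own small closed set of
tags, and an extension widens only the set it affects, so only tags that
appear together in one type need distinct hashes.

```ocaml
(* Hypothetical tag sets, one per family of GL functions. *)
type cap = [ `blend | `depth_test | `lighting ]
type shade_model = [ `flat | `smooth ]

let string_of_cap : cap -> string = function
  | `blend -> "GL_BLEND"
  | `depth_test -> "GL_DEPTH_TEST"
  | `lighting -> "GL_LIGHTING"

let string_of_shade_model : shade_model -> string = function
  | `flat -> "GL_FLAT"
  | `smooth -> "GL_SMOOTH"

(* An extension adds a tag to [cap] only; [shade_model] is untouched,
   so `multisample never has to be compared with `flat or `smooth. *)
type cap_ext = [ cap | `multisample ]

let string_of_cap_ext : cap_ext -> string = function
  | #cap as c -> string_of_cap c
  | `multisample -> "GL_MULTISAMPLE"
```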

The problem with lablGL and extensions is the implementation, not the
API design. What we would need is some kind of AOP approach to the
stubs, where you could describe what functions are extended by which
extensions.

> Incidentally, Xavier made a statement based upon what appears to me to be a 
> similar logical error in the CUFP notes from last year that I read recently:
> 
>   "On the other hand, certain features seem somewhat unsurprisingly to be 
> unimportant to industrial users. GUI toolkits are not an issue, because GUIs 
> tend to be built using more mainstream tools; it seems that different 
> competencies are involved in Caml and GUI development and companies "don't 
> want to squander their precious Caml expertise aligning pixels". Rich 
> libraries don't seem to matter in general; presumably companies are happy to 
> develop these in-house. And no-one wants yet another IDE; the applications of 
> interest are usually built using a variety of languages and tools anyway, so 
> consistency of development environment is a lost cause."
> - http://cufp.galois.com/CUFP-2007-Report.pdf (page 3)
> 
> Xavier appears to have taken the biased sample of industrialists who already 
> use OCaml despite its limitations and has drawn the conclusion that these 
> limitations are not important to industrialists. I was really horrified to 
> see this because, in my experience, companies are turning away from OCaml in 
> droves because of exactly the limitations Xavier enumerated and I for one 
> would dearly love to see them fixed.

I don't agree with all these points (otherwise I wouldn't be
maintaining a GUI toolkit), but there is some truth in it. I actually
got similar reactions from industry in Japan, if for different
reasons: they don't need the GUI, because they prefer to do it
themselves, to differentiate from others. People doing in-house
programming have a different point of view. I remember somebody from a
bank who told me he wrote a program to be used in all their branches
using labltk. In this case you don't need anything flashy, it just has
to be functional (err, to work).

Concerning IDEs, since eclipse is more and more used, good support
for it seems a must. But you won't have me use anything other than
emacs and ocamlbrowser!

> > Also, one might want to make code generation automatic, particularly
> > for C wrappers, to allow adding cases to functions easily. This should
> > be doable, but there is no infrastructure for that currently
> > (using CPP macros was simpler to start with...)
> 
> Yes. A better FFI could also be enormously beneficial. Improving
> upon OCaml's FFI is one of the most alluring aspects of a
> reimplementation on LLVM, IMHO.

The current FFI works well, but it's true that the way it cuts the
work in small pieces (stubs in C on one side, externals on the other)
makes it difficult to automate its use. In my experience it is very
flexible, but badly lacks abstraction.

Jacques Garrigue

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Caml-list] Re: Hash clash in polymorphic variants
  2008-01-16  3:26                             ` Jacques GARRIGUE
@ 2008-01-16  3:34                               ` Yaron Minsky
  2008-01-16  3:42                                 ` Jon Harrop
  2008-01-16  4:40                               ` Jon Harrop
  1 sibling, 1 reply; 37+ messages in thread
From: Yaron Minsky @ 2008-01-16  3:34 UTC (permalink / raw)
  To: Jacques GARRIGUE; +Cc: caml-list

On Jan 15, 2008 10:26 PM, Jacques GARRIGUE <garrigue@math.nagoya-u.ac.jp>
wrote:

>
> I'm not so sure. Actually, I do it all the time when recompiling
> ocaml. Otherwise I would have to bootstrap after any modification in
> the compiler. Fortunately, this is not the case, and one only needs to
> bootstrap when the data structures are modified (or semantics changed).
>

I agree.  We quite often use marshal to share data between different
programs that share a common library.


> I don't agree with all these points (otherwise I wouldn't be
> maintaining a GUI toolkit), but there is some truth in it. I actually
> got similar reactions from industry in Japan, if for different
> reasons: they don't need the GUI, because they prefer to do it
> themselves, to differentiate from others. People doing in-house
> programming have a different point of view. I remember somebody from a
> bank who told me he wrote a program to be used in all their branches
> using labltk. In this case you don't need anything flashy, it just has
> to be functional (err, to work).
>

We started out doing entirely back-end processes using OCaml, but as time
went on, we started building more and more GUIs.  The fact that OCaml has
lablgtk makes it much more useful for us, without a doubt.  The main reason
we like to do GUIs in OCaml is that we see a lot of value in sharing type
definitions and code between the GUIs and the back-end services they connect
to.

y

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Caml-list] Re: Hash clash in polymorphic variants
  2008-01-16  3:34                               ` Yaron Minsky
@ 2008-01-16  3:42                                 ` Jon Harrop
  0 siblings, 0 replies; 37+ messages in thread
From: Jon Harrop @ 2008-01-16  3:42 UTC (permalink / raw)
  To: caml-list

On Wednesday 16 January 2008 03:34:54 Yaron Minsky wrote:
> We started out doing entirely back-end processes using OCaml, but as time
> went on, we started building more and more GUIs.  The fact that OCaml has
> lablgtk makes it much more useful for us, without a doubt.  The main reason
> we like to do GUIs in OCaml is that we see a lot of value in sharing type
> definitions and code between the GUIs and the back-end services they
> connect to.

Yes, this is exactly the kind of thing I was referring to. I think a lot of 
people want simple GUIs that are perfectly feasible to construct entirely in 
OCaml and the overhead of splitting a project across languages is much 
higher. Fortunately, LablGTK makes this feasible in OCaml.

There must be some reason why LablGTK is so popular! ;-)

-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?e


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Caml-list] Re: Hash clash in polymorphic variants
  2008-01-16  3:26                             ` Jacques GARRIGUE
  2008-01-16  3:34                               ` Yaron Minsky
@ 2008-01-16  4:40                               ` Jon Harrop
  2008-01-16 16:03                                 ` Eric Cooper
  1 sibling, 1 reply; 37+ messages in thread
From: Jon Harrop @ 2008-01-16  4:40 UTC (permalink / raw)
  To: caml-list

On Wednesday 16 January 2008 03:26:27 Jacques GARRIGUE wrote:
> > I suspect OCaml's marshalling is used almost entirely between same
> > versions of the same programs.
>
> I'm not so sure. Actually, I do it all the time when recompiling
> ocaml. Otherwise I would have to bootstrap after any modification in
> the compiler. Fortunately, this is not the case, and one only needs to
> bootstrap when the data structures are modified (or semantics changed).

Interesting.

> > Yes. This would probably also work ok if you used a symbol table to store
> > exact identifier names rather than just a hash. The symbol's index in the
> > table would serve the same purpose as the hash.
>
> No, because in order to produce efficient code you have to know the
> hash at compile time, and in your scheme you only know it at link time
> or runtime.

You could still use the same hashing scheme but you could fall back to linear 
search of symbols by name in the event of a clash.

> > Exactly, that is a limitation of LablGL's design and, therefore, I think
> > it was quite wrong of you to claim "LablGL shows is that in practice
> > only a small number of tags are used together" when LablGL's use of
> > small, closed sum types is actually a design limitation that would not be
> > there if it supported all of OpenGL, i.e. the extension mechanism.
>
> I don't see your point. Even with the extension mechanism, extra
> GLenum's are still only allowed for some specific functions. So you
> can still define some subsets of GLenum that should be conflict free,
> you don't need to prohibit all conflicts in GLenum. This is what I
> mean by lablGL's design.

Provided you can enumerate which tags can be used with which functions 
including the presence of extensions, yes. I suppose that would be possible 
and you could end up with many small sets of tags and much less chance of 
clashing.

> The problem with lablGL and extensions is the implementation, not the
> API design. What we would need is some kind of AOP approach to the
> stubs, where you could describe what functions are extended by which
> extensions.

I think it would be better to remove all complexity from the C stubs, have 
them all autogenerated and then write a higher-level API on top entirely in 
OCaml. GLCaml is the start of a good foundation for OpenGL, IMHO. I think it 
would be very productive to merge the projects at some point.

> ...
> I don't agree with all these points (otherwise I wouldn't be
> maintaining a GUI toolkit), but there is some truth in it. I actually
> got similar reactions from industry in Japan, if for different
> reasons: they don't need the GUI, because they prefer to do it
> themselves, to differentiate from others. People doing in-house
> programming have a different point of view. I remember somebody from a
> bank who told me he wrote a program to be used in all their branches
> using labltk. In this case you don't need anything flashy, it just has
> to be functional (err, to work).
>
> Concerning IDEs, since eclipse is more and more used, good support
> for it seems a must. But you won't have me use anything other than
> emacs and ocamlbrowser!

Visual Studio's Intellisense makes GUI programming much easier in F# than 
ocamlbrowser+ocaml. I think the single most productive thing that could be 
added to ocamlbrowser is hyperlinks from the quoted definitions to all 
related definitions.

Now that I come to think of it, you can just run ocamldoc on the LablGTK 
sources and use a browser to do that. Is the ocamldoc HTML output for the 
latest LablGTK2 on the web anywhere?

> > Yes. A better FFI could also be enormously beneficial. Improving
> > upon OCaml's FFI is one of the most alluring aspects of a
> > reimplementation on LLVM, IMHO.
>
> The current FFI works well, but it's true that the way it cuts the
> work in small pieces (stubs in C on one side, externals on the other)
> makes it difficult to automate its use. In my experience it is very
> flexible, but badly lacks abstraction.

What sorts of abstractions would you like?

-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?e


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Caml-list] Re: Hash clash in polymorphic variants
  2008-01-16  4:40                               ` Jon Harrop
@ 2008-01-16 16:03                                 ` Eric Cooper
  0 siblings, 0 replies; 37+ messages in thread
From: Eric Cooper @ 2008-01-16 16:03 UTC (permalink / raw)
  To: caml-list

On Wed, Jan 16, 2008 at 04:40:09AM +0000, Jon Harrop wrote:
> Is the ocamldoc HTML output for the latest LablGTK2 on the web anywhere?

In Debian, it's included in the liblablgtk2-ocaml-dev package in
/usr/share/doc/liblablgtk2-ocaml-dev/html/api/, and similarly for the
other OCaml -dev packages.

-- 
Eric Cooper             e c c @ c m u . e d u


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Caml-list] Re: Hash clash in polymorphic variants
  2008-01-15 18:17                           ` Jon Harrop
  2008-01-15 19:20                             ` Gerd Stolpmann
  2008-01-16  3:26                             ` Jacques GARRIGUE
@ 2008-01-16 10:50                             ` Richard Jones
  2 siblings, 0 replies; 37+ messages in thread
From: Richard Jones @ 2008-01-16 10:50 UTC (permalink / raw)
  To: caml-list

On Tue, Jan 15, 2008 at 06:17:32PM +0000, Jon Harrop wrote:
> . GUIs are incredibly important (LablGTK is the world's favorite OCaml 
> library!) and tens of thousands of OCaml programmers are crying out for 
> proper LablGTK documentation as a first priority, many of whom are in 
> industry.

GTK itself is horribly undocumented.  However SooHyoung Oh has done an
excellent job translating the C-based GTK 2.0 tutorial into OCaml,
here:

http://plus.kaist.ac.kr/~shoh/ocaml/lablgtk2/lablgtk2-tutorial/

> . Rich libraries are incredibly important and OCaml has the
> potential to become a hugely successful commercial platform where
> people can buy and sell cross-platform libraries but OCaml needs
> support for shared run-time DLLs (or something equivalent)
> before this can happen.

My requirement is similar to this: (1) to be able to take OCaml
libraries and automatically generate C bindings from them (ie.
translate the OCaml .mli file into a .h file, and generate stubs).
(2) to be able to ship the library as a DLL / .so file.  Efficiency is
not so much of a concern for me - eg. if the generated stubs worked by
copying all strings passed, that would be OK for my requirements.

I actually did a little bit of work on a stub/wrapper generator, and I
think it is possible to implement it, especially now that ocamlopt can
generate PIC.
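
For readers unfamiliar with the OCaml half of such a wrapper, here is a
minimal sketch. The function `concat` and the name "mylib_concat" are
made up for illustration. A generated C stub would look the closure up
with caml_named_value and invoke it with caml_callback2, copying the C
strings into OCaml strings on the way in, as described above; that C side
is not shown here.

```ocaml
(* Hypothetical library function we want to expose to C callers. *)
let concat (a : string) (b : string) : string = a ^ b

(* Register the closure under a name that a C stub can look up. *)
let () = Callback.register "mylib_concat" concat
```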

Rich.

-- 
Richard Jones
Red Hat

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Caml-list] Re: Hash clash in polymorphic variants
  2008-01-14 14:44                 ` Stefan Monnier
  2008-01-14 14:56                   ` [Caml-list] " Kuba Ober
@ 2008-01-14 17:14                   ` Jon Harrop
  2008-01-14 17:36                     ` Alain Frisch
  1 sibling, 1 reply; 37+ messages in thread
From: Jon Harrop @ 2008-01-14 17:14 UTC (permalink / raw)
  To: caml-list

On Monday 14 January 2008 14:44:58 Stefan Monnier wrote:
> > What I meant was simply that instead of using some fixed hash function,
> > one could use a perfect hashing function which is optimal for its known
> > set of inputs, and won't ever generate a collision.
>
> The problem is that the set of inputs is not known at compile time, only
> at link time.

Yes. I think this is another case where OCaml would really benefit from a 
symbol table and this is something else that seems much easier to do with JIT 
compilation.

Also, what happens if you try to dynamically load two libraries that use 
polymorphic variants that clash?

-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?e



* Re: [Caml-list] Re: Hash clash in polymorphic variants
  2008-01-14 17:14                   ` Jon Harrop
@ 2008-01-14 17:36                     ` Alain Frisch
  0 siblings, 0 replies; 37+ messages in thread
From: Alain Frisch @ 2008-01-14 17:36 UTC (permalink / raw)
  To: Jon Harrop; +Cc: caml-list

Jon Harrop wrote:
> Also, what happens if you try to dynamically load two libraries that use 
> polymorphic variants that clash?

AFAIK, this is ok. The problematic clashes can always be detected at 
type-checking time.

-- Alain



* Re: [Caml-list] Hash clash in polymorphic variants
  2008-01-10 17:09 Hash clash in polymorphic variants Jon Harrop
  2008-01-10 20:35 ` [Caml-list] " Eric Cooper
@ 2008-01-11  0:15 ` Jacques Garrigue
  1 sibling, 0 replies; 37+ messages in thread
From: Jacques Garrigue @ 2008-01-11  0:15 UTC (permalink / raw)
  To: jon; +Cc: caml-list

From: Jon Harrop <jon@ffconsultancy.com>

> ISTR advice that constructors sharing the first few characters should be 
> avoided in order to reduce the likelihood of clashing hash values for 
> polymorphic variants. Is that right?

Not at all. If the first characters are identical it just means that an
identical value will be added to the hashes of the suffixes, which
actually means that you lower the probability of getting conflicts :-)
The hash function guarantees that all keys of strictly fewer than 5
characters will map to different values.
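
A minimal sketch of the hashing scheme discussed in this thread (radix-223
accumulation reduced to 31 bits), assuming a 64-bit OCaml. The names
hash_variant and find_clashes are chosen for this illustration, and the
code is meant to mirror the idea rather than reproduce the compiler source:

```ocaml
(* Radix-223 hash of a tag name, reduced to 31 bits.
   Assumes 64-bit OCaml ints, so the mask literal below is valid. *)
let hash_variant (s : string) : int =
  let accu = ref 0 in
  String.iter
    (fun c -> accu := (223 * !accu + Char.code c) land 0x7FFFFFFF)
    s;
  (* Map the upper half of the 31-bit range to negative values. *)
  if !accu > 0x3FFFFFFF then !accu - (1 lsl 31) else !accu

(* Return every pair of distinct tags in [tags] that share a hash value. *)
let find_clashes (tags : string list) : (string * string) list =
  let seen = Hashtbl.create 64 in
  List.fold_left
    (fun acc tag ->
      let h = hash_variant tag in
      let earlier = Hashtbl.find_all seen h in
      Hashtbl.add seen h tag;
      List.fold_left
        (fun acc other -> if other <> tag then (other, tag) :: acc else acc)
        acc earlier)
    [] tags
  |> List.rev
```

With a common prefix the same amount is added to every hash, so two tags
sharing a prefix clash only if their suffixes already clash. Eric Cooper's
message earlier in the thread lists Eric_Cooper and azdwbie as one colliding
pair; passing both names to find_clashes would be one way to check that
claim.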

The probability of getting clashes being really low, you should not be
concerned about this. Just check afterwards. A simple way to do it is to
produce a big type containing all the tags, and feed it to ocamlc.
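
As a rough illustration of that check, the sketch below builds the text of
one polymorphic variant type from a list of tag names. The function name
type_of_tags, the type name all_tags, and the sample tags are placeholders
invented for this example, not part of any existing binding:

```ocaml
(* Build the text of one polymorphic variant type listing every tag.
   The type name [all_tags] is a placeholder chosen for this sketch. *)
let type_of_tags (tags : string list) : string =
  let cases = List.map (fun t -> "`" ^ t) tags in
  "type all_tags = [ " ^ String.concat " | " cases ^ " ]\n"

let () =
  (* Hypothetical tag names standing in for a generated GL_* list. *)
  print_string (type_of_tags ["POINTS"; "LINES"; "TRIANGLES"])
```

Saving the output to a file (say all_tags.ml, a hypothetical name) and
running ocamlc -c on it should make the type checker reject the
declaration if any two tags share a hash value.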

> I'm interested in automatically translating the GL_* enum from
> OpenGL into polymorphic variants. So although it is generated code I
> have little control over it, e.g. I cannot change the translation as
> OpenGL gets extended because code will already be using the existing
> names.

In the event you get a conflict when OpenGL is extended, you can still
add a special case for the newly added tags. I hope this does not
happen, but the birthday theorem tells you that when you get enough
participants, clashes are hard to avoid.
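
To put a rough number on the birthday argument, the sketch below uses the
standard approximation 1 - exp(-n(n-1)/(2m)) with m = 2^31 possible hash
values. It assumes the hashes are uniformly distributed, which real tag
names need not satisfy, and clash_probability is a name made up for this
example:

```ocaml
(* Birthday-bound estimate: chance that at least two of [n] tags collide,
   assuming hashes are uniformly distributed over 2^31 values. *)
let clash_probability (n : int) : float =
  let buckets = 2.0 ** 31.0 in
  let pairs = float_of_int n *. float_of_int (n - 1) /. 2.0 in
  1.0 -. exp (-. pairs /. buckets)
```

Under that assumption a thousand tags give an estimate of roughly 0.0002,
while the estimate passes one half somewhere between fifty and sixty
thousand tags.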

Cheers,

Jacques Garrigue


end of thread, other threads:[~2008-01-18  5:39 UTC | newest]

Thread overview: 37+ messages
2008-01-10 17:09 Hash clash in polymorphic variants Jon Harrop
2008-01-10 20:35 ` [Caml-list] " Eric Cooper
2008-01-10 21:24   ` Jon Harrop
2008-01-10 21:40     ` David Allsopp
2008-01-11 13:30       ` Kuba Ober
2008-01-11 13:48         ` Jon Harrop
2008-01-11 16:14           ` Kuba Ober
2008-01-11 18:40             ` David Allsopp
2008-01-14 12:20               ` Kuba Ober
2008-01-14 14:44                 ` Stefan Monnier
2008-01-14 14:56                   ` [Caml-list] " Kuba Ober
2008-01-14 15:37                     ` David Allsopp
2008-01-14 15:44                       ` Kuba Ober
2008-01-14 16:03                         ` David Allsopp
2008-01-14 15:45                     ` Stefan Monnier
2008-01-15  3:36                     ` [Caml-list] " Jacques Garrigue
2008-01-15  4:59                       ` Jon Harrop
2008-01-15  9:01                         ` Jacques Garrigue
2008-01-15 18:17                           ` Jon Harrop
2008-01-15 19:20                             ` Gerd Stolpmann
2008-01-15 22:04                               ` Jon Harrop
2008-01-16 13:48                                 ` Kuba Ober
2008-01-16 15:02                                   ` Dario Teixeira
2008-01-16 19:00                                     ` Jon Harrop
2008-01-17 13:09                                     ` Kuba Ober
2008-01-18  5:33                                 ` Kuba Ober
2008-01-18  5:19                               ` Kuba Ober
2008-01-18  5:39                                 ` Kuba Ober
2008-01-16  3:26                             ` Jacques GARRIGUE
2008-01-16  3:34                               ` Yaron Minsky
2008-01-16  3:42                                 ` Jon Harrop
2008-01-16  4:40                               ` Jon Harrop
2008-01-16 16:03                                 ` Eric Cooper
2008-01-16 10:50                             ` Richard Jones
2008-01-14 17:14                   ` Jon Harrop
2008-01-14 17:36                     ` Alain Frisch
2008-01-11  0:15 ` [Caml-list] " Jacques Garrigue
