Re: [Caml-list] Re: How to read different ints from a Bigarray?

Mailing list for all users of the OCaml language and system.

* Re: [Caml-list] Re: How to read different ints from a Bigarray?
@ 2009-11-03 17:16 Charles Forsyth
  0 siblings, 0 replies; 23+ messages in thread
From: Charles Forsyth @ 2009-11-03 17:16 UTC (permalink / raw)
  To: caml-list, goswin-v-b

>And mips.

and powerpc (if you'd like to run caml on Blue Gene, and why not)


* Re: [Caml-list] Re: How to read different ints from a Bigarray?
  2009-11-02 20:27                           ` Richard Jones
@ 2009-11-03 13:18                             ` Goswin von Brederlow
  0 siblings, 0 replies; 23+ messages in thread
From: Goswin von Brederlow @ 2009-11-03 13:18 UTC (permalink / raw)
  To: Richard Jones; +Cc: caml-list

Richard Jones <rich@annexia.org> writes:

> On Mon, Nov 02, 2009 at 05:33:24PM +0100, Mauricio Fernandez wrote:
>> It might be possible to hack support for C-- expressions in external
>> declarations. That'd be a sort of portable assembler.
>
> To be honest I'm far more interested in x86-64-specific instructions
> (SSE3/4 in particular).  There are only two processor architectures
> that matter in the world in any practical sense, x86-64 and ARM.
>
> Rich.

And mips.

MfG
        Goswin



* Re: [Caml-list] Re: How to read different ints from a Bigarray?
  2009-11-02 16:33                         ` Mauricio Fernandez
  2009-11-02 20:27                           ` Richard Jones
@ 2009-11-02 20:48                           ` Goswin von Brederlow
  1 sibling, 0 replies; 23+ messages in thread
From: Goswin von Brederlow @ 2009-11-02 20:48 UTC (permalink / raw)
  To: caml-list

Mauricio Fernandez <mfp@acm.org> writes:

> On Mon, Nov 02, 2009 at 05:11:27PM +0100, Goswin von Brederlow wrote:
>> Richard Jones <rich@annexia.org> writes:
>> 
>> > On Sun, Nov 01, 2009 at 04:11:52PM +0100, Goswin von Brederlow wrote:
>> >> But C calls are still 33% slower than direct access in ocaml (if one
>> >> doesn't use the polymorphic functions).
>> >
>> > Are you using noalloc calls?
>> >
>> > http://camltastic.blogspot.com/2008/08/tip-calling-c-functions-directly-with.html
>> 
>> Yes. And I looked at the bigarray module and couldn't figure out how
>> they differ from my own external function. Only difference I see is
>> the leading "%" on the external name. What does that do?
>
> That means that it is using a hardcoded OCaml primitive, whose code can be
> generated by the compiler via C--. See asmcomp/cmmgen.ml.
>
>> > I would love to see inline assembler supported by the compiler.
>
> It might be possible to hack support for C-- expressions in external
> declarations. That'd be a sort of portable assembler.

This brings me a lot closer to a fast buffer structure. I now have
this code:

(* buffer.ml: Buffer module for libaio-ocaml
 * Copyright (C) 2009 Goswin von Brederlow
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU Lesser General Public License as
 * published by the Free Software Foundation, either version 3 of the
 * License, or (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program.  If not, see <http://www.gnu.org/licenses/>.
 * Under Debian a copy can be found in /usr/share/common-licenses/LGPL-3.
 *)

open Bigarray

type buffer = (int, int8_unsigned_elt, c_layout) Array1.t

exception Unaligned

let create size = (Array1.create int8_unsigned c_layout size : buffer)

let unsafe_get_uint8 (buf : buffer) off = Array1.unsafe_get buf off

let unsafe_get_uint16 (buf : buffer) off =
  let off = off asr 1 in
  let buf = ((Obj.magic buf) : (int, int16_unsigned_elt, c_layout) Array1.t) 
  in
    Array1.unsafe_get buf off

let unsafe_get_int31 (buf : buffer) off =
  let off = off asr 2 in
  let buf = ((Obj.magic buf) : (int32, int32, c_layout) Array1.t) in
  let x = Array1.unsafe_get buf off
  in
    Int32.to_int x

let unsafe_get_int63 (buf : buffer) off =
  let off = off asr 3 in
  let buf = ((Obj.magic buf) : (int, int, c_layout) Array1.t)
  in
    Array1.unsafe_get buf off


Looking at the generated code I see that this works nicely for 8 and
16 bits:

0000000000404a50 <camlBuffer__unsafe_get_uint8_131>:
  404a50:       48 d1 fb                sar    %rbx
  404a53:       48 8b 40 08             mov    0x8(%rax),%rax
  404a57:       48 0f b6 04 18          movzbq (%rax,%rbx,1),%rax
  404a5c:       48 8d 44 00 01          lea    0x1(%rax,%rax,1),%rax
  404a61:       c3                      retq   

0000000000404a90 <camlBuffer__unsafe_get_uint16_137>:
  404a90:       48 d1 fb                sar    %rbx
  404a93:       48 83 cb 01             or     $0x1,%rbx
  404a97:       48 d1 fb                sar    %rbx
  404a9a:       48 8b 40 08             mov    0x8(%rax),%rax
  404a9e:       48 0f b7 04 58          movzwq (%rax,%rbx,2),%rax
  404aa3:       48 8d 44 00 01          lea    0x1(%rax,%rax,1),%rax
  404aa8:       c3                      retq   
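
For reading the listings above: OCaml stores an int n as the machine word
2n + 1, which is why each function ends with "lea 0x1(%rax,%rax,1),%rax"
and why "off asr 1" shows up as a "sar" followed by "or $0x1". The
following is only a small model of that arithmetic in OCaml itself; the
names tag, untag and tagged_asr1 are made up for illustration.

```ocaml
(* An OCaml int n is represented by the machine word 2n + 1. *)
let tag n = (n lsl 1) lor 1        (* what "lea 0x1(%rax,%rax,1),%rax" computes *)
let untag w = w asr 1              (* what a single "sar %rbx" computes *)

(* "off asr 1" applied directly to the tagged word: shift right once,
   then force the tag bit back on ("sar %rbx; or $0x1,%rbx"). *)
let tagged_asr1 w = (w asr 1) lor 1

let () =
  List.iter
    (fun n ->
      assert (untag (tag n) = n);
      assert (tagged_asr1 (tag n) = tag (n asr 1)))
    [0; 1; 2; 3; 7; 8; 255; 4096; -1; -6]
```

So in unsafe_get_uint16 the first two instructions are the source-level
"off asr 1", the second "sar" removes the tag for use as an index, and the
final "lea" re-tags the loaded value.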

But for 31/63 bits I get:

0000000000404b90 <camlBuffer__unsafe_get_int31_145>:
  404b90:       48 83 ec 08             sub    $0x8,%rsp
  404b94:       48 c1 fb 02             sar    $0x2,%rbx
  404b98:       48 83 cb 01             or     $0x1,%rbx
  404b9c:       48 89 c7                mov    %rax,%rdi
  404b9f:       48 89 de                mov    %rbx,%rsi
  404ba2:       48 8b 05 5f bc 21 00    mov    0x21bc5f(%rip),%rax        # 620808 <_DYNAMIC+0x7e0>
  404ba9:       e8 92 2a 01 00          callq  417640 <caml_c_call>
  404bae:       48 63 40 08             movslq 0x8(%rax),%rax
  404bb2:       48 d1 e0                shl    %rax
  404bb5:       48 83 c8 01             or     $0x1,%rax
  404bb9:       48 83 c4 08             add    $0x8,%rsp
  404bbd:       c3                      retq   

0000000000404ca0 <camlBuffer__unsafe_get_int63_154>:
  404ca0:       48 83 ec 08             sub    $0x8,%rsp
  404ca4:       48 c1 fb 03             sar    $0x3,%rbx
  404ca8:       48 83 cb 01             or     $0x1,%rbx
  404cac:       48 89 c7                mov    %rax,%rdi
  404caf:       48 89 de                mov    %rbx,%rsi
  404cb2:       48 8b 05 4f bb 21 00    mov    0x21bb4f(%rip),%rax        # 620808 <_DYNAMIC+0x7e0>
  404cb9:       e8 82 29 01 00          callq  417640 <caml_c_call>
  404cbe:       48 83 c4 08             add    $0x8,%rsp
  404cc2:       c3                      retq   

At least in the int63 case I would have thought the compiler would
emit asm code to read the int instead of a function call. In the 31bit
case I would have hoped it would optimize the intermediate int32 away.

Is there something I can do better to get_int31? I was hoping for code
like this:

0000000000404a90 <camlBuffer__unsafe_get_int31_137>:
  404a90:       48 c1 fb 02             sar    $0x2,%rbx
  404a94:       48 83 cb 01             or     $0x1,%rbx
  404a98:       48 d1 fb                sar    %rbx
  404a9b:       48 8b 40 08             mov    0x8(%rax),%rax
  404a9f:       xx xx xx xx xx          movslq (%rax,%rbx,4),%rax
  404aa4:       48 8d 44 00 01          lea    0x1(%rax,%rax,1),%rax
  404aa9:       c3                      retq   
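
One possible way to avoid both Obj.magic and the boxed int32 is to
assemble the value from single-byte reads on the monomorphic uint8 buffer.
This is a sketch under assumptions, not what the thread settled on: it
fixes the byte order to little-endian (the code above uses native order),
it assumes a 64-bit OCaml where Sys.int_size is 63, and the name
unsafe_get_int32_le is made up. Whether ocamlopt turns it into four inline
loads would have to be checked in the generated assembly.

```ocaml
open Bigarray

type buffer = (int, int8_unsigned_elt, c_layout) Array1.t

(* Read a little-endian signed 32-bit value at byte offset [off] by
   combining four single-byte reads. There is no bounds check, like the
   other unsafe_* functions: the caller must guarantee that off + 3 is
   inside the buffer. Assumes a 64-bit OCaml (Sys.int_size = 63). *)
let unsafe_get_int32_le (buf : buffer) off =
  let b0 = Array1.unsafe_get buf off
  and b1 = Array1.unsafe_get buf (off + 1)
  and b2 = Array1.unsafe_get buf (off + 2)
  and b3 = Array1.unsafe_get buf (off + 3) in
  let u = b0 lor (b1 lsl 8) lor (b2 lsl 16) lor (b3 lsl 24) in
  (* Sign-extend from bit 31 to the full native int width. *)
  let shift = Sys.int_size - 32 in
  (u lsl shift) asr shift

let () =
  let buf : buffer = Array1.create int8_unsigned c_layout 8 in
  List.iteri (fun i b -> buf.{i} <- b)
    [0xfe; 0xff; 0xff; 0xff; 0x78; 0x56; 0x34; 0x12];
  assert (unsafe_get_int32_le buf 0 = -2);
  assert (unsafe_get_int32_le buf 4 = 0x12345678)
```

Since it reads byte by byte, the offset does not have to be a multiple of
four.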

MfG
        Goswin



* Re: [Caml-list] Re: How to read different ints from a Bigarray?
  2009-11-02 16:33                         ` Mauricio Fernandez
@ 2009-11-02 20:27                           ` Richard Jones
  2009-11-03 13:18                             ` Goswin von Brederlow
  2009-11-02 20:48                           ` Goswin von Brederlow
  1 sibling, 1 reply; 23+ messages in thread
From: Richard Jones @ 2009-11-02 20:27 UTC (permalink / raw)
  To: caml-list

On Mon, Nov 02, 2009 at 05:33:24PM +0100, Mauricio Fernandez wrote:
> It might be possible to hack support for C-- expressions in external
> declarations. That'd be a sort of portable assembler.

To be honest I'm far more interested in x86-64-specific instructions
(SSE3/4 in particular).  There are only two processor architectures
that matter in the world in any practical sense, x86-64 and ARM.

Rich.

-- 
Richard Jones
Red Hat



* Re: [Caml-list] Re: How to read different ints from a Bigarray?
  2009-11-02 16:11                       ` Goswin von Brederlow
@ 2009-11-02 16:33                         ` Mauricio Fernandez
  2009-11-02 20:27                           ` Richard Jones
  2009-11-02 20:48                           ` Goswin von Brederlow
  0 siblings, 2 replies; 23+ messages in thread
From: Mauricio Fernandez @ 2009-11-02 16:33 UTC (permalink / raw)
  To: caml-list

On Mon, Nov 02, 2009 at 05:11:27PM +0100, Goswin von Brederlow wrote:
> Richard Jones <rich@annexia.org> writes:
> 
> > On Sun, Nov 01, 2009 at 04:11:52PM +0100, Goswin von Brederlow wrote:
> >> But C calls are still 33% slower than direct access in ocaml (if one
> >> doesn't use the polymorphic functions).
> >
> > Are you using noalloc calls?
> >
> > http://camltastic.blogspot.com/2008/08/tip-calling-c-functions-directly-with.html
> 
> Yes. And I looked at the bigarray module and couldn't figure out how
> they differ from my own external function. Only difference I see is
> the leading "%" on the external name. What does that do?

That means that it is using a hardcoded OCaml primitive, whose code can be
generated by the compiler via C--. See asmcomp/cmmgen.ml.
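
As an illustration of the difference, here are two external declarations
that use "%" names. "%identity" is a well-known compiler primitive, and
"%caml_ba_ref_1" is the name the standard library's bigarray.ml uses for
Array1.get; the local names my_identity and ba_get are made up for this
sketch. An external whose name has no "%" is instead looked up as a C
symbol and needs a C stub to link.

```ocaml
(* "%identity" is handled by the compiler itself, not by a C function. *)
external my_identity : 'a -> 'a = "%identity"

(* Same mechanism as Bigarray.Array1.get: the compiler sees the primitive
   name and can generate the access itself when it knows the array type. *)
external ba_get : ('a, 'b, 'c) Bigarray.Array1.t -> int -> 'a
  = "%caml_ba_ref_1"

let () =
  let a = Bigarray.Array1.create Bigarray.int8_unsigned Bigarray.c_layout 4 in
  Bigarray.Array1.fill a 7;
  assert (my_identity 42 = 42);
  assert (ba_get a 2 = 7)
```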

> > I would love to see inline assembler supported by the compiler.

It might be possible to hack support for C-- expressions in external
declarations. That'd be a sort of portable assembler.

-- 
Mauricio Fernandez  -   http://eigenclass.org



* Re: [Caml-list] Re: How to read different ints from a Bigarray?
  2009-11-01 19:57                     ` Richard Jones
@ 2009-11-02 16:11                       ` Goswin von Brederlow
  2009-11-02 16:33                         ` Mauricio Fernandez
  0 siblings, 1 reply; 23+ messages in thread
From: Goswin von Brederlow @ 2009-11-02 16:11 UTC (permalink / raw)
  To: Richard Jones; +Cc: Goswin von Brederlow, caml-list

Richard Jones <rich@annexia.org> writes:

> On Sun, Nov 01, 2009 at 04:11:52PM +0100, Goswin von Brederlow wrote:
>> But C calls are still 33% slower than direct access in ocaml (if one
>> doesn't use the polymorphic functions).
>
> Are you using noalloc calls?
>
> http://camltastic.blogspot.com/2008/08/tip-calling-c-functions-directly-with.html

Yes. And I looked at the bigarray module and couldn't figure out how
they differ from my own external function. Only difference I see is
the leading "%" on the external name. What does that do?

> I would love to see inline assembler supported by the compiler.
>
> Rich.

And some primitive operations on integers like sign extending and byte
swapping in the Pervasives module where the compiler emits cpu
specific code instead of a caml/C call.
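
Until such primitives exist, both operations can be written with plain
shifts and masks on native ints. A minimal sketch, assuming a 64-bit OCaml
for the 32-bit swap; the function names are made up and nothing here is
claimed to compile to a single CPU instruction.

```ocaml
(* Sign-extend the low [bits] bits of [x] to a full OCaml int. *)
let sign_extend ~bits x =
  let shift = Sys.int_size - bits in
  (x lsl shift) asr shift

(* Swap the two bytes of a 16-bit value. *)
let bswap16 x = ((x land 0xff) lsl 8) lor ((x lsr 8) land 0xff)

(* Swap the four bytes of a 32-bit value (needs ints wider than 32 bits). *)
let bswap32 x =
  ((x land 0xff) lsl 24)
  lor ((x land 0xff00) lsl 8)
  lor ((x lsr 8) land 0xff00)
  lor ((x lsr 24) land 0xff)

let () =
  assert (sign_extend ~bits:8 0xff = -1);
  assert (sign_extend ~bits:16 0x8000 = -32768);
  assert (bswap16 0x1234 = 0x3412);
  assert (bswap32 0x12345678 = 0x78563412)
```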

MfG
        Goswin


* Re: [Caml-list] Re: How to read different ints from a Bigarray?
  2009-11-01 15:11                   ` Goswin von Brederlow
@ 2009-11-01 19:57                     ` Richard Jones
  2009-11-02 16:11                       ` Goswin von Brederlow
  0 siblings, 1 reply; 23+ messages in thread
From: Richard Jones @ 2009-11-01 19:57 UTC (permalink / raw)
  To: Goswin von Brederlow; +Cc: caml-list

On Sun, Nov 01, 2009 at 04:11:52PM +0100, Goswin von Brederlow wrote:
> But C calls are still 33% slower than direct access in ocaml (if one
> doesn't use the polymorphic functions).

Are you using noalloc calls?

http://camltastic.blogspot.com/2008/08/tip-calling-c-functions-directly-with.html

I would love to see inline assembler supported by the compiler.

Rich.

-- 
Richard Jones
Red Hat



* Re: [Caml-list] Re: How to read different ints from a Bigarray?
  2009-10-30 20:30                 ` Richard Jones
@ 2009-11-01 15:11                   ` Goswin von Brederlow
  2009-11-01 19:57                     ` Richard Jones
  0 siblings, 1 reply; 23+ messages in thread
From: Goswin von Brederlow @ 2009-11-01 15:11 UTC (permalink / raw)
  To: Richard Jones; +Cc: Goswin von Brederlow, caml-list

Richard Jones <rich@annexia.org> writes:

> On Thu, Oct 29, 2009 at 06:07:59PM +0100, Goswin von Brederlow wrote:
>> I still can reuse a lot of this. Especially the syntax extension
>> seems like a good idea. Maybe reduced to bytes instead of bits
>> though. I don't intend to use such fine grained structures to need bit
>> access.
>
> Take a close look at bitstring.  In all the cases where it can
> *statically* determine that accesses are on byte or larger boundaries,
> it does *not* do any bitfiddling but uses the most efficient, direct C
> calls possible.
>
> We really did spend a lot of time optimizing the bitmatch case.
>
> Rich.

But C calls are still 33% slower than direct access in ocaml (if one
doesn't use the polymorphic functions).

What would be great would be to use whatever Bigarray uses to get the
compiler to emit direct access to the data instead of C calls. Time to
hit the source.

MfG
        Goswin



* Re: [Caml-list] Re: How to read different ints from a Bigarray?
  2009-10-29 17:07               ` Goswin von Brederlow
@ 2009-10-30 20:30                 ` Richard Jones
  2009-11-01 15:11                   ` Goswin von Brederlow
  0 siblings, 1 reply; 23+ messages in thread
From: Richard Jones @ 2009-10-30 20:30 UTC (permalink / raw)
  To: Goswin von Brederlow; +Cc: caml-list

On Thu, Oct 29, 2009 at 06:07:59PM +0100, Goswin von Brederlow wrote:
> I still can reuse a lot of this. Especially the syntax extension
> seems like a good idea. Maybe reduced to bytes instead of bits
> though. I don't intend to use such fine grained structures to need bit
> access.

Take a close look at bitstring.  In all the cases where it can
*statically* determine that accesses are on byte or larger boundaries,
it does *not* do any bitfiddling but uses the most efficient, direct C
calls possible.

We really did spend a lot of time optimizing the bitmatch case.

Rich.

-- 
Richard Jones
Red Hat



* Re: [Caml-list] Re: How to read different ints from a Bigarray?
  2009-10-29 23:43         ` Goswin von Brederlow
@ 2009-10-30  0:48           ` Gerd Stolpmann
  0 siblings, 0 replies; 23+ messages in thread
From: Gerd Stolpmann @ 2009-10-30  0:48 UTC (permalink / raw)
  To: Goswin von Brederlow; +Cc: Florian Weimer, caml-list


On Friday, 30.10.2009, at 00:43 +0100, Goswin von Brederlow wrote:
> Gerd Stolpmann <gerd@gerd-stolpmann.de> writes:
> 
> > On Thursday, 29.10.2009, at 21:40 +0100, Florian Weimer wrote:
> >> * Goswin von Brederlow:
> >> 
> >> > - The data is passed to libaio and needs to be kept alive and unmoved
> >> >   as long as libaio knows it.
> >> 
> >> It also has to be aligned to a 512-byte boundary, so you can use
> >> O_DIRECT.  Linux does not support truly asynchronous I/O without
> >> O_DIRECT AFAIK, which rarely makes it worth the trouble.
> >
> > Right. There is also the question whether aio for regular files (i.e.
> > files backed by page cache) is continued to be supported at all - it is
> > well known that Linus Torvalds doesn't like it. It can happen that at
> > some day aio will be restricted to block devices only.
> >
> > So I wouldn't use it for production code, but it is of course still an
> > interesting interface.
> >
> > Gerd
> 
> Damn. That seems so stupid. Then writing asynchronous will only be
> possible with creating a pot full of worker thread, each one writing
> one chunk. So you get all those chunks in random order submitted to
> the kernel, the kernel has to reorder them, fit them back together,
> write them and then wake up the right thread for each piece
> completed. So much extra work while libaio has all the data already in
> perfect structures for the kernel.

Well, this is exactly the implementation of the POSIX aio functions in
glibc. They are mapped to a bunch of threads.

> And how will you do barriers when writing with threads? Wait for all
> threads to complete every time you hit a barrier and thereby stalling
> the pipeline?

You can't implement barriers. When you have page-cache backed I/O (i.e.
non-direct I/O, whether aio or sync I/O) there is no control over when
data is written. Ok, there is fsync but this is very coarse-grained
control.

Gerd
-- 
------------------------------------------------------------
Gerd Stolpmann, Bad Nauheimer Str.3, 64289 Darmstadt,Germany 
gerd@gerd-stolpmann.de          http://www.gerd-stolpmann.de
Phone: +49-6151-153855                  Fax: +49-6151-997714
------------------------------------------------------------



* Re: [Caml-list] Re: How to read different ints from a Bigarray?
  2009-10-29 21:04       ` Gerd Stolpmann
@ 2009-10-29 23:43         ` Goswin von Brederlow
  2009-10-30  0:48           ` Gerd Stolpmann
  0 siblings, 1 reply; 23+ messages in thread
From: Goswin von Brederlow @ 2009-10-29 23:43 UTC (permalink / raw)
  To: Gerd Stolpmann; +Cc: Florian Weimer, Goswin von Brederlow, caml-list

Gerd Stolpmann <gerd@gerd-stolpmann.de> writes:

> On Thursday, 29.10.2009, at 21:40 +0100, Florian Weimer wrote:
>> * Goswin von Brederlow:
>> 
>> > - The data is passed to libaio and needs to be kept alive and unmoved
>> >   as long as libaio knows it.
>> 
>> It also has to be aligned to a 512-byte boundary, so you can use
>> O_DIRECT.  Linux does not support truly asynchronous I/O without
>> O_DIRECT AFAIK, which rarely makes it worth the trouble.
>
> Right. There is also the question whether aio for regular files (i.e.
> files backed by page cache) is continued to be supported at all - it is
> well known that Linus Torvalds doesn't like it. It can happen that at
> some day aio will be restricted to block devices only.
>
> So I wouldn't use it for production code, but it is of course still an
> interesting interface.
>
> Gerd

Damn. That seems so stupid. Then writing asynchronously will only be
possible by creating a pool of worker threads, each one writing
one chunk. So you get all those chunks in random order submitted to
the kernel, the kernel has to reorder them, fit them back together,
write them and then wake up the right thread for each piece
completed. So much extra work while libaio has all the data already in
perfect structures for the kernel.

And how will you do barriers when writing with threads? Wait for all
threads to complete every time you hit a barrier and thereby stalling
the pipeline?

MfG
        Goswin


* Re: [Caml-list] Re: How to read different ints from a Bigarray?
  2009-10-29 20:40     ` Florian Weimer
  2009-10-29 21:04       ` Gerd Stolpmann
@ 2009-10-29 23:38       ` Goswin von Brederlow
  1 sibling, 0 replies; 23+ messages in thread
From: Goswin von Brederlow @ 2009-10-29 23:38 UTC (permalink / raw)
  To: Florian Weimer; +Cc: caml-list

Florian Weimer <fw@deneb.enyo.de> writes:

> * Goswin von Brederlow:
>
>> - The data is passed to libaio and needs to be kept alive and unmoved
>>   as long as libaio knows it.
>
> It also has to be aligned to a 512-byte boundary, so you can use
> O_DIRECT.  Linux does not support truly asynchronous I/O without
> O_DIRECT AFAIK, which rarely makes it worth the trouble.

True. But libaio can provide an Aio.Buffer.make that returns an
aligned Bigarray (or string or whatever, currently a custom type).

If you write to files on a filesystem without O_DIRECT it will block
when submitting the requests till they have completed. Not sure what
happens on block devices without O_DIRECT.

My use case is for a Fuse Filesystem and writing to disks. O_DIRECT is
quite alright there. If you can't use O_DIRECT then you are left with
going multithreaded.

MfG
        Goswin


* Re: [Caml-list] Re: How to read different ints from a Bigarray?
  2009-10-29 18:48     ` Sylvain Le Gall
@ 2009-10-29 23:25       ` Goswin von Brederlow
  0 siblings, 0 replies; 23+ messages in thread
From: Goswin von Brederlow @ 2009-10-29 23:25 UTC (permalink / raw)
  To: Sylvain Le Gall; +Cc: caml-list

Sylvain Le Gall <sylvain@le-gall.net> writes:

> On 29-10-2009, Goswin von Brederlow <goswin-v-b@web.de> wrote:
>> Xavier Leroy <Xavier.Leroy@inria.fr> writes:
>>> Goswin von Brederlow wrote:
>>
>> Here are some benchmark results:
>>
>> get an int out of a string:
>>                 C               Ocaml
>>   uint8  le     19.496          17.433
>>    int8  le     19.298          17.850
>>   uint16 le     19.427          25.046
>>    int16 le     19.383          27.664
>>   uint16 be     20.502          23.200
>>    int16 be     20.350          27.535
>>
>> get an int out of a Bigarray.Array1.t:
>> 		safe		unsafe
>>   uint8  le	55.194s		54.508s
>>   uint64 le     80.51s		81.46s
>>
>
> Can you provide us with the corresponding code and benchmark? 
>
> Maybe you can just commit this in libaio/test/bench.ml.
>
> Regards,
> Sylvain Le Gall

As Christophe guessed the problem was polymorphic functions. If I
specify a fixed Array1 type then the compiler uses the optimized
access functions. That actually makes unsafe Bigarray access slightly
faster than unsafe string access (the compiler apparently does not
optimize int_of_char/Char.unsafe_chr away), independent of the argument
size on set; on get, allocating the int32/int64 costs time, so those
are slower.
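
To make the distinction concrete, here is a sketch of the same loop
written twice. The first version leaves the element kind as a type
variable, which is the polymorphic situation described above; the second
pins the full Array1 type. The function names are made up, and how much
the two differ in speed depends on the compiler version and its inlining,
so no timings are claimed here.

```ocaml
open Bigarray

(* Element kind left polymorphic: the element size is not known where
   unsafe_get is compiled. *)
let sum_generic (a : (int, 'k, c_layout) Array1.t) =
  let s = ref 0 in
  for i = 0 to Array1.dim a - 1 do
    s := !s + Array1.unsafe_get a i
  done;
  !s

(* Fully fixed type: kind and layout are known at the access site. *)
let sum_uint8 (a : (int, int8_unsigned_elt, c_layout) Array1.t) =
  let s = ref 0 in
  for i = 0 to Array1.dim a - 1 do
    s := !s + Array1.unsafe_get a i
  done;
  !s

let () =
  let a = Array1.create int8_unsigned c_layout 10 in
  for i = 0 to 9 do a.{i} <- i + 1 done;
  assert (sum_generic a = sum_uint8 a)
```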

So Bigarray is the fastest but getting different types out of a
Bigarray will be tricky. Unaligned access even more so, if it is
possible at all.

I have to sleep on this. Maybe in my use case I can have all
structures int64 aligned and then split the int64 up in ocaml where
structures have smaller members. A Bigarray with access functions for
any type would have been too much to hope for. Maybe some little
wrapper with Obj.magic will do *hide*.
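
The "split the int64 up in ocaml" idea could look roughly like this. It
is only a sketch: the helper names are made up, fields are numbered by
significance (0 is the least significant), so which field sits at which
byte address depends on the machine's endianness, and every read of an
int64 element may allocate a boxed value as noted above.

```ocaml
open Bigarray

type words = (int64, int64_elt, c_layout) Array1.t

(* Extract the [n]-th 16-bit field (0 = least significant) of a word. *)
let uint16_of_word (w : int64) n =
  Int64.to_int (Int64.logand (Int64.shift_right_logical w (16 * n)) 0xffffL)

(* Extract the [n]-th 32-bit field (0 = low half, 1 = high half).
   Assumes a 64-bit OCaml so the result fits in a native int. *)
let uint32_of_word (w : int64) n =
  Int64.to_int
    (Int64.logand (Int64.shift_right_logical w (32 * n)) 0xffff_ffffL)

let () =
  let a : words = Array1.create int64 c_layout 1 in
  a.{0} <- 0x1122_3344_5566_7788L;
  assert (uint16_of_word a.{0} 0 = 0x7788);
  assert (uint16_of_word a.{0} 3 = 0x1122);
  assert (uint32_of_word a.{0} 1 = 0x11223344)
```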

As for libaio it should be easy to make it create and use any Bigarray
type.

MfG
        Goswin


* Re: [Caml-list] Re: How to read different ints from a Bigarray?
  2009-10-29 20:40     ` Florian Weimer
@ 2009-10-29 21:04       ` Gerd Stolpmann
  2009-10-29 23:43         ` Goswin von Brederlow
  2009-10-29 23:38       ` Goswin von Brederlow
  1 sibling, 1 reply; 23+ messages in thread
From: Gerd Stolpmann @ 2009-10-29 21:04 UTC (permalink / raw)
  To: Florian Weimer; +Cc: Goswin von Brederlow, caml-list


On Thursday, 29.10.2009, at 21:40 +0100, Florian Weimer wrote:
> * Goswin von Brederlow:
> 
> > - The data is passed to libaio and needs to be kept alive and unmoved
> >   as long as libaio knows it.
> 
> It also has to be aligned to a 512-byte boundary, so you can use
> O_DIRECT.  Linux does not support truly asynchronous I/O without
> O_DIRECT AFAIK, which rarely makes it worth the trouble.

Right. There is also the question whether aio for regular files (i.e.
files backed by page cache) will continue to be supported at all - it is
well known that Linus Torvalds doesn't like it. It may happen that
some day aio will be restricted to block devices only.

So I wouldn't use it for production code, but it is of course still an
interesting interface.

Gerd
-- 
------------------------------------------------------------
Gerd Stolpmann, Bad Nauheimer Str.3, 64289 Darmstadt,Germany 
gerd@gerd-stolpmann.de          http://www.gerd-stolpmann.de
Phone: +49-6151-153855                  Fax: +49-6151-997714
------------------------------------------------------------



* Re: [Caml-list] Re: How to read different ints from a Bigarray?
  2009-10-28 15:00   ` [Caml-list] " Goswin von Brederlow
  2009-10-28 15:17     ` Sylvain Le Gall
@ 2009-10-29 20:40     ` Florian Weimer
  2009-10-29 21:04       ` Gerd Stolpmann
  2009-10-29 23:38       ` Goswin von Brederlow
  1 sibling, 2 replies; 23+ messages in thread
From: Florian Weimer @ 2009-10-29 20:40 UTC (permalink / raw)
  To: Goswin von Brederlow; +Cc: caml-list

* Goswin von Brederlow:

> - The data is passed to libaio and needs to be kept alive and unmoved
>   as long as libaio knows it.

It also has to be aligned to a 512-byte boundary, so you can use
O_DIRECT.  Linux does not support truly asynchronous I/O without
O_DIRECT AFAIK, which rarely makes it worth the trouble.
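
With O_DIRECT the file offset and the transfer length generally have to
respect the same boundary as the buffer address. The helpers below are a
sketch of that arithmetic only: the names are made up, the block size of
512 is taken from the sentence above (real devices may need more), and
the buffer's address cannot be checked from pure OCaml, so that part
still belongs in the C stub that allocates the buffer.

```ocaml
(* Block size from the discussion above; must be a power of two. *)
let block = 512

(* True if [n] is a multiple of the block size. *)
let is_aligned n = n land (block - 1) = 0

(* Round [n] up or down to the nearest multiple of the block size. *)
let align_up n = (n + block - 1) land (lnot (block - 1))
let align_down n = n land (lnot (block - 1))

let () =
  assert (is_aligned 1024);
  assert (not (is_aligned 513));
  assert (align_up 513 = 1024);
  assert (align_down 1023 = 512)
```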



* Re: [Caml-list] Re: How to read different ints from a Bigarray?
  2009-10-29 12:20             ` Richard Jones
@ 2009-10-29 17:07               ` Goswin von Brederlow
  2009-10-30 20:30                 ` Richard Jones
  0 siblings, 1 reply; 23+ messages in thread
From: Goswin von Brederlow @ 2009-10-29 17:07 UTC (permalink / raw)
  To: caml-list

Richard Jones <rich@annexia.org> writes:

> On Thu, Oct 29, 2009 at 10:50:31AM +0100, Goswin von Brederlow wrote:
>> but no
>> 
>> let unparse_foo (x, y) =
>>   bitmake { x : 16 : littleendian; y : 16 : littleendian } x y
>
> See:
>
> http://et.redhat.com/~rjones/bitstring/html/Bitstring.html#2_Constructingbitstrings
>
> I don't necessarily think bitstring is suitable here though because
> you still need to read your data into a string (or fake a string on
> the C heap as Olivier Andrieu mentioned).  I think in this case you'd
> be better off just writing this part of the code in C.
>
> Rich.

I still can reuse a lot of this. Especially the syntax extension
seems like a good idea. Maybe reduced to bytes instead of bits
though. My structures are not fine-grained enough to need bit
access.

MfG
        Goswin



* Re: [Caml-list] Re: How to read different ints from a Bigarray?
  2009-10-29  9:50           ` Goswin von Brederlow
  2009-10-29 10:34             ` Goswin von Brederlow
@ 2009-10-29 12:20             ` Richard Jones
  2009-10-29 17:07               ` Goswin von Brederlow
  1 sibling, 1 reply; 23+ messages in thread
From: Richard Jones @ 2009-10-29 12:20 UTC (permalink / raw)
  To: Goswin von Brederlow; +Cc: blue storm, Sylvain Le Gall, caml-list

On Thu, Oct 29, 2009 at 10:50:31AM +0100, Goswin von Brederlow wrote:
> but no
> 
> let unparse_foo (x, y) =
>   bitmake { x : 16 : littleendian; y : 16 : littleendian } x y

See:

http://et.redhat.com/~rjones/bitstring/html/Bitstring.html#2_Constructingbitstrings

I don't necessarily think bitstring is suitable here though because
you still need to read your data into a string (or fake a string on
the C heap as Olivier Andrieu mentioned).  I think in this case you'd
be better off just writing this part of the code in C.

Rich.

-- 
Richard Jones
Red Hat



* Re: [Caml-list] Re: How to read different ints from a Bigarray?
  2009-10-29  9:50           ` Goswin von Brederlow
@ 2009-10-29 10:34             ` Goswin von Brederlow
  2009-10-29 12:20             ` Richard Jones
  1 sibling, 0 replies; 23+ messages in thread
From: Goswin von Brederlow @ 2009-10-29 10:34 UTC (permalink / raw)
  To: Goswin von Brederlow; +Cc: blue storm, Sylvain Le Gall, caml-list

Goswin von Brederlow <goswin-v-b@web.de> writes:

> blue storm <bluestorm.dylc@gmail.com> writes:
>
>> On Wed, Oct 28, 2009 at 6:57 PM, Goswin von Brederlow <goswin-v-b@web.de> wrote:
>>> Maybe ideal would be a format string based interface that calls C with
>>> a format string and a record of values. Because what I really need is
>>> to read/write records in an architecture independent way. Something
>>> like
>>>
>>> type t = { x:int; y:char; z:int64 }
>>> let t_format = "%2u%c%8d"
>>>
>>> put_formated buf t_format t
>>>
>>> But how to get that type safe? Maybe a camlp4 module that generates
>>> the format string and type from a single declaration so they always
>>> match.
>>
>> It's possibly off-topic, but you might be interested in Richard
>> Jones's Bitstring project [1] which deals with similar issues quite
>> nicely in my opinion.
>>
>> [1] http://code.google.com/p/bitstring/
>
> No, quite on-topic.
>
> I glanced at the examples and code and it looks to me though as if
> this can only parse bitstrings but not create them from a pattern.
> You have
>
> let parse_foo bits =
>   bitmatch bits with
>   | { x : 16 : littleendian; y : 16 : littleendian } -> fun x y -> (x, y)
>
> but no
>
> let unparse_foo (x, y) =
>   bitmake { x : 16 : littleendian; y : 16 : littleendian } x y
>
>
> Ideally it would be something along the lines of
>
> let pattern = make_pattern { x : 16 : littleendian; y : 16 : littleendian }
> let parse_foo bits = parse pattern (fun x y -> (x, y))
> let unparse_foo (x, y) = unparse pattern x y
>
> But I know how to do that with CPS already. I just need the primitives
> to get/set the basic types.
>
> MfG
>         Goswin

And I was wrong. There is

http://code.google.com/p/bitstring/source/browse/trunk/examples/make_ipv4_header.ml

as an example. Not ideal since parsing and unparsing will duplicate
the pattern definition but that will be local to each type.

MfG
        Goswin



* Re: [Caml-list] Re: How to read different ints from a Bigarray?
  2009-10-28 22:48         ` blue storm
@ 2009-10-29  9:50           ` Goswin von Brederlow
  2009-10-29 10:34             ` Goswin von Brederlow
  2009-10-29 12:20             ` Richard Jones
  0 siblings, 2 replies; 23+ messages in thread
From: Goswin von Brederlow @ 2009-10-29  9:50 UTC (permalink / raw)
  To: blue storm; +Cc: Goswin von Brederlow, Sylvain Le Gall, caml-list

blue storm <bluestorm.dylc@gmail.com> writes:

> On Wed, Oct 28, 2009 at 6:57 PM, Goswin von Brederlow <goswin-v-b@web.de> wrote:
>> Maybe ideal would be a format string based interface that calls C with
>> a format string and a record of values. Because what I really need is
>> to read/write records in an architecture independent way. Something
>> like
>>
>> type t = { x:int; y:char; z:int64 }
>> let t_format = "%2u%c%8d"
>>
>> put_formated buf t_format t
>>
>> But how to get that type safe? Maybe a camlp4 module that generates
>> the format string and type from a single declaration so they always
>> match.
>
> It's possibly off-topic, but you might be interested in Richard
> Jones's Bitstring project [1] which deals with similar issues quite
> nicely in my opinion.
>
> [1] http://code.google.com/p/bitstring/

No, quite on-topic.

I glanced at the examples and code and it looks to me as though
this can only parse bitstrings but not create them from a pattern.
You have

let parse_foo bits =
  bitmatch bits with
  | { x : 16 : littleendian; y : 16 : littleendian } -> fun x y -> (x, y)

but no

let unparse_foo (x, y) =
  bitmake { x : 16 : littleendian; y : 16 : littleendian } x y


Ideally it would be something along the lines of

let pattern = make_pattern { x : 16 : littleendian; y : 16 : littleendian }
let parse_foo bits = parse pattern (fun x y -> (x, y))
let unparse_foo (x, y) = unparse pattern x y

But I know how to do that with CPS already. I just need the primitives
to get/set the basic types.
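
A single pattern value that drives both directions can be sketched
without camlp4. This is not the CPS version mentioned above and not
Bitstring's API: it is a simpler pairing combinator over Bytes, with all
names (codec, uint16_le, pair, pattern) made up for illustration. It
relies on Bytes.get_uint16_le and Bytes.set_uint16_le, which need OCaml
4.08 or later.

```ocaml
(* A codec bundles the size of a field with its reader and its writer. *)
type 'a codec = {
  size : int;
  get : Bytes.t -> int -> 'a;
  set : Bytes.t -> int -> 'a -> unit;
}

let uint16_le =
  { size = 2; get = Bytes.get_uint16_le; set = Bytes.set_uint16_le }

(* Lay out two fields one after the other. *)
let pair a b = {
  size = a.size + b.size;
  get = (fun buf off -> (a.get buf off, b.get buf (off + a.size)));
  set = (fun buf off (x, y) -> a.set buf off x; b.set buf (off + a.size) y);
}

(* One definition of the layout, used for both parsing and unparsing. *)
let pattern = pair uint16_le uint16_le

let parse_foo buf = pattern.get buf 0

let unparse_foo v =
  let buf = Bytes.create pattern.size in
  pattern.set buf 0 v;
  buf

let () = assert (parse_foo (unparse_foo (0x1234, 0xabcd)) = (0x1234, 0xabcd))
```

Nesting pair gives longer records as nested tuples; the CPS formulation
would let parse take a curried "fun x y -> ..." instead.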

MfG
        Goswin



* Re: [Caml-list] Re: How to read different ints from a Bigarray?
  2009-10-28 17:57       ` [Caml-list] " Goswin von Brederlow
  2009-10-28 18:19         ` Sylvain Le Gall
@ 2009-10-28 22:48         ` blue storm
  2009-10-29  9:50           ` Goswin von Brederlow
  1 sibling, 1 reply; 23+ messages in thread
From: blue storm @ 2009-10-28 22:48 UTC (permalink / raw)
  To: Goswin von Brederlow; +Cc: Sylvain Le Gall, caml-list

On Wed, Oct 28, 2009 at 6:57 PM, Goswin von Brederlow <goswin-v-b@web.de> wrote:
> Maybe ideal would be a format string based interface that calls C with
> a format string and a record of values. Because what I really need is
> to read/write records in an architecture-independent way. Something
> like
>
> type t = { x:int; y:char; z:int64 }
> let t_format = "%2u%c%8d"
>
> put_formated buf t_format t
>
> But how to get that type safe? Maybe a camlp4 module that generates
> the format string and type from a single declaration so they always
> match.

It's possibly off-topic, but you might be interested in Richard
Jones's Bitstring project [1] which deals with similar issues quite
nicely in my opinion.

[1] http://code.google.com/p/bitstring/



* Re: [Caml-list] Re: How to read different ints from a Bigarray?
  2009-10-28 18:19         ` Sylvain Le Gall
@ 2009-10-28 21:05           ` Goswin von Brederlow
  0 siblings, 0 replies; 23+ messages in thread
From: Goswin von Brederlow @ 2009-10-28 21:05 UTC (permalink / raw)
  To: Sylvain Le Gall; +Cc: caml-list

Sylvain Le Gall <sylvain@le-gall.net> writes:

> On 28-10-2009, Goswin von Brederlow <goswin-v-b@web.de> wrote:
>> Sylvain Le Gall <sylvain@le-gall.net> writes:
>>> On 28-10-2009, Goswin von Brederlow <goswin-v-b@web.de> wrote:
>>>> Sylvain Le Gall <sylvain@le-gall.net> writes:
>>>>> On 28-10-2009, Goswin von Brederlow <goswin-v-b@web.de> wrote:
>>
>>>> PS: Is a.{i} <- x a C call?
>>>
>>> Yes.
>>
>> That obviously sucks. I was hoping that, since the compiler has special
>> syntax for it, it would be built in. Bigarray being a separate module
>> should have clued me in.
>>
>> That obviously speaks against splitting int64 into 8 bytes and calling
>> a.{i} <- x for each.
>>
>> I think I will implement your method and C stubs for every set/get and
>> compare.
>
> This is in fact only the case for int64 arrays (I really have tested
> this, and you don't need a C call in most cases).

Can I assume you tested on a 32-bit CPU?

MfG
        Goswin



* Re: [Caml-list] Re: How to read different ints from a Bigarray?
  2009-10-28 15:17     ` Sylvain Le Gall
@ 2009-10-28 17:57       ` Goswin von Brederlow
  2009-10-28 18:19         ` Sylvain Le Gall
  2009-10-28 22:48         ` blue storm
  0 siblings, 2 replies; 23+ messages in thread
From: Goswin von Brederlow @ 2009-10-28 17:57 UTC (permalink / raw)
  To: Sylvain Le Gall; +Cc: caml-list

Sylvain Le Gall <sylvain@le-gall.net> writes:

> On 28-10-2009, Goswin von Brederlow <goswin-v-b@web.de> wrote:
>> Sylvain Le Gall <sylvain@le-gall.net> writes:
>>
>>> Hello,
>>>
>>> On 28-10-2009, Goswin von Brederlow <goswin-v-b@web.de> wrote:
>>>> Hi,
>>>>
>>>
>>> Well, we talk about this a little bit, but here is my opinion:
>>> - calling a C function to add a single int will generate a big overhead
>>> - OCaml string are quite fast to modify values
>>>
>>> So to my mind the best option is to have a buffer string (say 16/32
>>> char) where you put data inside and flush it in a single C call to
>>> Bigarray. 
>>>
>>> E.g.:
>>> let append_char t c =
>>>   if t.idx >= 64 then
>>>     (
>>>       flush t.bigarray t.buffer;
>>>       t.idx <- 0
>>>     );
>>>   (* t.buffer is a mutable byte buffer (Bytes.t) *)
>>>   Bytes.set t.buffer t.idx c;
>>>   t.idx <- t.idx + 1
>>>
>>> let append_little_uint16 t i =
>>>   (* little-endian: low byte first *)
>>>   append_char t (Char.chr (i land 0xFF));
>>>   append_char t (Char.chr ((i lsr 8) land 0xFF))
>>>   
>>>
>>> I have used this kind of technique and it seems as fast as C, and a lot
>>> less C coding.
>>>
>>> Regards,
>>> Sylvain Le Gall
>>
>> This won't work so nicely:
>>
>> - Writes are not always in sequence. I want to do stream access
>>   too, where this could be very effective. But the plain buffer is
>>   more for random / known-offset access. At a minimum you would have
>>   holes for alignment.
>>
>> - It makes read/write buffers complicated, as you need to flush or peek
>>   the string in case of uncommitted changes. I can't do write-only
>>   buffers, as I want to be able to write a buffer and then add a
>>   checksum to it in my application. The lib should not block that.
>>
>
> I was thinking of a pure stream. It still works with random access, but
> you don't save many C function calls. You just have to write less C
> code.

set_uint8 buf 5 1   -> read 64 bytes from the stream, skip to 5, set byte
set_uint8 buf 100 1 -> write 64 bytes, read another 64 bytes, set byte

That can become really expensive.

>> I also still wonder how bad a C function call really is. Consider the
>> case of writing an int64.
>>
>> Directly: You get one C call that does range check, endian convert and
>> write in one go.
>>
>> Buffered: With your code you have 7 Int64 shifts, 8 Int64 lands, 8
>> conversions to int, at least one index check (more likely 8 to avoid
>> handling unaligned access) and 1/8 C call to blit the 64 byte buffer
>> string into the Bigarray.
>
> Not at all, you begin by breaking your int64 into 3 ints (24 bits * 2 +
> 16 bits) and then do 7 int shifts and 8 int lands.
>
> You can even manage to break it into only 1 or 2 ints.
>
> And of course, you bypass the index check.

fun with unaligned writes.

>> PS: Is a.{i} <- x a C call?
>
> Yes.

That obviously sucks. I was hoping that, since the compiler has special
syntax for it, it would be built in. Bigarray being a separate module
should have clued me in.

That obviously speaks against splitting int64 into 8 bytes and calling
a.{i} <- x for each.

I think I will implement your method and C stubs for every set/get and
compare.

Maybe the ideal would be a format-string-based interface that calls C with
a format string and a record of values, because what I really need is
to read/write records in an architecture-independent way. Something
like

type t = { x:int; y:char; z:int64 }
let t_format = "%2u%c%8d"

put_formated buf t_format t

But how to make that type-safe? Maybe a camlp4 module that generates
the format string and type from a single declaration so they always
match.

> Regards,
> Sylvain Le Gall

MfG
        Goswin



* Re: [Caml-list] Re: How to read different ints from a Bigarray?
  2009-10-28 14:16 ` Sylvain Le Gall
@ 2009-10-28 15:00   ` Goswin von Brederlow
  2009-10-28 15:17     ` Sylvain Le Gall
  2009-10-29 20:40     ` Florian Weimer
  0 siblings, 2 replies; 23+ messages in thread
From: Goswin von Brederlow @ 2009-10-28 15:00 UTC (permalink / raw)
  To: caml-list

Sylvain Le Gall <sylvain@le-gall.net> writes:

> Hello,
>
> On 28-10-2009, Goswin von Brederlow <goswin-v-b@web.de> wrote:
>> Hi,
>>
>> I'm working on bindings for the Linux libaio library (asynchronous IO) with
>> a sharp eye on efficiency. That means no copying must be done on the
>> data, which in turn means I can not use string as buffer type.
>>
>> The best type for this seems to be a (int, int8_unsigned_elt,
>> c_layout) Bigarray.Array1.t. So far so good.
>>
>> Now I define helper functions:
>>
>> let get_uint8 buf off = buf.{off}
>> let set_uint8 buf off x = buf.{off} <- x
>>
>> But I want more:
>>
>> get/set_int8 - do I use Obj.magic to "convert" to int8_signed_elt?
>>
>> And endian correcting access for larger ints:
>>
>> get/set_big_uint16
>> get/set_big_int16
>> get/set_little_uint16
>> get/set_little_int16
>> get/set_big_uint24
>> ...
>> get/set_little_int56
>> get/set_big_int64
>> get/set_little_int64
>>
>> What is the best way there? For uintXX I can get_uint8 each byte and
>> shift and add them together. But that feels inefficient as each access
>> will range check and the shifting generates a lot of code while cpus
>> can usually endian-correct an int more elegantly.
>>
>> Is it worth the overhead of calling a C function to write optimized
>> stubs for this?
>>
>> And last:
>>
>> get/set_string, blit_from/to_string
>>
>> Do I create a string where needed and then loop over every char
>> calling s.(i) <- char_of_int buf.{off+i}? Or better a C function using
>> memcpy?
>>
>> What do you think?
>>
>
> Well, we talk about this a little bit, but here is my opinion:
> - calling a C function to add a single int will generate a big overhead
> - OCaml string are quite fast to modify values
>
> So to my mind the best option is to have a buffer string (say 16/32
> char) where you put data inside and flush it in a single C call to
> Bigarray. 
>
> E.g.:
> let append_char t c =
>   if t.idx >= 64 then
>     (
>       flush t.bigarray t.buffer;
>       t.idx <- 0
>     );
>   (* t.buffer is a mutable byte buffer (Bytes.t) *)
>   Bytes.set t.buffer t.idx c;
>   t.idx <- t.idx + 1
>
> let append_little_uint16 t i =
>   (* little-endian: low byte first *)
>   append_char t (Char.chr (i land 0xFF));
>   append_char t (Char.chr ((i lsr 8) land 0xFF))
>   
>
> I have used this kind of technique and it seems as fast as C, and a lot
> less C coding.
>
> Regards,
> Sylvain Le Gall

This won't work so nicely:

- Writes are not always in sequence. I want to do stream access
  too, where this could be very effective. But the plain buffer is
  more for random / known-offset access. At a minimum you would have
  holes for alignment.

- It makes read/write buffers complicated, as you need to flush or peek
  the string in case of uncommitted changes. I can't do write-only
  buffers, as I want to be able to write a buffer and then add a
  checksum to it in my application. The lib should not block that.

- The data is passed to libaio and needs to be kept alive and unmoved
  as long as libaio knows it. I was hoping I could use the pointer to
  the data to register/unregister GC roots without having to add
  another custom header and indirections.
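
For reference, the buffered-writer idea under discussion can be
sketched as below. This is a minimal sketch, not code from the thread:
the record type writer, its field names and make are assumptions, and
flush is written in plain OCaml here where the proposal would use a
single C call. It assumes a recent OCaml in which Bigarray is part of
the standard library.

```ocaml
type buffer =
  (int, Bigarray.int8_unsigned_elt, Bigarray.c_layout) Bigarray.Array1.t

type writer = {
  bigarray : buffer;     (* final destination *)
  staging : Bytes.t;     (* small OCaml-side staging area *)
  mutable idx : int;     (* number of bytes currently staged *)
  mutable pos : int;     (* next write position in bigarray *)
}

let make bigarray =
  { bigarray; staging = Bytes.create 64; idx = 0; pos = 0 }

(* Copy the staged bytes into the Bigarray and reset the staging area. *)
let flush t =
  for i = 0 to t.idx - 1 do
    t.bigarray.{t.pos + i} <- Char.code (Bytes.get t.staging i)
  done;
  t.pos <- t.pos + t.idx;
  t.idx <- 0

(* Stage one byte, flushing first if the staging area is full. *)
let append_char t c =
  if t.idx >= Bytes.length t.staging then flush t;
  Bytes.set t.staging t.idx c;
  t.idx <- t.idx + 1

(* Little-endian: low byte first. *)
let append_little_uint16 t i =
  append_char t (Char.chr (i land 0xFF));
  append_char t (Char.chr ((i lsr 8) land 0xFF))
```

The second objection shows up directly: until flush is called, the
staged bytes are not visible in the Bigarray, so any read of the buffer
has to flush or peek into the staging area first.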


I also still wonder how bad a C function call really is. Consider the
case of writing an int64.

Directly: You get one C call that does range check, endian convert and
write in one go.

Buffered: With your code you have 7 Int64 shifts, 8 Int64 lands, 8
conversions to int, at least one index check (more likely 8 to avoid
handling unaligned access) and 1/8 C call to blit the 64 byte buffer
string into the Bigarray.
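
To make the per-byte cost concrete, here is what splitting an int64
into bytes looks like when written directly against the Bigarray in
OCaml. This is a minimal sketch, assuming a recent OCaml in which
Bigarray is part of the standard library; the function names follow the
get/set naming scheme from the original post, but the bodies are only
illustrative and not tuned for speed.

```ocaml
open Bigarray

type buffer = (int, int8_unsigned_elt, c_layout) Array1.t

(* Write x at byte offset off, least significant byte first.
   Each iteration costs one Int64 shift, one Int64 land, one
   conversion to int and one bounds-checked store. *)
let set_little_int64 (buf : buffer) off (x : int64) =
  for i = 0 to 7 do
    let shifted = Int64.shift_right_logical x (8 * i) in
    buf.{off + i} <- Int64.to_int (Int64.logand shifted 0xFFL)
  done

(* Read 8 bytes starting at off, least significant byte first. *)
let get_little_int64 (buf : buffer) off =
  let r = ref 0L in
  for i = 7 downto 0 do
    r := Int64.logor (Int64.shift_left !r 8) (Int64.of_int buf.{off + i})
  done;
  !r
```

Nothing here depends on alignment, so off may be any in-range byte
offset; the price is eight separate element accesses per value.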

MfG
        Goswin

PS: Is a.{i} <- x a C call?



Thread overview: 23+ messages
2009-11-03 17:16 [Caml-list] Re: How to read different ints from a Bigarray? Charles Forsyth
  -- strict thread matches above, loose matches on Subject: below --
2009-10-28 13:54 Goswin von Brederlow
2009-10-28 14:16 ` Sylvain Le Gall
2009-10-28 15:00   ` [Caml-list] " Goswin von Brederlow
2009-10-28 15:17     ` Sylvain Le Gall
2009-10-28 17:57       ` [Caml-list] " Goswin von Brederlow
2009-10-28 18:19         ` Sylvain Le Gall
2009-10-28 21:05           ` [Caml-list] " Goswin von Brederlow
2009-10-28 22:48         ` blue storm
2009-10-29  9:50           ` Goswin von Brederlow
2009-10-29 10:34             ` Goswin von Brederlow
2009-10-29 12:20             ` Richard Jones
2009-10-29 17:07               ` Goswin von Brederlow
2009-10-30 20:30                 ` Richard Jones
2009-11-01 15:11                   ` Goswin von Brederlow
2009-11-01 19:57                     ` Richard Jones
2009-11-02 16:11                       ` Goswin von Brederlow
2009-11-02 16:33                         ` Mauricio Fernandez
2009-11-02 20:27                           ` Richard Jones
2009-11-03 13:18                             ` Goswin von Brederlow
2009-11-02 20:48                           ` Goswin von Brederlow
2009-10-29 20:40     ` Florian Weimer
2009-10-29 21:04       ` Gerd Stolpmann
2009-10-29 23:43         ` Goswin von Brederlow
2009-10-30  0:48           ` Gerd Stolpmann
2009-10-29 23:38       ` Goswin von Brederlow
2009-10-28 17:09 ` [Caml-list] " Xavier Leroy
2009-10-29 17:05   ` Goswin von Brederlow
2009-10-29 18:48     ` Sylvain Le Gall
2009-10-29 23:25       ` [Caml-list] " Goswin von Brederlow
