* [Caml-list] c is 4 times faster than ocaml?
From: effbiae @ 2004-08-04 2:39 UTC
To: caml-list

hello,

my first post to the list. not intended to be inflammatory or to
generate ill feeling in any way. :)

i am evaluating languages for implementing a fast dbms. i would like to
use a 'higher level' language without resorting to portable assembler.
ocaml looks really nice, and it drew my attention in doug's language
shootout:
  http://www.bagley.org/~doug/shootout/craps.shtml
and i have noticed that it is used to win programming contests -- indeed
the language for a discriminating hacker!

it was with great hope that i started on my first benchmark -- testing
what all fast dbmses use: mmap. after a bit of searching, i found that
Bigarray was the way to go (short of writing my own C extension).

the benchmark sources in c and ocaml are appended, along with the
Makefile. in summary, on my Mandrake 10 PIII 500 system, i get these
timings:

$ time -p ./cbs 26     (* the C version *)
real 1.06
user 0.54
sys 0.51
$ time -p ./ocbs 26    (* the O'Caml version *)
real 2.95
user 2.39
sys 0.51

the real time can vary a bit due to different states of cache, but user
and sys remain fairly constant. the real time is not significant for my
purposes because the dbms will not be IO bound for most of its queries.

so there you have it! i would really like to be able to optimise the
ocaml benchmark to be within 10% of C.

i have read a post by John Prevost "mmap for O'Caml" in which he implies
he wrote mmap primitives but not using the O'Caml-C interface. what does
he mean?

i assume Bigarray is written in the fastest possible way -- or is there
a faster way?

also note that i'll need msync, so i will need to extend O'Caml in some
way regardless (unless there's some library out there for mmap that i
haven't discovered).

any help greatly appreciated,

jack.

$ cat Makefile
oc:
	ocamlopt -unsafe -inline 2 bigarray.cmxa unix.cmxa -o ocbs bs.ml
c:
	gcc -O3 -o cbs bs.c

$ cat bs.ml
let f x y z = x + y + z;;
let g x = function y -> function z -> f x y z;;
let h x = let k=1 in function y -> f x y k;;
let mapit = let k=(-1) in function ty -> function fd ->
  Bigarray.Array1.map_file fd ty Bigarray.c_layout true k;;
let maprwbs=mapit Bigarray.int8_unsigned;;
if Array.length Sys.argv = 2 then begin
  let p=int_of_string Sys.argv.(1) and fn=Sys.argv.(0) ^ ".bs" in
  let fd=Unix.openfile fn [Unix.O_RDWR;Unix.O_CREAT;Unix.O_TRUNC] 0o640
  and n=1 lsl p in
  let _=Unix.lseek fd (n-1) Unix.SEEK_SET
  and _=Unix.write fd "\000" 0 1
  and _=assert (Unix.lseek fd 0 Unix.SEEK_END == n)
  and ar=mapit Bigarray.int8_unsigned fd in
  let _=for i=0 to n-1 do ar.{i} <- i done
  and odds=ref 0 in
  for i=0 to n-1 do if ar.{i} land 1 = 1 then odds:=!odds+1 done
end else begin
  print_endline "Usage: bs <power-of-2>"
end;;

$ cat bs.c
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>   /* lseek, write */
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#define CHKact(x,act) do \
 if(!(x)){fprintf(stderr,"!CHK (%s:%d)\n",__FILE__,__LINE__);act;} while(0)
#define CHK(x) CHKact(x,return -1)
#define CHKp(x) CHKact(x,perror(0);return -1)
main(int argc,char**argv)
{if(argc==2)
 {char fn[1024];CHK(sprintf(fn,"%s.bs",argv[0]));int p=atoi(argv[1]);
  int fd;CHKp(-1!=(fd=open(fn,O_RDWR|O_CREAT|O_TRUNC,S_IRUSR|S_IWUSR)));
  int n=1<<p;lseek(fd,n-1,SEEK_SET);int zero=0;CHKp(write(fd,&zero,1)==1);
  CHKp(lseek(fd,0,SEEK_END)==n);unsigned char*ar;
  CHKp(-1!=(int)(ar=mmap(0,n,PROT_READ|PROT_WRITE,MAP_FILE|MAP_SHARED,fd,0)));
  int i;for(i=0;i<n;++i)ar[i]=i;
  int odds=0;for(i=0;i<n;++i)if(ar[i]&1)odds++;
  CHKp(!munmap(ar,n));
 }else fprintf(stderr, "Usage: %s <power-of-2>\n",argv[0]);
}

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners

* Re: [Caml-list] c is 4 times faster than ocaml?
From: John Prevost @ 2004-08-04 4:59 UTC
To: effbiae; +Cc: caml-list

Just from a first look, I'd say that there are two likely reasons that
this artificial (and incredibly hard to read) benchmark program
performs poorly:

First, even with -unsafe, bounds checking is performed on Bigarray
types.

Second, working with single byte values is likely painful
here--O'Caml always works with word-aligned values, so it's going to
lose bigtime. gcc, on the other hand, knows that the crazy intel
instruction set can handle non-word-aligned values.

Here's the main setting loop from gcc:

.L84:
        movb    %al, (%eax,%ecx)
        incl    %eax
        cmpl    %edx, %eax
        jl      .L84

And here it is from O'Caml:

.L109:
        movl    %esi, %ecx         ;; grab our index into %ecx
        sarl    $1, %ecx           ;; shift off the tag bit
        movl    20(%eax), %edx     ;; get the array's length into %edx
        cmpl    %ecx, %edx         ;; compare the two
        jbe     .L111              ;; if the index is too high, punt
        movl    4(%eax), %edi      ;; ? probably figure which byte in word
        movl    %esi, %edx         ;; load the loop value into %edx
        sarl    $1, %edx           ;; shift off the low bit
        movb    %dl, (%edi, %ecx)  ;; shove %edx's byte into the word
        movl    %esi, %ecx         ;; store back into array
        addl    $2, %esi           ;; add 1 to index
        cmpl    %ebx, %ecx         ;; compare to target
        jne     .L109              ;; not equal? loop

That jbe .L111 is what happens if a bounds check fails, by the way!

Anyway, you can see that the bounds check takes a bunch of
instructions. The main loop is also a bit more expensive. One thing
going on is those "sarl" instructions, which are shifting out the tag
bit on the right end of O'Caml integers.

If you were working on integers instead, I think it might be less
painful. Especially if you could use int32s held in registers to index
into things.

Anyway, the main two things slowing stuff down here are the bounds
check and the fact that O'Caml needs to do so much work turning caml
integers into c integers.

(Just as a note, I accidentally tweaked your file to make the loops not
know the type of their arguments at one point while looking for this
loop--you *always* want exact types known at a deep level for this
kind of thing, as that made ocamlopt use C calls to access the array.)

Oh--and ignore my old ramblings on mmap stuff. That code was bad then,
and is worse now. :)

As for your project, I suspect we could provide better suggestions on
how to optimize if we were looking at real code. My suspicion is that
you might want to write one or two low-level routines in C, rather than
using Bigarrays for this task. (Just assuming, though--from the sound
of it you're going to have larger structured data in the mmap'd areas.)

Good luck!
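
The `sarl $1` and `addl $2` instructions in the listing above follow from the way an O'Caml integer n is stored as the machine word 2n+1. A minimal sketch of that arithmetic in C follows; `tag_int`, `untag_int` and `tagged_succ` are hypothetical names chosen for this note, not the runtime's own macros.

```c
#include <stdint.h>

/* Hypothetical helpers illustrating the tagged-integer encoding:
   the integer n is stored as the machine word 2n+1. */
static intptr_t tag_int(intptr_t n) {
    /* shift in unsigned arithmetic so negative n is well defined */
    return (intptr_t)(((uintptr_t)n << 1) | 1u);
}

static intptr_t untag_int(intptr_t v) {
    /* plays the role of `sarl $1`; assumes >> on a signed value is an
       arithmetic shift, which is what mainstream compilers implement */
    return v >> 1;
}

/* Adding 1 to the O'Caml integer means adding 2 to the tagged word,
   which is why the loop counter advances with `addl $2`. */
static intptr_t tagged_succ(intptr_t v) {
    return v + 2;
}
```

Every time a tagged value is used as a raw index or stored as a raw byte it has to pass through something like `untag_int` first, which is the extra work the reply points at.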

* Re: [Caml-list] c is 4 times faster than ocaml?
From: John Prevost @ 2004-08-04 5:05 UTC
To: caml-list; +Cc: caml-list

Oh, one last parting thought:

Part of why gcc is winning here is that it's actively working with the
fact that the loop variable, the index, and the value to be inserted
are the same value. O'Caml is doing extra work because it's not
linking them up (it could at the very least avoid shifting a few
registers around and avoid an extra sarl instruction if it did spot
that.)

But this is the trouble with artificial benchmarks: no real code is
simply going to be copying the loop value into the array. It's going
to be fetching the value from somewhere else, probably by doing pointer
arithmetic on the loop value and the source address, then it will do
pointer arithmetic on the loop value and the destination address. Then
it will set the result. A smart C coder will do the arithmetic ahead
of time, which means incrementing two values instead of one each time
through the loop, but wins overall.

Anyway, the short is: artificial benchmarks are bad.
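
The remark about doing the pointer arithmetic ahead of time can be sketched as two versions of a byte copy. Both functions are illustrative only (the names are made up for this note), and an optimizing compiler may well turn the first form into the second on its own.

```c
#include <stddef.h>

/* Indexed form: each iteration conceptually computes src+i and dst+i. */
static void copy_indexed(unsigned char *dst, const unsigned char *src,
                         size_t n) {
    for (size_t i = 0; i < n; ++i)
        dst[i] = src[i];
}

/* Hoisted form: compute the end address once, then bump two pointers
   per iteration instead of one index. */
static void copy_bumped(unsigned char *dst, const unsigned char *src,
                        size_t n) {
    const unsigned char *end = src + n;
    while (src != end)
        *dst++ = *src++;
}
```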

* Re: [Caml-list] c is 4 times faster than ocaml?
From: effbiae @ 2004-08-04 5:24 UTC
To: John Prevost; +Cc: caml-list

oooh - a gmail account :)

> this artificial (and incredibly hard to read) benchmark program

was the C hard to read or the O'Caml? Any style tips for my caml?

> First, even with -unsafe, bounds checking is performed on Bigarray
> types.

if i write a c extension that mmaps and msyncs then will the vector
element assignment become a call rather than a movb (or movl)? that is,
is Bigarray a 'special' c extension that ocaml knows how to optimize and
access just like C or is it a c extension that i can model my C
extension code on?

> If you were working on integers instead, I think it might be less
> painful. Especially if you could use int32s held in registers
> to index into things.

can i specify that an int32 is held in a register or does the compiler
do this?

* Re: [Caml-list] c is 4 times faster than ocaml?
From: John Prevost @ 2004-08-04 7:28 UTC
To: caml-list

On Wed, 4 Aug 2004 15:24:28 +1000 (EST), effbiae@ivorykite.com
<effbiae@ivorykite.com> wrote:

> was the C hard to read or the O'Caml? Any style tips for my caml?

Mmm. They were both pretty blinding. For simple Caml style, read some
code that's around. There's bad style and good style and in-between
style, but it all pretty much works.

> if i write a c extension that mmaps and msyncs then will the vector
> element assignment become a call rather than a movb (or movl)? that is,
> is Bigarray a 'special' c extension that ocaml knows how to optimize and
> access just like C or is it a c extension that i can model my C extension
> code on?

The basic idea is that you would take something that you might
otherwise do as a long sequence of calls and turn it into a single
call. For example, if you're interested in blitting strings (which are
essentially byte arrays) into a Bigarray containing bytes, you might
write a C function that checks the bounds one time, converts the O'Caml
integers to native C integers one time, and then just does the fastest
memory copy it can. This will turn into a function call, but since the
main idea is mainly just to amortize the necessary overhead across a
larger amount of data, it should be preferable.

> can i specify that an int32 is held in a register or does the compiler do
> this?

I would expect (and I may be mistaken) that if you have an int32 that
is scoped to just a given function or loop, you can expect it to go
into a register (if there are enough registers to go around.) Or, for
example, when you have a single expression (Int32.add 5l (Int32.mul x
3l)) it's not going to be allocating a box for all of those constants,
nor for an intermediate result.

When in doubt, try it and take a look at the assembly file from
ocamlopt -S to get a feel for how things work.

Note that I would generally recommend that you only go to these lengths
when you know it's going to be an issue. And only after you have actual
evidence that the system is indeed not fast enough. Your chosen
testcase has more necessary overhead than most, mainly because it's
interacting heavily with a datastructure *meant* to interoperate with
C.

On the whole, ocamlopt produces binaries that are very fast. Just
remember that it does best when you write things in the most natural
way for this language, and that learning what's natural in O'Caml will
take a little exposure.

John.
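
The amortization idea in the reply above, one bounds check for a whole range followed by a bulk copy, might look like this in plain C. This is a sketch under stated assumptions: `blit_bytes` is a hypothetical name, and a real O'Caml stub would also convert its `value` arguments using the macros from `<caml/mlvalues.h>`, which are left out here so the example compiles on its own.

```c
#include <stddef.h>
#include <string.h>

/* Copy len bytes from src[src_off..] to dst[dst_off..].
   Returns 0 on success, -1 if either range is out of bounds.
   The bounds are checked once for the whole range, not per element. */
static int blit_bytes(const unsigned char *src, size_t src_len, size_t src_off,
                      unsigned char *dst, size_t dst_len, size_t dst_off,
                      size_t len) {
    if (src_off > src_len || len > src_len - src_off) return -1;
    if (dst_off > dst_len || len > dst_len - dst_off) return -1;
    memmove(dst + dst_off, src + src_off, len); /* also safe on overlap */
    return 0;
}
```

The cost of the call and the two checks is paid once per blit, so it shrinks relative to the useful work as `len` grows.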

* [Caml-list] Re: c is 4 times faster than ocaml?
From: Jack Andrews @ 2004-08-04 8:18 UTC
To: John Prevost; +Cc: caml-list

John Prevost said:
>> was the C hard to read or the O'Caml? Any style tips for my caml?
>
> Mmm. They were both pretty blinding.

my c style is inspired by arthur whitney of kx.com. he is a genius. his
language, k, is superquick. it's an APL dialect. he's written kdb in k,
and it goes like the clappers. the most impressive thing is that k comes
in at <100Kb and kdb <50Kb. he's a genius.

> The basic idea is that you would take something that you might
> otherwise do as a long sequence of calls and turn it into a single
> call.

yeah, i'm familiar with the pattern. basically, i want to write my dbms
core in ocaml -- my only other option at the moment is c. i have to say
that looking at the -S output i am given great hope that ocaml has got
what it takes. i thought i'd never find a functional language that was
fast, but i always believed it was possible to write a fast compiler for
one! (i was brought up on miranda and prolog)

> ... but
> since the main idea is mainly just to amortize the necessary overhead
> across a larger amount of data, it should be preferable.

the only interface where such amortizing could occur is the API to the
database core, but i want to write the core in ocaml and i think it's
possible (see thread 'what is this magic?')

> Your chosen testcase has more necessary overhead than most, mainly
> because it's interacting heavily with a datastructure *meant* to
> interoperate with C.

you mean ocaml is not a suitable language for developing a dbms?

thanks,

jack

* Re: [Caml-list] Re: c is 4 times faster than ocaml?
From: Mikhail Fedotov @ 2004-08-04 10:06 UTC
To: John Prevost; +Cc: caml-list

Jack Andrews wrote:

> yeah, i'm familiar with the pattern. basically, i want to write my dbms
> core in ocaml -- my only other option at the moment is c. i have to say

Out of curiosity, why don't you want to use the existing ones with
c/ocaml mappings - sqlite, for instance?

Mikhail

* Re: [Caml-list] c is 4 times faster than ocaml?
From: Jack Andrews @ 2004-08-04 10:25 UTC
To: Mikhail Fedotov; +Cc: caml-list

Mikhail Fedotov said:
> Out of curiosity, why don't you want to use the existing ones with
> c/ocaml mappings - sqlite, for instance?

the existing ones aren't that good at storing and querying a terabyte.

* [Caml-list] custom mmap modeled on bigarray
From: Jack Andrews @ 2004-08-04 15:38 UTC
To: caml-list

sorry if i was at all obnoxious earlier -- i'm a bit manic at the
moment. with that in mind...

i've written a prototype C library for mmapping files and viewing the
map as an array of ints. i (optimistically) tried substituting my
get/set externals with the inlineable "%bigarray_ref_1" and
"%bigarray_set_1" with the expected result -- bang!

is there a way to hack this so %bigarray_ref_1 will work? do i need to
layout my custom struct in a particular way?

without further ado, here is the mm.ml program, Makefile and mm.c native
code. (i've made my C code look more conventional than my last post)

$ cat mm.ml
external init : unit -> unit = "mm_init"
let _ = init()
module Mm = struct
  type 'a t
  external create: string -> int -> 'a t = "mm_create"
  external mopen: string -> 'a t = "mm_open"
  external resize: 'a t -> int -> unit = "mm_resize"
  external sync: 'a t -> unit = "mm_sync"
  (* here's the slow get/set..... *)
  (* note that 'a is always int for test purposes *)
  external slow_get: 'a t -> int -> 'a = "mm_get_int"
  external slow_set: 'a t -> 'a -> int -> unit = "mm_set_int"
  (********************************************************)
  (* ... and here's the optimistic experiment *)
  external get: 'a t -> int -> 'a = "%bigarray_ref_1"
  external set: 'a t -> 'a -> int -> unit = "%bigarray_set_1"
  (********************************************************)
end;;
(* this program crashes after successful completion and before the
   finalizer for mm is called...? *)
let mm=Mm.create "tmp1" 1024 in
for i=0 to 200 do Mm.slow_set mm i i done;
Mm.sync mm;;
let mm=Mm.mopen "tmp1" in
print_string "expecting eleven, got ";print_int (Mm.slow_get mm 11);
print_newline();;
let mm=Mm.mopen "tmp1" and odds=ref 0 in
for i=0 to 200 do
  if (Mm.slow_get mm i) land 1 = 1 then odds:=!odds+1
done;
print_string "number of odds: ";print_int !odds;print_newline();;
let optimistic=false in
if optimistic then begin
  let mm=Mm.mopen "tmp1" and odds=ref 0 in
  for i=0 to 200 do
    if (Mm.get mm i) land 1 = 1 then odds:=!odds+1
  done;
  let mm=Mm.mopen "tmp2" in
  for i=0 to 200 do Mm.set mm i i done;
  Mm.sync mm;
end;;

$ cat mm.c
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <errno.h>
#include <caml/alloc.h>
#include <caml/custom.h>
#include <caml/fail.h>
#include <caml/intext.h>
#include <caml/memory.h>
#include <caml/mlvalues.h>
/******************************************************\
** CHK(mm) is error handler - calls failwith          **
** CHKp(mm) appends system error from strerror        **
\******************************************************/
#define CHKbase(x,y) do { if(!(int)(x))\
 {sprintf(err,"!CHK (%s:%d): %s",__FILE__,__LINE__,#x);\
  if(y)sprintf(err+strlen(err),"\n%s",strerror(errno));\
  failwith(err);\
 }} while(0)
#define CHK(x) CHKbase(x,0)
#define CHKp(x) CHKbase(x,1)
static char* err;static long errz;/* error buf and buf size */
/*page size and mask for aligning in mmap*/
static size_t page_size;static unsigned long page_mask;
/* MMP is the structure of the file: a UL header specifying length
** of array, followed by the array */
typedef struct _MMP{
 unsigned long n;/* number of bytes in ar */
 int ar[];/* the data in the array to be cast to any type */
}*MMP;
typedef struct _MM {
 void* data; /* a copy of mmp->ar for use as bigarray */
 char* filename;
 long fd; /* file descriptor */
 MMP mmp; /* the return value of mmap is stored here */
 unsigned long fz;/* the size of the file (will be multiples of page_size) */
}*MM;
/******************************************************\
** Here are the C functions that are glued to ocaml   **
\******************************************************/
MM nMM() /* malloc a MM -- only for use in C test code */
{MM mm=malloc(sizeof(struct _MM));
 memset(mm,0,sizeof(struct _MM));
 return mm;
}
int mapMM(MM mm,unsigned long n) /*used by create, open and realloc*/
{unsigned long fz=(n+sizeof(unsigned long)+(page_size-1))&page_mask;
 CHK(fz>0);
 if(fz>mm->fz)
 {unsigned long lsk,zero=0;
  CHKp((lsk=lseek(mm->fd,fz-1,SEEK_SET))==fz-1);
  CHKp(1==write(mm->fd,&zero,1));CHK(lseek(mm->fd,0,SEEK_CUR)==fz);
 }
 void* p;
 p=mmap(0,fz,PROT_READ|PROT_WRITE,MAP_FILE|MAP_SHARED,mm->fd,0);
 CHKp(p&&-1!=(long)p);
 mm->mmp=p;
 mm->mmp->n=n;
 mm->data=mm->mmp->ar;
 mm->fz=fz;
 return n;
}
/* mode in {'c':create,'o':open}*/
int iniMM(MM mm,const char* filename,char mode,unsigned long size)
{int open_mode=O_RDWR,permissions=S_IRUSR|S_IWUSR;
 mm->filename=strdup(filename);
 switch(mode)
 {case 'c':
   open_mode|=O_CREAT;
   CHKp(-1!=(mm->fd=open(filename,open_mode,permissions)));
   return mapMM(mm,size);
  case 'o':
  {FILE *fp;CHK(fp=fopen(filename,"r"));
   unsigned long this_size;
   CHK(1==fread(&this_size,sizeof(unsigned long),1,fp));
   CHK(!fclose(fp));
   CHKp(-1!=(mm->fd=open(filename,open_mode)));
   CHK(((mm->fz=lseek(mm->fd,0,SEEK_END))&page_mask)==mm->fz);
   CHK(this_size+sizeof(unsigned long)<=mm->fz);
   return mapMM(mm,this_size);/*ignore size - use size from head of file*/
 }}return -1;
}
int unMM(MM mm)
{unsigned long oldn=mm->mmp->n;CHK(mm->fz);
 CHKp(!munmap(mm->mmp,mm->fz));
 FILE *fp;CHKp(fp=fopen(mm->filename,"r"));
 unsigned long n;CHK(1==fread(&n,sizeof(unsigned long),1,fp));
 CHK(!fclose(fp));CHK(n==oldn);return 0;
}
int finalizeMM(MM mm)
{CHK(unMM(mm));
 CHK(!close(mm->fd));
 free(mm->filename);
 return 0;
}
int resizeMM(MM mm,unsigned long n)
{if(n>(mm->fz-sizeof(unsigned long)))
 {unMM(mm);
  CHK(0<mapMM(mm,n));
 }
 mm->mmp->n=n;
 return n;
}
int syncMM(MM mm)
{CHKp(!msync(mm->mmp,mm->fz,MS_SYNC));
 return 0;
}
/******************************************************\
** Here's the ocaml interface                         **
\******************************************************/
#define v2mm MM mm=(MM)Data_custom_val(vmm)
static void mm_finalize(value vmm)
{v2mm;
 printf("mm_finalize\n");
 finalizeMM(mm);
}
static struct custom_operations mm_ops = {
 "mm",
 mm_finalize,
 custom_compare_default,
 custom_hash_default,
 custom_serialize_default,
 custom_deserialize_default
};
CAMLprim value mm_init(value unit) /*must be called before use*/
{register_custom_operations(&mm_ops);
 page_size = (size_t) sysconf (_SC_PAGESIZE);
 page_mask=0;unsigned long ps=page_size;
 unsigned long pbit;for(pbit=0;!(ps&1);pbit++)ps>>=1;
 page_mask=(((unsigned long)-1)>>pbit)<<pbit;
 errz=1024;err=malloc(errz);if(!err)failwith("unable to alloc error buffer");
 return Val_unit;
}
value mm_ini(value vfn,char mode,value vsize)
{value vmm=alloc_custom(&mm_ops,sizeof(struct _MM),1,100);
 v2mm;
 CHK(mode=='c'||mode=='o');
 {unsigned long size=-1;
  if(mode=='c') size=Long_val(vsize)*sizeof(unsigned long);
  iniMM(mm,String_val(vfn),mode,size);
 }
 return vmm;
}
CAMLprim value mm_create(value vfn, value vsize) /* see iniMM 'c' */
{return mm_ini(vfn,'c',vsize);
}
CAMLprim value mm_open(value vfn) /* see iniMM 'o' */
{return mm_ini(vfn,'o',-1);
}
CAMLprim value mm_resize(value vmm, value vsize) /* see resizeMM */
{v2mm;
 resizeMM(mm,Val_long(vsize));
 return Val_unit;
}
CAMLprim value mm_sync(value vmm) /* see syncMM */
{v2mm;
 syncMM(mm);
 return Val_unit;
}
/******************************************************\
**                                                    **
** Here's where we want bigarray_ref_1 instead        **
**                                                    **
\******************************************************/
CAMLprim value mm_get_int(value vmm, value vind)
{long*ar=((MM)vmm)->data;return Val_int(ar[Int_val(vind)]);
}
/******************************************************\
**                                                    **
** Here's where we want bigarray_set_1 instead        **
**                                                    **
\******************************************************/
CAMLprim value mm_set_int(value vmm, value vind, value newval)
{long*ar=((MM)vmm)->data;ar[Int_val(vind)]=Int_val(newval);return Val_unit;
}

$ cat Makefile
all:
	ocamlc mm.c
	ocamlc -custom mm.o bigarray.cma mm.ml -o mm
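
The file-size computation in `mapMM` rounds the payload plus its header up to a whole number of pages. A standalone sketch of that arithmetic, assuming the page size is a power of two (the mask construction in `mm_init` relies on the same assumption); `round_up_to_page` is a hypothetical helper name, not part of the posted code.

```c
#include <stddef.h>

/* Round nbytes up to the next multiple of page_size.
   Assumes page_size is a power of two, so ~(page_size - 1) clears the
   low bits; for such sizes this equals the mask mm_init builds by
   counting trailing zero bits. */
static size_t round_up_to_page(size_t nbytes, size_t page_size) {
    size_t mask = ~(page_size - 1);
    return (nbytes + page_size - 1) & mask;
}
```

For instance, with 4096-byte pages a 4100-byte request (4096 bytes of data plus a 4-byte header, as on a 32-bit system) rounds up to 8192.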

* Re: [Caml-list] custom mmap modeled on bigarray
From: Jack Andrews @ 2004-08-10 5:06 UTC
To: caml-list

i know the argument for developing first, optimizing later. i also know
the arguments for not caring about fine-grain performance and to look
at the big picture. i've argued both and seen where these arguments
fail.

consider compressed data as a bit stream from disk.
say it has simple encoding:
  phrase :=
    byte:<number-of-bits>, byte:<number-of-values>, int[]:<bit-stream>
eg:
  0x03 0x0a 0b1110 0011 1001 0100 1110 0101 1101 1100
  |    |    +<bit-stream>
  |    +number-of-values
  +number-of-bits

represents the sequence of 10 3-bit numbers:

  111,000,111,001,010,011,100,101,110,111

now consider a sentence as
  sentence := <empty> | phrase, sentence

without an enhanced FFI, ocaml will be considerably slower than C for
uncompressing (and compressing).

in my previous post, i suggest that some language primitives similar to
%bigarray_ref_1 could be introduced to make ocaml comparable to C. i
have investigated this possibility, and my suggestion is that
%bigarray_ref is replaced by a primitive %ffi_ref and made public.
then bigarray can be built on the more general %ffi_ref and developers
have a fast means of accessing C arrays like mmap regions.

if i spend time implementing %ffi_ref/set, is there any chance of it
being incorporated into ocaml?

thanks,

jack
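
A sketch of decoding one phrase of the encoding described above, in C. It assumes the bit-stream is stored as bytes, most significant bit first, and that number-of-bits is at most 8; the message does not spell out either detail, and `decode_phrase` is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode one phrase: byte 0 = bits per value, byte 1 = number of values,
   then the packed values, MSB first (an assumption, see above).
   Requires out_cap >= number of values. Returns the count decoded, or
   -1 if the input is too short or the parameters are unsupported. */
static int decode_phrase(const uint8_t *in, size_t in_len,
                         uint8_t *out, size_t out_cap) {
    if (in_len < 2) return -1;
    unsigned nbits = in[0], count = in[1];
    if (nbits == 0 || nbits > 8 || count > out_cap) return -1;
    size_t total_bits = (size_t)nbits * count;
    if (in_len - 2 < (total_bits + 7) / 8) return -1;

    const uint8_t *stream = in + 2;
    size_t bitpos = 0;
    for (unsigned i = 0; i < count; ++i) {
        unsigned v = 0;
        for (unsigned b = 0; b < nbits; ++b, ++bitpos) {
            /* pick bit number bitpos, counting from the top of each byte */
            unsigned bit = (stream[bitpos / 8] >> (7 - bitpos % 8)) & 1u;
            v = (v << 1) | bit;
        }
        out[i] = (uint8_t)v;
    }
    return (int)count;
}
```

The inner loop does a shift and mask per bit on raw bytes. That per-element access on a mapped region is the kind of operation the message argues needs an inlined primitive rather than a function call.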

* Re: [Caml-list] custom mmap modeled on bigarray
From: Eric Stokes @ 2004-08-11 14:52 UTC
To: effbiae; +Cc: caml-list

Even if it isn't, it can be distributed as a GODI patch fairly easily.
I think a patch such as this, if architected well, would allow me to
improve performance of some of my C bindings even more, and that is a
good thing.

On Aug 9, 2004, at 10:06 PM, Jack Andrews wrote:

> i know the argument for developing first, optimizing later. i also
> know the arguments for not caring about fine-grain performance and to
> look at the big picture. i've argued both and seen where these
> arguments fail.
>
> consider compressed data as a bit stream from disk.
> say it has simple encoding:
>   phrase :=
>     byte:<number-of-bits>, byte:<number-of-values>, int[]:<bit-stream>
> eg:
>   0x03 0x0a 0b1110 0011 1001 0100 1110 0101 1101 1100
>   |    |    +<bit-stream>
>   |    +number-of-values
>   +number-of-bits
>
> represents the sequence of 10 3-bit numbers:
>
>   111,000,111,001,010,011,100,101,110,111
>
> now consider a sentence as
>   sentence := <empty> | phrase, sentence
>
> without an enhanced FFI, ocaml will be considerably slower than C for
> uncompressing (and compressing).
>
> in my previous post, i suggest that some language primitives similar to
> %bigarray_ref_1 could be introduced to make ocaml comparable to C. i
> have investigated this possibility, and my suggestion is that
> %bigarray_ref is replaced by a primitive %ffi_ref and made public.
> then bigarray can be built on the more general %ffi_ref and developers
> have a fast means of accessing C arrays like mmap regions.
>
> if i spend time implementing %ffi_ref/set, is there any chance of it
> being incorporated into ocaml?
>
> thanks,
>
> jack