Re: Unsigned integers?

Mailing list for all users of the OCaml language and system.

* Re: Unsigned integers?
@ 2000-03-23 19:42 Damien Doligez
  0 siblings, 0 replies; 14+ messages in thread
From: Damien Doligez @ 2000-03-23 19:42 UTC (permalink / raw)
  To: caml-list; +Cc: maxs

>From: Max Skaller <maxs@in.ot.com.au>

>I would be happy to replace, in this code,
>every use of 'lor', 'land', + - * < etc with
>'ulor' 'uland' 'uplus' 'uminus' 'uless' etc, if only
>I could define them. (I could do this in C .. but then,
>I could write the below routines in C too)

For ulor, uland, uplus, uminus, umult, as well as lsr and lsl, they
are identical to their signed counterparts, so you don't need to do
anything.
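
As a rough check of this claim, one can compare the native int operations with the same operations carried out in Int64 and truncated to the width of an OCaml int. The helper names (`mask`, `to_u`, `same_bits`) are invented for this illustration, and `Sys.int_size` assumes OCaml 4.03 or later:

```ocaml
(* Unsigned view of an OCaml int: keep the low Sys.int_size bits of its
   two's-complement representation, held in an Int64. *)
let mask : int64 = Int64.pred (Int64.shift_left 1L Sys.int_size)
let to_u (x : int) : int64 = Int64.logand (Int64.of_int x) mask

(* [same_bits op op64 x y] is true when the native int operation [op] and
   the Int64 operation [op64], truncated to int width, agree bit for bit. *)
let same_bits op op64 (x : int) (y : int) : bool =
  to_u (op x y) = Int64.logand (op64 (to_u x) (to_u y)) mask

let samples = [0; 1; -1; 42; max_int; min_int; max_int - 7; min_int + 12345]

let all_agree : bool =
  List.for_all (fun x ->
    List.for_all (fun y ->
      same_bits ( + ) Int64.add x y
      && same_bits ( - ) Int64.sub x y
      && same_bits ( * ) Int64.mul x y
      && same_bits ( land ) Int64.logand x y
      && same_bits ( lor ) Int64.logor x y)
      samples)
    samples
```

This only samples a handful of values, so it is a sanity check rather than a proof.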

For uless, since you are only ever comparing to a positive constant
less than max_int, I suggest replacing "if i < constant" with
"if 0 <= i && i < constant".

>Note these operations MUST be extremely fast,

If the above works, I doubt you can go any faster.  For more complex
code, you may have to use a full-blown unsigned comparison:
(not tested; could be wrong)

  let uless x y = if (x < 0) = (y < 0) then x < y else x > y;;
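
A small sketch that exercises the comparison above on a few boundary cases, and checks that the simpler range test agrees with it for a non-negative constant. The name `below` and the sample values are chosen for this illustration only:

```ocaml
(* Comparison from the message above: when the signs agree, signed order
   is unsigned order; when they differ, the negative operand is larger. *)
let uless (x : int) (y : int) : bool =
  if (x < 0) = (y < 0) then x < y else x > y

(* Range test against a non-negative constant. *)
let below (i : int) (c : int) : bool = 0 <= i && i < c

let checks : bool =
  uless 1 2
  && not (uless 2 1)
  && not (uless 5 5)
  && uless max_int min_int      (* min_int has the top bit set *)
  && uless (-2) (-1)
  && not (uless (-1) 0)
  (* For c >= 0 the two tests agree, including on negative i. *)
  && List.for_all (fun i -> below i 0x80 = uless i 0x80)
       [0; 0x7F; 0x80; -1; min_int; max_int]
```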

The only difficulty would be with division and modulo, as noted by
Xavier, but I gather you don't need them for this application.
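
For completeness, unsigned division and remainder can be built from signed division plus the unsigned comparison. The sketch below halves the dividend first so that the signed division never sees a negative operand, then corrects the quotient by at most one. The names `udiv` and `urem` are assumptions made for this example and are not part of the standard library for `int`:

```ocaml
let uless (x : int) (y : int) : bool =
  if (x < 0) = (y < 0) then x < y else x > y

(* Unsigned quotient of two OCaml ints viewed as unsigned values.
   Raises Division_by_zero when d = 0. *)
let udiv (n : int) (d : int) : int =
  if d < 0 then
    (* d has its top bit set, so the quotient can only be 0 or 1. *)
    if uless n d then 0 else 1
  else
    (* (n lsr 1) is non-negative; q is either the quotient or one less. *)
    let q = ((n lsr 1) / d) lsl 1 in
    let r = n - q * d in
    if uless r d then q else q + 1

(* Unsigned remainder, derived from the quotient. *)
let urem (n : int) (d : int) : int = n - udiv n d * d
```

For example, `udiv (-1) 2` yields `max_int`, because `-1` viewed as unsigned is the largest representable value.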

-- Damien


* Re: Unsigned integers?
  2000-03-24  2:50         ` Jacques Garrigue
  2000-03-24 15:59           ` Xavier Leroy
@ 2000-03-25  4:03           ` John Max Skaller
  1 sibling, 0 replies; 14+ messages in thread
From: John Max Skaller @ 2000-03-25  4:03 UTC (permalink / raw)
  To: Jacques Garrigue; +Cc: maxs, caml-list

Jacques Garrigue wrote:
 
> If compact storage is the problem, ocaml 3.00 also provides bigarrays,
> which allow you to store int32 values in flat arrays (even
> multidimensional).

BTW: all these new 'specialisations' for generic constructions
just show that C++ isn't so bad after all. :-)

-- 
John (Max) Skaller, mailto:skaller@maxtal.com.au
10/1 Toxteth Rd Glebe NSW 2037 Australia voice: 61-2-9660-0850
checkout Vyper http://Vyper.sourceforge.net
download Interscript http://Interscript.sourceforge.net




* Re: Unsigned integers?
  2000-03-24  2:50         ` Jacques Garrigue
@ 2000-03-24 15:59           ` Xavier Leroy
  2000-03-25  4:03           ` John Max Skaller
  1 sibling, 0 replies; 14+ messages in thread
From: Xavier Leroy @ 2000-03-24 15:59 UTC (permalink / raw)
  To: Jacques Garrigue, maxs; +Cc: caml-list

> By the way, is there any plan to do for int32 the same kind of
> optimizations as are done for floats (no boxing/unboxing in the middle
> of a computation)? Already done?

Already done!  For int32, nativeint, and even for int64 on 64-bit
processors.

The only difference between boxed integers and floats, as far as
boxing elimination in ocamlopt goes, is that there is a hack to unbox
floats in arrays, but no corresponding hack for arrays of boxed
integers.  As Jacques said, the new Bigarray module does provide
arrays of unboxed int32 / nativeint / int64, although of a different
type than the standard Caml arrays.
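
A minimal sketch of such a flat int32 array, assuming a recent OCaml where Bigarray is part of the standard library (4.07 or later). The helper names `make_store`, `put` and `get` are invented for this example:

```ocaml
(* A flat array of unboxed 32-bit integers, zero-initialised. *)
let make_store (n : int) =
  let a = Bigarray.Array1.create Bigarray.int32 Bigarray.c_layout n in
  Bigarray.Array1.fill a 0l;
  a

(* Store a code point held in an ordinary int. *)
let put a (k : int) (cp : int) : unit =
  Bigarray.Array1.set a k (Int32.of_int cp)

(* Read a code point back as an ordinary int. *)
let get a (k : int) : int =
  Int32.to_int (Bigarray.Array1.get a k)

let cps = make_store 3

let () =
  put cps 0 0x41;        (* 'A' *)
  put cps 1 0x20AC;      (* euro sign *)
  put cps 2 0x10FFFF     (* highest Unicode scalar value *)
```

Whether the int32 values stay unboxed between `get` and `put` depends on the compiler and on inlining, so it is worth measuring.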

- Xavier Leroy


* Re: Unsigned integers?
  2000-03-23  2:08       ` Max Skaller
  2000-03-23  7:50         ` Sven LUTHER
  2000-03-24  2:50         ` Jacques Garrigue
@ 2000-03-24 14:50         ` Xavier Leroy
  2 siblings, 0 replies; 14+ messages in thread
From: Xavier Leroy @ 2000-03-24 14:50 UTC (permalink / raw)
  To: Max Skaller, caml-list

> The code is below. The code works for values <2^30,
> but fails when an int goes negative.

It is easy to fix this.  The "parse_utf8" function need not be
modified.  For "utf8_of_int", just replace all tests i < CST by
i >= 0 && i < CST, e.g.

> let utf8_of_int i =
>   let chr x = String.make 1 (Char.chr x) in
>   if i >= 0 && i < 0x80 then 
>      chr(i)
>   else if i >= 0 && i < 0x800 then 
>      chr(0xC0 lor ((i lsr 6) land 0x1F))  ^
>       chr(0x80 lor (i land 0x3F))
>   else if i >= 0 && i < 0x10000 then 
>      chr(0xE0 lor ((i lsr 12) land 0xF)) ^
>       chr(0x80 lor ((i lsr 6) land 0x3F)) ^
>       chr(0x80 lor (i land 0x3F))
>   else if i >= 0 && i < 0x200000 then 
>      chr(0xF0 lor ((i lsr 18) land 0x7)) ^
>       chr(0x80 lor ((i lsr 12) land 0x3F)) ^
>       chr(0x80 lor ((i lsr 6) land 0x3F)) ^
>       chr(0x80 lor (i land 0x3F))
>   else if i >= 0 && i < 0x4000000 then 
>      chr(0xF8 lor ((i lsr 24) land 0x3)) ^
>       chr(0x80 lor ((i lsr 18) land 0x3F)) ^
>       chr(0x80 lor ((i lsr 12) land 0x3F)) ^
>       chr(0x80 lor ((i lsr 6) land 0x3F)) ^
>       chr(0x80 lor (i land 0x3F))
>   else chr(0xFC lor ((i lsr 30) land 0x1)) ^
>     chr(0x80 lor ((i lsr 24) land 0x3F)) ^
>     chr(0x80 lor ((i lsr 18) land 0x3F)) ^
>     chr(0x80 lor ((i lsr 12) land 0x3F)) ^
>     chr(0x80 lor ((i lsr 6) land 0x3F)) ^
>     chr(0x80 lor (i land 0x3F))

or special-case i < 0 immediately and treat it as in the last "else"
clause.
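
The second variant, with negative values handled first, can be sketched on a smaller function that only computes the length of the encoding. The name `utf8_length` is hypothetical and does not appear in the original code:

```ocaml
(* Number of bytes in the (pre-2003, up to 6-byte) UTF-8 encoding of i.
   A negative i is a value with the top bit set, so it takes the longest
   form, exactly like the final "else" clause. *)
let utf8_length (i : int) : int =
  if i < 0 then 6
  else if i < 0x80 then 1
  else if i < 0x800 then 2
  else if i < 0x10000 then 3
  else if i < 0x200000 then 4
  else if i < 0x4000000 then 5
  else 6
```

Once the negative case is out of the way, the remaining tests are plain signed comparisons.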

> Note these operations MUST be extremely fast,
> and in particular, compact storage of ISO-10646
> code points in arrays of integers is OK,
> while arrays of boxed values is out of the question.
> (So I can't use int32).

If they MUST be extremely fast, you'd rather avoid the repeated "^"
operations and allocate and fill the resulting string directly, e.g.

    else if i >= 0 && i < 0x800 then begin
        let res = String.create 2 in
        (* Char.chr here, not the string-returning local "chr" *)
        res.[0] <- Char.chr (0xC0 lor ((i lsr 6) land 0x1F));
        res.[1] <- Char.chr (0x80 lor (i land 0x3F));
        res
    end else ...
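
In current OCaml, strings are immutable and `String.create` is gone, so the same idea would be written with `Bytes`. The following is one possible sketch, not the original author's code: it picks the length, lead byte and lead mask from the range tests, then fills the buffer in a loop instead of spelling out every branch.

```ocaml
(* Convert an integer code point into a UTF-8 encoded string, allocating
   the result once and filling it in place. *)
let utf8_of_int (i : int) : string =
  if i >= 0 && i < 0x80 then String.make 1 (Char.chr i)
  else begin
    let len, lead, mask =
      if i >= 0 && i < 0x800 then 2, 0xC0, 0x1F
      else if i >= 0 && i < 0x10000 then 3, 0xE0, 0x0F
      else if i >= 0 && i < 0x200000 then 4, 0xF0, 0x07
      else if i >= 0 && i < 0x4000000 then 5, 0xF8, 0x03
      else 6, 0xFC, 0x01 in
    let res = Bytes.create len in
    (* Lead byte: length marker plus the topmost payload bits. *)
    Bytes.set res 0
      (Char.chr (lead lor ((i lsr (6 * (len - 1))) land mask)));
    (* Continuation bytes: 6 payload bits each, most significant first. *)
    for k = 1 to len - 1 do
      let shift = 6 * (len - 1 - k) in
      Bytes.set res k (Char.chr (0x80 lor ((i lsr shift) land 0x3F)))
    done;
    Bytes.to_string res
  end
```

`Bytes.to_string` copies the buffer; if that copy matters, `Bytes.unsafe_to_string` avoids it, provided the buffer is never touched afterwards.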

Hope this helps,

- Xavier Leroy




* Re: Unsigned integers?
  2000-03-23  2:08       ` Max Skaller
  2000-03-23  7:50         ` Sven LUTHER
@ 2000-03-24  2:50         ` Jacques Garrigue
  2000-03-24 15:59           ` Xavier Leroy
  2000-03-25  4:03           ` John Max Skaller
  2000-03-24 14:50         ` Xavier Leroy
  2 siblings, 2 replies; 14+ messages in thread
From: Jacques Garrigue @ 2000-03-24  2:50 UTC (permalink / raw)
  To: maxs; +Cc: caml-list

From: Max Skaller <maxs@in.ot.com.au>

> Note these operations MUST be extremely fast,
> and in particular, compact storage of ISO-10646
> code points in arrays of integers is OK,
> while arrays of boxed values is out of the question.
> (So I can't use int32).

If compact storage is the problem, ocaml 3.00 also provides bigarrays,
which allow you to store int32 values in flat arrays (even
multidimensional).
For the cost of boxing/unboxing in int32 computations, you will
probably have to test whether it meets your needs or not.

By the way, is there any plan to do for int32 the same kind of
optimizations as are done for floats (no boxing/unboxing in the middle
of a computation)? Already done?

---------------------------------------------------------------------------
Jacques Garrigue      Kyoto University     garrigue at kurims.kyoto-u.ac.jp
		http://wwwfun.kurims.kyoto-u.ac.jp/~garrigue/


* Re: Unsigned integers?
  2000-03-22 19:47     ` Xavier Leroy
@ 2000-03-23 12:55       ` John Max Skaller
  0 siblings, 0 replies; 14+ messages in thread
From: John Max Skaller @ 2000-03-23 12:55 UTC (permalink / raw)
  To: Xavier Leroy; +Cc: caml-list

Xavier Leroy wrote:
> 
> > I have some code for processing ISO-10646 characters and UTF-8,
> > which uses caml integers. ISO-10646 has 2^31 code points, which
> > can be covered by caml integers on a 32bit machine. Using an
> > unboxed type is mandatory for performance.
> 
> OCaml 3.00 includes three new library modules, Int32, Int64 and
> Nativeint, implementing (boxed) 32-bit, 64-bit and platform-native
> integers, respectively.  (Platform-native integers are 32 bits on 32
> bit processors and 64 bits on 64 bit processors).  The native-code
> compiler was modified to inline the operations on those types,
> including elimination of unnecessary boxing/unboxing, like for floats.
> That may or may not be efficient enough for your application.

	This is probably enough, provided I can write
conversions to/from ints.
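
The Int32 module does provide such conversions, `Int32.of_int` and `Int32.to_int`. A small sketch follows; the wrapper names `int32_of_cp` and `cp_of_int32` are made up for this example. On a 32-bit platform `Int32.to_int` drops the top bit, since the native int has only 31 bits there.

```ocaml
(* Conversions between native ints and boxed 32-bit integers. *)
let int32_of_cp (cp : int) : int32 = Int32.of_int cp
let cp_of_int32 (x : int32) : int = Int32.to_int x

(* Values below 2^30 survive the round trip on every platform. *)
let round_trip_ok : bool =
  List.for_all (fun cp -> cp_of_int32 (int32_of_cp cp) = cp)
    [0; 0x41; 0x20AC; 0x10FFFF; 0x3FFFFFFF]

(* Arithmetic happens on the int32 side, then converts back. *)
let succ_cp : int = cp_of_int32 (Int32.add (int32_of_cp 0x10FFFF) 1l)
```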

-- 
John (Max) Skaller, mailto:skaller@maxtal.com.au
10/1 Toxteth Rd Glebe NSW 2037 Australia voice: 61-2-9660-0850
checkout Vyper http://Vyper.sourceforge.net
download Interscript http://Interscript.sourceforge.net




* Re: Unsigned integers?
  2000-03-23  2:08       ` Max Skaller
@ 2000-03-23  7:50         ` Sven LUTHER
  2000-03-24  2:50         ` Jacques Garrigue
  2000-03-24 14:50         ` Xavier Leroy
  2 siblings, 0 replies; 14+ messages in thread
From: Sven LUTHER @ 2000-03-23  7:50 UTC (permalink / raw)
  To: Max Skaller; +Cc: John Max Skaller, caml-list

On Thu, Mar 23, 2000 at 01:08:54PM +1100, Max Skaller wrote:
> Sven LUTHER wrote:
> > 
> > On Wed, Mar 22, 2000 at 09:22:15AM +1100, John Max Skaller wrote:
> > > I have some code for processing ISO-10646 characters and UTF-8,
> > > which uses caml integers. ISO-10646 has 2^31 code points, which
> > > can be covered by caml integers on a 32bit machine. Using an
> > > unboxed type is mandatory for performance.
> > >
> > > Unfortunately, caml integers are signed, which makes most of the
> > > code I have written wrong (I haven't taken the care to handle
> > > integers over 2^30 correctly).
> > >
> > > What is the best way to handle this problem?
> > > Would a (standard?) library module (written in C), that treats
> > > integers as unsigned be a reasonable solution?
> > >
> > > [This may require writing 'uint_add x y' instead of 'x+y',
> > > but that doesn't matter in the above mentioned application,
> > > since the integers are being used to represent characters]
> > 
> > Just use the caml integer and ignore the fact that they are signed ?
> > 
> > after the motto :  that doesn't matter in the above mentioned application,
> 
> Perhaps my explanation was unclear. In my code, I must 
> calculate a UTF-8 encoding from a ISO-10646 code point,
> and calculate an ISO-10646 code point from a UTF-8 encoding.
> 
> The code is below. The code works for values <2^30,
> but fails when an int goes negative.
> 
> I would be happy to replace, in this code,
> every use of 'lor', 'land', + - * < etc with
> 'ulor' 'uland' 'uplus' 'uminus' 'uless' etc, if only
> I could define them. (I could do this in C .. but then,
> I could write the below routines in C too)
> 

Just redefine the above-mentioned operations in Caml, taking overflow into
account. It should not be too difficult, although it may be a bit less
efficient than the normal +, -, ... (although I am not sure about that; maybe
you could just ignore it and use the normal functions). At least for + and -
it should work without problems.

To test them, write a function that prints an int as an unsigned value, and
use #install_printer to make it the default printer for ints.
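
One possible shape for such a printer, as a sketch: it splits off the last decimal digit by hand so that it only relies on signed division. The names `string_of_uint` and `pp_uint` are invented here; `#install_printer` itself is a toplevel directive and is therefore shown only in a comment.

```ocaml
(* Decimal representation of an int viewed as an unsigned value. *)
let string_of_uint (i : int) : string =
  if i >= 0 then string_of_int i
  else
    (* i lsr 1 is non-negative, and (i lsr 1) / 5 equals unsigned i / 10. *)
    let q = (i lsr 1) / 5 in
    let r = i - q * 10 in     (* last decimal digit, between 0 and 9 *)
    string_of_int q ^ string_of_int r

(* Printer with the signature expected by the toplevel. *)
let pp_uint (fmt : Format.formatter) (i : int) : unit =
  Format.pp_print_string fmt (string_of_uint i)

(* In the toplevel:  #install_printer pp_uint;;  *)
```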

Friendly

Sven LUTHER




* Re: Unsigned integers?
  2000-03-22 17:05     ` Jean-Christophe Filliatre
  2000-03-22 19:10       ` Markus Mottl
@ 2000-03-23  2:41       ` Max Skaller
  1 sibling, 0 replies; 14+ messages in thread
From: Max Skaller @ 2000-03-23  2:41 UTC (permalink / raw)
  To: Jean-Christophe Filliatre; +Cc: John Max Skaller, caml-list

Jean-Christophe Filliatre wrote:
 
> I wrote such a C library to handle (boxed) 32- or 64-bit integers (you
> can find it on my web page). But it turned out not to be very
> efficient, and when I rewrote my program using an encoding with two
> Caml integers, it was much faster.
> 
> So I would suggest that you write such a library in Caml. For a good
> starting point, you may have a look at the module Nativeint in the ocaml
> sources (in utils/nativeint.ml).

OK. This is probably the way to do it.

-- 
John (Max) Skaller at OTT [Open Telecommications Ltd]
mailto:maxs@in.ot.com.au      -- at work
mailto:skaller@maxtal.com.au  -- at home




* Re: Unsigned integers?
  2000-03-22 16:22     ` Sven LUTHER
@ 2000-03-23  2:08       ` Max Skaller
  2000-03-23  7:50         ` Sven LUTHER
                           ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Max Skaller @ 2000-03-23  2:08 UTC (permalink / raw)
  To: luther; +Cc: John Max Skaller, caml-list

Sven LUTHER wrote:
> 
> On Wed, Mar 22, 2000 at 09:22:15AM +1100, John Max Skaller wrote:
> > I have some code for processing ISO-10646 characters and UTF-8,
> > which uses caml integers. ISO-10646 has 2^31 code points, which
> > can be covered by caml integers on a 32bit machine. Using an
> > unboxed type is mandatory for performance.
> >
> > Unfortunately, caml integers are signed, which makes most of the
> > code I have written wrong (I haven't taken the care to handle
> > integers over 2^30 correctly).
> >
> > What is the best way to handle this problem?
> > Would a (standard?) library module (written in C), that treats
> > integers as unsigned be a reasonable solution?
> >
> > [This may require writing 'uint_add x y' instead of 'x+y',
> > but that doesn't matter in the above mentioned application,
> > since the integers are being used to represent characters]
> 
> Just use the caml integer and ignore the fact that they are signed ?
> 
> after the motto :  that doesn't matter in the above mentioned application,

Perhaps my explanation was unclear. In my code, I must 
calculate a UTF-8 encoding from a ISO-10646 code point,
and calculate an ISO-10646 code point from a UTF-8 encoding.

The code is below. The code works for values <2^30,
but fails when an int goes negative.

I would be happy to replace, in this code,
every use of 'lor', 'land', + - * < etc with
'ulor' 'uland' 'uplus' 'uminus' 'uless' etc, if only
I could define them. (I could do this in C .. but then,
I could write the below routines in C too)

Note these operations MUST be extremely fast,
and in particular, compact storage of ISO-10646
code points in arrays of integers is OK,
while arrays of boxed values is out of the question.
(So I can't use int32).

-------------------------------------------------------


let parse_utf8 (s : string)  (i : int) : int * int =
  let ord = int_of_char 
  and n = (String.length s)  - i
  in if n <= 0 then begin print_endline "FAILURE"; (-1),i end
  else let lead = ord (s.[i]) in
    if (lead land 0x80) = 0 then 
      lead land 0x7F,i+1 (* ASCII *)
    else if lead land 0xE0 = 0xC0 && n > 1 then
      ((lead land 0x1F)  lsl  6) lor
        (ord(s.[i+1]) land 0x3F),i+2
    else if lead land 0xF0 = 0xE0 && n > 2 then
      ((lead land 0x1F) lsl 12) lor
        ((ord(s.[i+1]) land 0x3F)  lsl 6) lor
        (ord(s.[i+2]) land 0x3F),i+3
    else if lead land 0xF8 = 0xF0 && n > 3 then
      ((lead land 0x07) lsl 18) lor
        ((ord(s.[i+1]) land 0x3F)  lsl 12) lor
        ((ord(s.[i+2]) land 0x3F)  lsl 6) lor
        (ord(s.[i+3]) land 0x3F),i+4
    else if lead land 0xFC = 0xF8 && n > 4 then
      ((lead land 0x03) lsl 24) lor
        ((ord(s.[i+1]) land 0x3F)  lsl 18) lor
        ((ord(s.[i+2]) land 0x3F)  lsl 12) lor
        ((ord(s.[i+3]) land 0x3F)  lsl 6) lor
        (ord(s.[i+4]) land 0x3F),i+5
    else if lead land 0xFE = 0xFC && n > 5 then
      ((lead land 0x01) lsl 30) lor
        ((ord(s.[i+1]) land 0x3F)  lsl 24) lor
        ((ord(s.[i+2]) land 0x3F)  lsl 18) lor
        ((ord(s.[i+3]) land 0x3F)  lsl 12) lor
        ((ord(s.[i+4]) land 0x3F)  lsl 6) lor
        (ord(s.[i+5]) land 0x3F),i+6
    else lead, i+1  (* error, just use bad character *)

(* convert an integer into a utf-8 encoded string of bytes *)
let utf8_of_int i =
  let chr x = String.make 1 (Char.chr x) in
  if i < 0x80 then 
     chr(i)
  else if i < 0x800 then 
     chr(0xC0 lor ((i lsr 6) land 0x1F))  ^
      chr(0x80 lor (i land 0x3F))
  else if i < 0x10000 then 
     chr(0xE0 lor ((i lsr 12) land 0xF)) ^
      chr(0x80 lor ((i lsr 6) land 0x3F)) ^
      chr(0x80 lor (i land 0x3F))
  else if i < 0x200000 then 
     chr(0xF0 lor ((i lsr 18) land 0x7)) ^
      chr(0x80 lor ((i lsr 12) land 0x3F)) ^
      chr(0x80 lor ((i lsr 6) land 0x3F)) ^
      chr(0x80 lor (i land 0x3F))
  else if i < 0x4000000 then 
     chr(0xF8 lor ((i lsr 24) land 0x3)) ^
      chr(0x80 lor ((i lsr 18) land 0x3F)) ^
      chr(0x80 lor ((i lsr 12) land 0x3F)) ^
      chr(0x80 lor ((i lsr 6) land 0x3F)) ^
      chr(0x80 lor (i land 0x3F))
  else chr(0xFC lor ((i lsr 30) land 0x1)) ^
    chr(0x80 lor ((i lsr 24) land 0x3F)) ^
    chr(0x80 lor ((i lsr 18) land 0x3F)) ^
    chr(0x80 lor ((i lsr 12) land 0x3F)) ^
    chr(0x80 lor ((i lsr 6) land 0x3F)) ^
    chr(0x80 lor (i land 0x3F))


	

-- 
John (Max) Skaller at OTT [Open Telecommications Ltd]
mailto:maxs@in.ot.com.au      -- at work
mailto:skaller@maxtal.com.au  -- at home




* Re: Unsigned integers?
  2000-03-21 22:22   ` Unsigned integers? John Max Skaller
  2000-03-22 16:22     ` Sven LUTHER
  2000-03-22 17:05     ` Jean-Christophe Filliatre
@ 2000-03-22 19:47     ` Xavier Leroy
  2000-03-23 12:55       ` John Max Skaller
  2 siblings, 1 reply; 14+ messages in thread
From: Xavier Leroy @ 2000-03-22 19:47 UTC (permalink / raw)
  To: John Max Skaller; +Cc: caml-list

> I have some code for processing ISO-10646 characters and UTF-8,
> which uses caml integers. ISO-10646 has 2^31 code points, which
> can be covered by caml integers on a 32bit machine. Using an
> unboxed type is mandatory for performance.

OCaml 3.00 includes three new library modules, Int32, Int64 and
Nativeint, implementing (boxed) 32-bit, 64-bit and platform-native
integers, respectively.  (Platform-native integers are 32 bits on 32
bit processors and 64 bits on 64 bit processors).  The native-code
compiler was modified to inline the operations on those types,
including elimination of unnecessary boxing/unboxing, like for floats.
That may or may not be efficient enough for your application.  

> Unfortunately, caml integers are signed, which makes most of the
> code I have written wrong (I haven't taken the care to handle
> integers over 2^30 correctly).

Actually, on 2's-complement machines at least, arithmetic operations
over unsigned integers are exactly identical to those over signed
integers of the same size, except division, modulus, and
comparisons <, >, <=, >=.   So, for your application, Caml's "int"
type could be good enough, although you may need special comparison
functions (which you can write in C, using casts to unsigned long int,
or in Caml, by treating the sign bit specially).
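
One way to treat the sign bit specially in Caml, given as a sketch rather than the only option: flipping the top bit of both operands maps unsigned order onto signed order, so the ordinary comparison can be reused. The names `ucompare`, `uless` and `uleq` are chosen for this example.

```ocaml
(* min_int has only the top bit set, so lxor with it flips the sign bit. *)
let ucompare (x : int) (y : int) : int =
  compare (x lxor min_int) (y lxor min_int)

(* Unsigned versions of < and <= built on top of ucompare. *)
let uless (x : int) (y : int) : bool = ucompare x y < 0
let uleq (x : int) (y : int) : bool = ucompare x y <= 0
```

With these definitions `uless 1 (-1)` holds, because `-1` is the largest value when read as unsigned.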

- Xavier Leroy


* Re: Unsigned integers?
  2000-03-22 17:05     ` Jean-Christophe Filliatre
@ 2000-03-22 19:10       ` Markus Mottl
  2000-03-23  2:41       ` Max Skaller
  1 sibling, 0 replies; 14+ messages in thread
From: Markus Mottl @ 2000-03-22 19:10 UTC (permalink / raw)
  To: filliatr; +Cc: OCAML

> So I would suggest that you write such a library in Caml. For a good
> starting point, you may have a look at the module Nativeint in the ocaml
> sources (in utils/nativeint.ml).

Or even more conveniently: check out the current CVS-repository at INRIA!
Seems that the problems with integers are soon going to be history...

  -> ocaml/stdlib/int32.mli
     ocaml/stdlib/int64.mli

Though, I fear that unboxed, complete native integers will never be
supported. Anyway, if you only need complete 32-bit-ints, you may as well
purchase a real processor (Alpha)... ;-)

Best regards,
Markus Mottl

-- 
Markus Mottl, mottl@miss.wu-wien.ac.at, http://miss.wu-wien.ac.at/~mottl


* Re: Unsigned integers?
  2000-03-21 22:22   ` Unsigned integers? John Max Skaller
  2000-03-22 16:22     ` Sven LUTHER
@ 2000-03-22 17:05     ` Jean-Christophe Filliatre
  2000-03-22 19:10       ` Markus Mottl
  2000-03-23  2:41       ` Max Skaller
  2000-03-22 19:47     ` Xavier Leroy
  2 siblings, 2 replies; 14+ messages in thread
From: Jean-Christophe Filliatre @ 2000-03-22 17:05 UTC (permalink / raw)
  To: John Max Skaller; +Cc: caml-list


In his message of Wed March 22, 2000, John Max Skaller writes: 
> 
> Unfortunately, caml integers are signed, which makes most of the
> code I have written wrong (I haven't taken the care to handle
> integers over 2^30 correctly).
> 
> What is the best way to handle this problem?
> Would a (standard?) library module (written in C), that treats
> integers as unsigned be a reasonable solution?

I wrote such a C library to handle (boxed) 32- or 64-bit integers (you
can find it on my web page). But it turned out not to be very
efficient, and when I rewrote my program using an encoding with two
Caml integers, it was much faster.

So I would suggest that you write such a library in Caml. For a good
starting point, you may have a look at the module Nativeint in the ocaml
sources (in utils/nativeint.ml).
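
The message does not say how the two-integer encoding was laid out. As a purely hypothetical illustration, a 32-bit unsigned value could be split into two 16-bit halves, each held in an ordinary int, with the carry propagated by hand. The type `u32` and the functions below are assumptions made for this sketch:

```ocaml
(* Hypothetical layout: a 32-bit unsigned value split into two 16-bit
   halves, each stored in an ordinary OCaml int. *)
type u32 = { hi : int; lo : int }

let of_int (i : int) : u32 =
  { hi = (i lsr 16) land 0xFFFF; lo = i land 0xFFFF }

(* Addition modulo 2^32, with the carry propagated by hand. *)
let add (a : u32) (b : u32) : u32 =
  let lo = a.lo + b.lo in
  let hi = a.hi + b.hi + (lo lsr 16) in
  { hi = hi land 0xFFFF; lo = lo land 0xFFFF }

(* Unsigned comparison: high halves first, then low halves. *)
let uless (a : u32) (b : u32) : bool =
  a.hi < b.hi || (a.hi = b.hi && a.lo < b.lo)
```

A record is itself a heap block, so code that cares about allocation would more likely pass the two halves as separate arguments; the record is used here only to keep the sketch readable.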

Best regards,
-- 
Jean-Christophe Filliatre    
  Computer Science Laboratory   Phone (650) 859-5173
  SRI International             FAX   (650) 859-2844
  333 Ravenswood Ave.           email  filliatr@csl.sri.com
  Menlo Park, CA 94025, USA     web    http://www.csl.sri.com/~filliatr

  




* Re: Unsigned integers?
  2000-03-21 22:22   ` Unsigned integers? John Max Skaller
@ 2000-03-22 16:22     ` Sven LUTHER
  2000-03-23  2:08       ` Max Skaller
  2000-03-22 17:05     ` Jean-Christophe Filliatre
  2000-03-22 19:47     ` Xavier Leroy
  2 siblings, 1 reply; 14+ messages in thread
From: Sven LUTHER @ 2000-03-22 16:22 UTC (permalink / raw)
  To: John Max Skaller; +Cc: caml-list

On Wed, Mar 22, 2000 at 09:22:15AM +1100, John Max Skaller wrote:
> I have some code for processing ISO-10646 characters and UTF-8,
> which uses caml integers. ISO-10646 has 2^31 code points, which
> can be covered by caml integers on a 32bit machine. Using an
> unboxed type is mandatory for performance.
> 
> Unfortunately, caml integers are signed, which makes most of the
> code I have written wrong (I haven't taken the care to handle
> integers over 2^30 correctly).
> 
> What is the best way to handle this problem?
> Would a (standard?) library module (written in C), that treats
> integers as unsigned be a reasonable solution?
> 
> [This may require writing 'uint_add x y' instead of 'x+y',
> but that doesn't matter in the above mentioned application,
> since the integers are being used to represent characters]

Just use the caml integer and ignore the fact that they are signed ?

after the motto :  that doesn't matter in the above mentioned application,
since the integers are being used to represent characters] 

But then i don't know what you use it for ...

And also, you would have to check exactly how integer overflow works, but in my
experience max_int+1 = min_int.

Friendly,

Sven LUTHER




* Unsigned integers?
  2000-03-16  2:55 ` Jacques Garrigue
@ 2000-03-21 22:22   ` John Max Skaller
  2000-03-22 16:22     ` Sven LUTHER
                       ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: John Max Skaller @ 2000-03-21 22:22 UTC (permalink / raw)
  Cc: caml-list

I have some code for processing ISO-10646 characters and UTF-8,
which uses caml integers. ISO-10646 has 2^31 code points, which
can be covered by caml integers on a 32bit machine. Using an
unboxed type is mandatory for performance.

Unfortunately, caml integers are signed, which makes most of the
code I have written wrong (I haven't taken the care to handle
integers over 2^30 correctly).

What is the best way to handle this problem?
Would a (standard?) library module (written in C), that treats
integers as unsigned be a reasonable solution?

[This may require writing 'uint_add x y' instead of 'x+y',
but that doesn't matter in the above mentioned application,
since the integers are being used to represent characters]

-- 
John (Max) Skaller, mailto:skaller@maxtal.com.au
10/1 Toxteth Rd Glebe NSW 2037 Australia voice: 61-2-9660-0850
checkout Vyper http://Vyper.sourceforge.net
download Interscript http://Interscript.sourceforge.net


end of thread, other threads:[~2000-03-27 17:17 UTC | newest]

Thread overview: 14+ messages
2000-03-23 19:42 Unsigned integers? Damien Doligez
  -- strict thread matches above, loose matches on Subject: below --
2000-03-15 13:58 Syntax for label, NEW PROPOSAL Pierre Weis
2000-03-16  2:55 ` Jacques Garrigue
2000-03-21 22:22   ` Unsigned integers? John Max Skaller
2000-03-22 16:22     ` Sven LUTHER
2000-03-23  2:08       ` Max Skaller
2000-03-23  7:50         ` Sven LUTHER
2000-03-24  2:50         ` Jacques Garrigue
2000-03-24 15:59           ` Xavier Leroy
2000-03-25  4:03           ` John Max Skaller
2000-03-24 14:50         ` Xavier Leroy
2000-03-22 17:05     ` Jean-Christophe Filliatre
2000-03-22 19:10       ` Markus Mottl
2000-03-23  2:41       ` Max Skaller
2000-03-22 19:47     ` Xavier Leroy
2000-03-23 12:55       ` John Max Skaller
