* Strings @ 2009-04-03 11:56 Jon Harrop 2009-04-03 12:25 ` [Caml-list] Strings Paolo Donadeo ` (2 more replies) 0 siblings, 3 replies; 47+ messages in thread From: Jon Harrop @ 2009-04-03 11:56 UTC (permalink / raw) To: caml-list I read that batteries included provides first-class rope-based strings and I was just reading up on some horror stories about immutable strings on StackOverflow. This made me wonder what people's thoughts are about mutable vs immutable strings? I had never thought about it until I started porting my OCaml code to F# where strings are immutable and I found immutable strings to be a PITA. Immutable array-based strings seem pointless to me but I'd still appreciate an immutable rope-based string... -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Caml-list] Strings 2009-04-03 11:56 Strings Jon Harrop @ 2009-04-03 12:25 ` Paolo Donadeo 2009-04-03 14:18 ` Ashish Agarwal 2009-04-04 9:14 ` David Rajchenbach-Teller 2 siblings, 0 replies; 47+ messages in thread From: Paolo Donadeo @ 2009-04-03 12:25 UTC (permalink / raw) To: caml-list On Fri, Apr 3, 2009 at 13:56, Jon Harrop wrote: > was just reading up on some horror stories about immutable strings on > StackOverflow. Can you post any link? -- Paolo ~ ~ :wq ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Caml-list] Strings 2009-04-03 11:56 Strings Jon Harrop 2009-04-03 12:25 ` [Caml-list] Strings Paolo Donadeo @ 2009-04-03 14:18 ` Ashish Agarwal 2009-04-03 14:46 ` Jon Harrop 2009-04-04 9:14 ` David Rajchenbach-Teller 2 siblings, 1 reply; 47+ messages in thread From: Ashish Agarwal @ 2009-04-03 14:18 UTC (permalink / raw) To: Jon Harrop; +Cc: caml-list [-- Attachment #1: Type: text/plain, Size: 1053 bytes --] > I found immutable strings to be a PITAin what way? On Fri, Apr 3, 2009 at 7:56 AM, Jon Harrop <jon@ffconsultancy.com> wrote: > > I read that batteries included provides first-class rope-based strings and > I > was just reading up on some horror stories about immutable strings on > StackOverflow. This made me wonder what people's thoughts are about mutable > vs immutable strings? > > I had never thought about it until I started porting my OCaml code to F# > where > strings are immutable and I found immutable strings to be a PITA. Immutable > array-based strings seem pointless to me but I'd still appreciate an > immutable rope-based string... > > -- > Dr Jon Harrop, Flying Frog Consultancy Ltd. > http://www.ffconsultancy.com/?e > > _______________________________________________ > Caml-list mailing list. Subscription management: > http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list > Archives: http://caml.inria.fr > Beginner's list: http://groups.yahoo.com/group/ocaml_beginners > Bug reports: http://caml.inria.fr/bin/caml-bugs > [-- Attachment #2: Type: text/html, Size: 1730 bytes --] ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Caml-list] Strings 2009-04-03 14:18 ` Ashish Agarwal @ 2009-04-03 14:46 ` Jon Harrop 2009-04-03 15:03 ` Daniel Bünzli 0 siblings, 1 reply; 47+ messages in thread From: Jon Harrop @ 2009-04-03 14:46 UTC (permalink / raw) To: caml-list On Friday 03 April 2009 15:18:51 Ashish Agarwal wrote: > > I found immutable strings to be a PITA > > in what way? Just because my OCaml programs were mutating strings and translating that into F# is non-trivial if the string is shared or big. In essence, I've always used OCaml's strings as a more efficient byte array. In fact, the best translation to F# is often to use byte arrays as a replacement for strings. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Caml-list] Strings 2009-04-03 14:46 ` Jon Harrop @ 2009-04-03 15:03 ` Daniel Bünzli 2009-04-03 16:52 ` Martin Jambon 2009-04-04 10:20 ` Jon Harrop 0 siblings, 2 replies; 47+ messages in thread From: Daniel Bünzli @ 2009-04-03 15:03 UTC (permalink / raw) To: OCaml List Le 3 avr. 09 à 16:46, Jon Harrop a écrit : > Just because my OCaml programs were mutating strings and translating > that into > F# is non-trivial if the string is shared or big. In essence, I've > always > used OCaml's strings as a more efficient byte array. In fact, the best > translation to F# is often to use byte arrays as a replacement for > strings. So immutable strings are not a "PITA" you are just using them for something they should not be taken for (mutable byte arrays). Daniel ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Caml-list] Strings 2009-04-03 15:03 ` Daniel Bünzli @ 2009-04-03 16:52 ` Martin Jambon 2009-04-03 17:50 ` Daniel Bünzli ` (2 more replies) 2009-04-04 10:20 ` Jon Harrop 1 sibling, 3 replies; 47+ messages in thread From: Martin Jambon @ 2009-04-03 16:52 UTC (permalink / raw) To: Daniel Bünzli; +Cc: OCaml List Daniel Bünzli wrote: > > Le 3 avr. 09 à 16:46, Jon Harrop a écrit : > >> Just because my OCaml programs were mutating strings and translating >> that into >> F# is non-trivial if the string is shared or big. In essence, I've always >> used OCaml's strings as a more efficient byte array. In fact, the best >> translation to F# is often to use byte arrays as a replacement for >> strings. > > So immutable strings are not a "PITA" you are just using them for > something they should not be taken for (mutable byte arrays). I love this recurrent discussion! Here is my firm point of view which hasn't changed over the years and hundreds of millions documents processed: - I see absolutely no practical advantage of having an immutable "character string" type. - There is nothing to change in OCaml's string type because it is an "array of bytes", with type char representing single bytes. Martin -- http://mjambon.com/ ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Caml-list] Strings 2009-04-03 16:52 ` Martin Jambon @ 2009-04-03 17:50 ` Daniel Bünzli 2009-04-03 19:46 ` Paolo Donadeo ` (3 more replies) 2009-04-03 18:24 ` Florian Hars 2009-04-03 20:34 ` Arnaud Spiwack 2 siblings, 4 replies; 47+ messages in thread From: Daniel Bünzli @ 2009-04-03 17:50 UTC (permalink / raw) To: OCaml List Le 3 avr. 09 à 18:52, Martin Jambon a écrit : > I love this recurrent discussion! I love your carefully argumented response ! > - I see absolutely no practical advantage of having an immutable > "character > string" type. In fact I find the result of the following sequence of operations very disappointing for a functional programming language : Objective Caml version 3.11.0 # Sys.os_type;; - : string = "Unix" # let s = Sys.os_type;; val s : string = "Unix" # s.[0] <- 'a';; - : unit = () # Sys.os_type;; - : string = "anix" I think it is a design error to conflate strings and byte arrays. You clearly want both, but each with its own type and strings as immutable. Individual character mutability is rarely needed in text processing and having immutable strings avoids the kind of quirks as seen above. You'll think that's a marginal example, but that actually happens in practice. For example in xmlm when I return a signal for a start tag I do not String.copy the tag name to avoid allocating too much. Thus in the documentation there's the following ugly advice : "The module assumes strings are immutable, thus strings the client gives or receives during the input and output process must not be modified." And if you don't follow the advice and mutate the tag's name before the end tag was parsed (or output) you'll get a tag mismatch error even though the document (or the output) is perfectly valid. Having immutable strings would not rely on the client for correctness of operation and that's always an advantage. Of course you'll tell me just use String.copy inside xmlm et voilà, but then you traded correctness for performance in a case where you could have both with immutable strings. > - There is nothing to change in OCaml's string type because it is an > "array of > bytes", with type char representing single bytes. Oh no, there's nothing to change at all, that's a perfect implementation of byte arrays. You just want another type for immutable strings. Best, Daniel ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Caml-list] Strings 2009-04-03 17:50 ` Daniel Bünzli @ 2009-04-03 19:46 ` Paolo Donadeo 2009-04-03 20:41 ` Harrison, John R 2009-04-04 10:13 ` Jon Harrop 2009-04-03 21:44 ` Goswin von Brederlow ` (2 subsequent siblings) 3 siblings, 2 replies; 47+ messages in thread From: Paolo Donadeo @ 2009-04-03 19:46 UTC (permalink / raw) To: OCaml mailing list > You clearly want both, but each with its own type and strings as immutable. > Individual character mutability is rarely needed in text processing I can agree with you on this argument, but a question still remains: why should you ever do things like: > # s.[0] <- 'a';; Regards, -- Paolo ~ ~ :wq ^ permalink raw reply [flat|nested] 47+ messages in thread
* RE: [Caml-list] Strings 2009-04-03 19:46 ` Paolo Donadeo @ 2009-04-03 20:41 ` Harrison, John R 2009-04-04 10:11 ` Jon Harrop 2009-04-04 10:13 ` Jon Harrop 1 sibling, 1 reply; 47+ messages in thread From: Harrison, John R @ 2009-04-03 20:41 UTC (permalink / raw) To: Paolo Donadeo; +Cc: OCaml | I can agree with you on this argument, but a question still remains: | why should you ever do things like: | | > # s.[0] <- 'a';; The point is that it might not be your own code that does it, but a function written by someone else to which you innocently pass a string argument. You may think you're writing purely functional code, but unless you've checked all the other functions you're using that accept or return strings, who knows? Even if you aren't a doctrinaire functional programmer, the use of mutable strings as the default weakens the abstraction boundary provided by functions and procedures, and surely everyone agrees on the importance of that. John. ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Caml-list] Strings 2009-04-03 20:41 ` Harrison, John R @ 2009-04-04 10:11 ` Jon Harrop 2009-04-04 11:12 ` David Teller 0 siblings, 1 reply; 47+ messages in thread From: Jon Harrop @ 2009-04-04 10:11 UTC (permalink / raw) To: caml-list On Friday 03 April 2009 21:41:26 Harrison, John R wrote: > | I can agree with you on this argument, but a question still remains: > | > | why should you ever do things like: > | > # s.[0] <- 'a';; > > The point is that it might not be your own code that does it, but a > function written by someone else to which you innocently pass a string > argument. You may think you're writing purely functional code... Why? This data structure is a mutable array of bytes and should be treated as such. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Caml-list] Strings 2009-04-04 10:11 ` Jon Harrop @ 2009-04-04 11:12 ` David Teller 2009-04-04 11:40 ` Jon Harrop 2009-04-18 12:31 ` Arkady Andrukonis 0 siblings, 2 replies; 47+ messages in thread From: David Teller @ 2009-04-04 11:12 UTC (permalink / raw) To: Jon Harrop; +Cc: caml-list The bad thing is that, whenever you have to return text in an otherwise functional program, you need to enter "mutable array of bytes" land. You can't just assume that the user isn't going to modify that string, because, they can, possibly by accident, and any invariant relying on the fact that your strings can't change are going to be broken. In particular, any StringSet, any StringMap, etc. Cheers, David On Sat, 2009-04-04 at 11:11 +0100, Jon Harrop wrote: > Why? This data structure is a mutable array of bytes and should be treated as > such. > ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Caml-list] Strings 2009-04-04 11:12 ` David Teller @ 2009-04-04 11:40 ` Jon Harrop 2009-04-04 12:34 ` David Rajchenbach-Teller 2009-04-18 12:31 ` Arkady Andrukonis 1 sibling, 1 reply; 47+ messages in thread From: Jon Harrop @ 2009-04-04 11:40 UTC (permalink / raw) To: caml-list On Saturday 04 April 2009 12:12:52 David Teller wrote: > The bad thing is that, whenever you have to return text in an otherwise > functional program, you need to enter "mutable array of bytes" land. You > can't just assume that the user isn't going to modify that string, > because, they can, possibly by accident, and any invariant relying on > the fact that your strings can't change are going to be broken. In > particular, any StringSet, any StringMap, etc. Sure but that is no different to arrays and an ArraySet, ArrayMap etc. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Caml-list] Strings 2009-04-04 11:40 ` Jon Harrop @ 2009-04-04 12:34 ` David Rajchenbach-Teller 0 siblings, 0 replies; 47+ messages in thread From: David Rajchenbach-Teller @ 2009-04-04 12:34 UTC (permalink / raw) To: Jon Harrop; +Cc: caml-list On Sat, 2009-04-04 at 12:40 +0100, Jon Harrop wrote: > On Saturday 04 April 2009 12:12:52 David Teller wrote: > > The bad thing is that, whenever you have to return text in an otherwise > > functional program, you need to enter "mutable array of bytes" land. You > > can't just assume that the user isn't going to modify that string, > > because, they can, possibly by accident, and any invariant relying on > > the fact that your strings can't change are going to be broken. In > > particular, any StringSet, any StringMap, etc. > > Sure but that is no different to arrays and an ArraySet, ArrayMap etc. Of course. The thing is that when I'm using arrays, I'm expecting to enter mutable-land. When I'm manipulating text, most of the time, I don't. Cheers, David ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Caml-list] Strings 2009-04-04 11:12 ` David Teller 2009-04-04 11:40 ` Jon Harrop @ 2009-04-18 12:31 ` Arkady Andrukonis 1 sibling, 0 replies; 47+ messages in thread From: Arkady Andrukonis @ 2009-04-18 12:31 UTC (permalink / raw) To: Caml-list Hi, It appears that Java sidesteps this issue by making string classes final, so that whenever a string object gets modified a new copy of it is generated, giving the appearance of mutable strings, but not really so. It seems Caml is on the right track with immutable strings, we have work-around methods for those occasions where relegating a state change into a confined area is a workable solution. Otherwise the benefit of functional programming would be lost. kadee --- On Sat, 4/4/09, David Teller <David.Teller@univ-orleans.fr> wrote: > From: David Teller <David.Teller@univ-orleans.fr> > Subject: Re: [Caml-list] Strings > To: "Jon Harrop" <jon@ffconsultancy.com> > Cc: caml-list@yquem.inria.fr > Date: Saturday, April 4, 2009, 7:12 AM > The bad thing is that, whenever you have to return text in > an otherwise > functional program, you need to enter "mutable array > of bytes" land. You > can't just assume that the user isn't going to > modify that string, > because, they can, possibly by accident, and any invariant > relying on > the fact that your strings can't change are going to be > broken. In > particular, any StringSet, any StringMap, etc. > > Cheers, > David > > On Sat, 2009-04-04 at 11:11 +0100, Jon Harrop wrote: > > Why? This data structure is a mutable array of bytes > and should be treated as > > such. > > > > _______________________________________________ > Caml-list mailing list. Subscription management: > http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list > Archives: http://caml.inria.fr > Beginner's list: > http://groups.yahoo.com/group/ocaml_beginners > Bug reports: http://caml.inria.fr/bin/caml-bugs ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Caml-list] Strings 2009-04-03 19:46 ` Paolo Donadeo 2009-04-03 20:41 ` Harrison, John R @ 2009-04-04 10:13 ` Jon Harrop 1 sibling, 0 replies; 47+ messages in thread From: Jon Harrop @ 2009-04-04 10:13 UTC (permalink / raw) To: caml-list On Friday 03 April 2009 20:46:26 Paolo Donadeo wrote: > > You clearly want both, but each with its own type and strings as > > immutable. Individual character mutability is rarely needed in text > > processing > > I can agree with you on this argument, but a question still remains: > > why should you ever do things like: > > # s.[0] <- 'a';; Exactly. I find that to be useful and not a source of errors. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Caml-list] Strings 2009-04-03 17:50 ` Daniel Bünzli 2009-04-03 19:46 ` Paolo Donadeo @ 2009-04-03 21:44 ` Goswin von Brederlow 2009-04-04 9:10 ` David Rajchenbach-Teller 2009-04-04 17:11 ` [Caml-list] Strings Kuba Ober 2009-04-05 20:54 ` Richard Jones 3 siblings, 1 reply; 47+ messages in thread From: Goswin von Brederlow @ 2009-04-03 21:44 UTC (permalink / raw) To: Daniel Buenzli; +Cc: OCaml List Daniel Bünzli <daniel.buenzli@erratique.ch> writes: > Le 3 avr. 09 à 18:52, Martin Jambon a écrit : > >> I love this recurrent discussion! > > I love your carefully argumented response ! > >> - I see absolutely no practical advantage of having an immutable >> "character >> string" type. > > In fact I find the result of the following sequence of operations very > disappointing for a functional programming language : > > Objective Caml version 3.11.0 > > # Sys.os_type;; > - : string = "Unix" > # let s = Sys.os_type;; > val s : string = "Unix" > # s.[0] <- 'a';; > - : unit = () > # Sys.os_type;; > - : string = "anix" > > I think it is a design error to conflate strings and byte arrays. You > clearly want both, but each with its own type and strings as > immutable. Individual character mutability is rarely needed in text > processing and having immutable strings avoids the kind of quirks as > seen above. I think that is a design flaw in Sys. Strings are mutable. The os_type is a constant. It should not hand out mutable access to a constant. With the current string module a better way would be to return a copy of os_type on each invocation. Drawback there is that then Sys.os_type () != Sys.os_type () > You'll think that's a marginal example, but that actually happens in > practice. For example in xmlm when I return a signal for a start tag I > do not String.copy the tag name to avoid allocating too much. Thus in > the documentation there's the following ugly advice : > > "The module assumes strings are immutable, thus strings the client > gives or receives during the input and output process must not be > modified." > > And if you don't follow the advice and mutate the tag's name before > the end tag was parsed (or output) you'll get a tag mismatch error > even though the document (or the output) is perfectly valid. > > Having immutable strings would not rely on the client for correctness > of operation and that's always an advantage. Of course you'll tell me > just use String.copy inside xmlm et voilà, but then you traded > correctness for performance in a case where you could have both with > immutable strings. This is not just a problem for strings. Any data type can suffer the same. >> - There is nothing to change in OCaml's string type because it is an >> "array of >> bytes", with type char representing single bytes. > > > Oh no, there's nothing to change at all, that's a perfect > implementation of byte arrays. You just want another type for > immutable strings. > > Best, > > Daniel It wouldn't be too hard to change the string module to allow for both mutable and immutable strings: module S : sig type const type mutabl type 'a t val make : string -> mutabl t val set : mutabl t -> int -> char -> unit val get : 'a t -> int -> char val const : 'a t -> const t val print : 'a t -> unit end = struct type const type mutabl type 'a t = string let make s = s let set = String.set let get = String.get let const s = s let print = print_string end let str = S.make "hallo" in S.set str 0 'H'; S.print str let str = S.const (S.make "hallo") in S.set str 0 'H'; S.print str ^^^ Error: This expression has type S.const S.t but is here used with type S.mutabl S.t By adding a phantom type the type system can keep track of where a string is mutable and where not. The only restriction is that "const" does not mean the string will not change. It only means that that reference to the string can not change it: # let str = S.make "hallo" in let str2 = S.const str in S.set str 0 'H'; S.print str2;; Hallo- : unit = () If you let a mutable reference to the string escape and then assume it remains const that is your problem. Easily avoidable in a library or module. MfG Goswin ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Caml-list] Strings 2009-04-03 21:44 ` Goswin von Brederlow @ 2009-04-04 9:10 ` David Rajchenbach-Teller 2009-04-05 10:06 ` Strings Zheng Li 0 siblings, 1 reply; 47+ messages in thread From: David Rajchenbach-Teller @ 2009-04-04 9:10 UTC (permalink / raw) To: Goswin von Brederlow; +Cc: Daniel Buenzli, OCaml List On Fri, 2009-04-03 at 23:44 +0200, Goswin von Brederlow wrote: > It wouldn't be too hard to change the string module to allow for both > mutable and immutable strings: [...] Done in Batteries. # "foo";; (*OCaml base string*) - : string = "foo" # ro"foo";;(*Read-only string*) - : [ `Read ] Batteries.String.Cap.t = ro"foo" Cheers, David ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: Strings 2009-04-04 9:10 ` David Rajchenbach-Teller @ 2009-04-05 10:06 ` Zheng Li 2009-04-06 9:20 ` Strings David Rajchenbach-Teller 0 siblings, 1 reply; 47+ messages in thread From: Zheng Li @ 2009-04-05 10:06 UTC (permalink / raw) To: David Rajchenbach-Teller; +Cc: Goswin von Brederlow, Daniel Buenzli, OCaml List On 4/4/2009 11:10 AM, David Rajchenbach-Teller wrote: > > On Fri, 2009-04-03 at 23:44 +0200, Goswin von Brederlow wrote: >> It wouldn't be too hard to change the string module to allow for both >> mutable and immutable strings: > > [...] > > Done in Batteries. > > # "foo";; (*OCaml base string*) > - : string = "foo" > # ro"foo";;(*Read-only string*) > - : [ `Read ] Batteries.String.Cap.t = ro"foo" With phantom type alone, the abstraction can still leak. ---- # open Batteries.String.Cap;; # let s = "asdf";; val s : string = "asdf" # let ro_s = read_only (of_string s);; val ro_s : [ `Read ] Batteries.String.Cap.t = ro"asdf" # s.[3] <- 'z';; # ro_s;; - : [ `Read ] Batteries.String.Cap.t = ro"asdz" ---- Well, this example is artificial. A more intuitive example would be (Look how similar it is with the previous os_type string example !): ---- # let ro_s = ro"asdf";; val ro_s : [ `Read ] Batteries.String.Cap.t = ro"asdf" # (to_string ro_s).[3] <- 'z';; .... ---- however, with no covariants defined in batteries' String.Cap.t type (why?), the second example won't compile. The compiler simply doesn't allow me to print out a read-only string, nor does it allow many reasonable things like <<join ro"asdf" ro"jkl">> etc. I understand that the phantom trick can't guarantee the immutability in a whole since the underlying data type (if exposed) can still be mutable. However, String.Cap.t is not dealing with arbitrary polymorphic types here! it deals with just the specific primitive string type, and controlling capacity over standard string is the only thing it intend to do. String.copy on the border might solve this problem, even though there could still be other issues left. All the best -- Zheng ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: Strings 2009-04-05 10:06 ` Strings Zheng Li @ 2009-04-06 9:20 ` David Rajchenbach-Teller 2009-04-06 10:07 ` Strings Goswin von Brederlow 2009-04-06 11:03 ` Strings Zheng Li 0 siblings, 2 replies; 47+ messages in thread From: David Rajchenbach-Teller @ 2009-04-06 9:20 UTC (permalink / raw) To: Zheng Li; +Cc: Goswin von Brederlow, Daniel Buenzli, OCaml List On Sun, 2009-04-05 at 12:06 +0200, Zheng Li wrote: > With phantom type alone, the abstraction can still leak. Ok, it's actually not *immutable* strings but *read-only* strings. The objective is to be able to distribute a text and make sure that the client won't be able to modify it. The original owner can still decide to enter "mutable-land" if they need it. > > ---- > # let ro_s = ro"asdf";; > val ro_s : [ `Read ] Batteries.String.Cap.t = ro"asdf" > # (to_string ro_s).[3] <- 'z';; > .... > ---- > > however, with no covariants defined in batteries' String.Cap.t type > (why?), the second example won't compile. That's by design: [to_string] is the identity operation and it only applies to strings for which you have both read and write capabilities. If you wish to do what I believe you have in mind, you need to first [copy] the string. > The compiler simply doesn't > allow me to print out a read-only string, nor does it allow many > reasonable things like <<join ro"asdf" ro"jkl">> etc. Er, how can you "not print out a read-only string"? And for your [join] problem, well, works for me (as soon as you remember that it takes as second argument a *list* of readable strings). # print stdout (join ro";" [ro"1"; ro"2"; ro"3"]);; 1;2;3- : unit Cheers, David ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: Strings 2009-04-06 9:20 ` Strings David Rajchenbach-Teller @ 2009-04-06 10:07 ` Goswin von Brederlow 2009-04-06 11:03 ` Strings Zheng Li 1 sibling, 0 replies; 47+ messages in thread From: Goswin von Brederlow @ 2009-04-06 10:07 UTC (permalink / raw) To: David Rajchenbach-Teller Cc: Zheng Li, Goswin von Brederlow, Daniel Buenzli, OCaml List David Rajchenbach-Teller <David.Teller@ens-lyon.org> writes: > On Sun, 2009-04-05 at 12:06 +0200, Zheng Li wrote: >> With phantom type alone, the abstraction can still leak. > > Ok, it's actually not *immutable* strings but *read-only* strings. The > objective is to be able to distribute a text and make sure that the > client won't be able to modify it. The original owner can still decide > to enter "mutable-land" if they need it. Maybe there should be another capability `Const that can only be gained when creating/copying a string. let make_const : 'a string -> [`Const; `Read] string = String.copy Those would be truely immutable. The use would be for strings passed to libraries that the library wants to ensure are immutable so it can avoid copying them. MfG Goswin ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: Strings 2009-04-06 9:20 ` Strings David Rajchenbach-Teller 2009-04-06 10:07 ` Strings Goswin von Brederlow @ 2009-04-06 11:03 ` Zheng Li 1 sibling, 0 replies; 47+ messages in thread From: Zheng Li @ 2009-04-06 11:03 UTC (permalink / raw) To: David Rajchenbach-Teller; +Cc: Daniel Buenzli, OCaml List On 4/6/2009 11:20 AM, David Rajchenbach-Teller wrote: > Ok, it's actually not *immutable* strings but *read-only* strings. The > objective is to be able to distribute a text and make sure that the > client won't be able to modify it.The original owner can still decide > to enter "mutable-land" if they need it. Fair enough. But then users of the library still have to hard copy a batteries string to ensure it won't change in the future (which is the starting topic of the this thread). > That's by design: [to_string] is the identity operation and it only > applies to strings for which you have both read and write capabilities. > If you wish to do what I believe you have in mind, you need to first > [copy] the string. OK. >> The compiler simply doesn't >> allow me to print out a read-only string, nor does it allow many >> reasonable things like<<join ro"asdf" ro"jkl">> etc. > > Er, how can you "not print out a read-only string"? > > And for your [join] problem, well, works for me (as soon as you remember > that it takes as second argument a *list* of readable strings). > > # print stdout (join ro";" [ro"1"; ro"2"; ro"3"]);; > 1;2;3- : unit My mistake. Regards -- Zheng ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Caml-list] Strings 2009-04-03 17:50 ` Daniel Bünzli 2009-04-03 19:46 ` Paolo Donadeo 2009-04-03 21:44 ` Goswin von Brederlow @ 2009-04-04 17:11 ` Kuba Ober 2009-04-04 17:26 ` Jon Harrop 2009-04-05 20:54 ` Richard Jones 3 siblings, 1 reply; 47+ messages in thread From: Kuba Ober @ 2009-04-04 17:11 UTC (permalink / raw) To: caml-list On Apr 3, 2009, at 1:50 PM, Daniel Bünzli wrote: > Le 3 avr. 09 à 18:52, Martin Jambon a écrit : >> - I see absolutely no practical advantage of having an immutable >> "character >> string" type. > > In fact I find the result of the following sequence of operations > very disappointing for a functional programming language : > > Objective Caml version 3.11.0 > > # Sys.os_type;; > - : string = "Unix" > # let s = Sys.os_type;; > val s : string = "Unix" > # s.[0] <- 'a';; > - : unit = () > # Sys.os_type;; > - : string = "anix" Perhaps mutable data of any type should be markable as const, with the marking being permanent. Or maybe OCaml should have an immutable string wrapper that can be created from a mutable string? Such immutable strings would be returned where the string is not supposed to be further modified. That's where I like C++/Qt copy-on-write strings a lot. They make life easy. Cheers, Kuba ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Caml-list] Strings 2009-04-04 17:11 ` [Caml-list] Strings Kuba Ober @ 2009-04-04 17:26 ` Jon Harrop 0 siblings, 0 replies; 47+ messages in thread From: Jon Harrop @ 2009-04-04 17:26 UTC (permalink / raw) To: caml-list On Saturday 04 April 2009 18:11:20 Kuba Ober wrote: > On Apr 3, 2009, at 1:50 PM, Daniel Bünzli wrote: > > In fact I find the result of the following sequence of operations > > very disappointing for a functional programming language : > > > > Objective Caml version 3.11.0 > > > > # Sys.os_type;; > > - : string = "Unix" > > # let s = Sys.os_type;; > > val s : string = "Unix" > > # s.[0] <- 'a';; > > - : unit = () > > # Sys.os_type;; > > - : string = "anix" > > Perhaps mutable data of any type should be markable as const, with the > marking being permanent. Would a phantom type parameter be sufficient: [`RO|`RW] string > That's where I like C++/Qt copy-on-write strings a lot. They make life > easy. Just as long as they don't copy the entire string when I mutate a single char! -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Caml-list] Strings 2009-04-03 17:50 ` Daniel Bünzli ` (2 preceding siblings ...) 2009-04-04 17:11 ` [Caml-list] Strings Kuba Ober @ 2009-04-05 20:54 ` Richard Jones 2009-04-05 23:40 ` Daniel Bünzli 3 siblings, 1 reply; 47+ messages in thread From: Richard Jones @ 2009-04-05 20:54 UTC (permalink / raw) To: Daniel Bünzli; +Cc: OCaml List On Fri, Apr 03, 2009 at 07:50:48PM +0200, Daniel Bünzli wrote: > Having immutable strings would not rely on the client for correctness > of operation and that's always an advantage. Of course you'll tell me > just use String.copy inside xmlm et voilà, but then you traded > correctness for performance in a case where you could have both with > immutable strings. Could you not just define a new module which is String, but with a restricted interface (removing all the functions that can modify strings)? This is essentially what Batteries does, along with a bit of camlp4 [I think] to provide a rope construction syntax. Rich. -- Richard Jones Red Hat ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Caml-list] Strings 2009-04-05 20:54 ` Richard Jones @ 2009-04-05 23:40 ` Daniel Bünzli 0 siblings, 0 replies; 47+ messages in thread From: Daniel Bünzli @ 2009-04-05 23:40 UTC (permalink / raw) To: Richard Jones; +Cc: OCaml List Le 5 avr. 09 à 22:54, Richard Jones a écrit : > Could you not just define a new module which is String, but with a > restricted interface (removing all the functions that can modify > strings)? Yes. But I think that at a certain point the client will certainly want a regular string and to convert from this restricted representation to a string the module will have to String.copy. Best, Daniel P.S. If you have an immutable string implementation you can use the functor Xmlm.Make to use them. ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Caml-list] Strings 2009-04-03 16:52 ` Martin Jambon 2009-04-03 17:50 ` Daniel Bünzli @ 2009-04-03 18:24 ` Florian Hars 2009-04-03 20:34 ` Arnaud Spiwack 2 siblings, 0 replies; 47+ messages in thread From: Florian Hars @ 2009-04-03 18:24 UTC (permalink / raw) To: Martin Jambon; +Cc: Daniel Bünzli, OCaml List Martin Jambon wrote: > - There is nothing to change in OCaml's string type because it is an "array of > bytes", with type char representing single bytes. $ ocaml Objective Caml version 3.10.2 # let c = 'ö';; Characters 8-9: let c = 'ö';; ^ Syntax error # Meanwhile, in another terminal window: $ ocaml Objective Caml version 3.10.2 # let c = 'ö';; val c : char = '\246' # I prefer my strings to be composed of chars that represent characters. - Florian. ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Caml-list] Strings 2009-04-03 16:52 ` Martin Jambon 2009-04-03 17:50 ` Daniel Bünzli 2009-04-03 18:24 ` Florian Hars @ 2009-04-03 20:34 ` Arnaud Spiwack 2 siblings, 0 replies; 47+ messages in thread From: Arnaud Spiwack @ 2009-04-03 20:34 UTC (permalink / raw) To: Martin Jambon; +Cc: Daniel Bünzli, OCaml List If my experience is worth anything, I've had hardly any use of mutable character strings. Many of immutable ones. The practical advantage of having immutable character strings is the same as that of having immutable integer data : when you do not need to mutate data of a type, you'd better keep this type immutable (also internal representation of the type might very well vary seriously depending on it being mutable or not). Notice that with pa_do, I guess I can now define easily this type while using a string syntax for them (which is good). Thus Arnaud Spiwack Martin Jambon a écrit : > I love this recurrent discussion! > > Here is my firm point of view which hasn't changed over the years and hundreds > of millions documents processed: > > - I see absolutely no practical advantage of having an immutable "character > string" type. > > - There is nothing to change in OCaml's string type because it is an "array of > bytes", with type char representing single bytes. > > > > > Martin > > ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Caml-list] Strings 2009-04-03 15:03 ` Daniel Bünzli 2009-04-03 16:52 ` Martin Jambon @ 2009-04-04 10:20 ` Jon Harrop 1 sibling, 0 replies; 47+ messages in thread From: Jon Harrop @ 2009-04-04 10:20 UTC (permalink / raw) To: caml-list On Friday 03 April 2009 16:03:25 Daniel Bünzli wrote: > Le 3 avr. 09 à 16:46, Jon Harrop a écrit : > > Just because my OCaml programs were mutating strings and translating > > that into > > F# is non-trivial if the string is shared or big. In essence, I've > > always > > used OCaml's strings as a more efficient byte array. In fact, the best > > translation to F# is often to use byte arrays as a replacement for > > strings. > > So immutable strings are not a "PITA" you are just using them for > something they should not be taken for (mutable byte arrays). Except that mutable byte arrays are not treated as strings, e.g. they are pretty printed as a garbled pile of bytes. Moreover, immutable strings implemented on top of arrays have awful performance characteristics. If you're going to use immutable strings then implement it as a rope... -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Caml-list] Strings 2009-04-03 11:56 Strings Jon Harrop 2009-04-03 12:25 ` [Caml-list] Strings Paolo Donadeo 2009-04-03 14:18 ` Ashish Agarwal @ 2009-04-04 9:14 ` David Rajchenbach-Teller 2009-04-04 9:26 ` Alp Mestan ` (3 more replies) 2 siblings, 4 replies; 47+ messages in thread From: David Rajchenbach-Teller @ 2009-04-04 9:14 UTC (permalink / raw) To: Jon Harrop; +Cc: caml-list On Fri, 2009-04-03 at 12:56 +0100, Jon Harrop wrote: > I read that batteries included provides first-class rope-based strings and I > was just reading up on some horror stories about immutable strings on > StackOverflow. This made me wonder what people's thoughts are about mutable > vs immutable strings? Note that Batteries provides * regular OCaml strings * strings with capabilities (i.e. strings which, depending on their type, can be read-only/write-only/read-write) -- sometimes faster than regular strings, never slower * immutable Unicode ropes. I personally can't remember the last time I've needed mutable strings in OCaml. On the other hand, I can remember a handful of times where, to return a constant string, I had to make a function that would rebuild the string at every call. Which is both needlessly slow and awkward for what looks like a constant. Cheers, David ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Caml-list] Strings 2009-04-04 9:14 ` David Rajchenbach-Teller @ 2009-04-04 9:26 ` Alp Mestan 2009-04-04 10:55 ` blue storm 2009-04-04 21:51 ` Goswin von Brederlow 2009-04-04 10:11 ` Jon Harrop ` (2 subsequent siblings) 3 siblings, 2 replies; 47+ messages in thread From: Alp Mestan @ 2009-04-04 9:26 UTC (permalink / raw) To: David Rajchenbach-Teller, caml-list [-- Attachment #1: Type: text/plain, Size: 1273 bytes --] On Sat, Apr 4, 2009 at 11:14 AM, David Rajchenbach-Teller < David.Teller@ens-lyon.org> wrote: > I personally can't remember the last time I've needed mutable strings in > OCaml. Neither do I. > On the other hand, I can remember a handful of times where, to > return a constant string, I had to make a function that would rebuild > the string at every call. Which is both needlessly slow and awkward for > what looks like a constant. > I think providing both capabilities is the best solution. However, let's study Haskell's strings. They simply are a list of characters. This let the ability to use heavily list-related functions (take, takeWhile, drop, dropWhile, map, etc.). On the other hand, OCaml's standard library lacks of many functions for strings ! I think this is too much imperative oriented. Maybe we could try to implement (for Batteries or in a separate project) string lists and then use the power of Batteries' list module with many (really many... that's a real pleasure to read its documentation) functions to work with. I think with a bit of internal laziness, we could get a great immutable string type with many functions to manipulate it. But I guess there are many cons to do so, otherwise it would have been "standardized". -- Alp Mestan [-- Attachment #2: Type: text/html, Size: 1785 bytes --] ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Caml-list] Strings 2009-04-04 9:26 ` Alp Mestan @ 2009-04-04 10:55 ` blue storm 2009-04-04 21:51 ` Goswin von Brederlow 1 sibling, 0 replies; 47+ messages in thread From: blue storm @ 2009-04-04 10:55 UTC (permalink / raw) To: caml-list On Sat, Apr 4, 2009 at 11:26 AM, Alp Mestan <alp@mestan.fr> wrote: > However, let's study Haskell's strings. > They simply are a list of characters. This let the ability to use heavily > list-related functions (take, takeWhile, drop, dropWhile, map, etc.). On the > other hand, OCaml's standard library lacks of many functions for strings ! I > think this is too much imperative oriented. Maybe we could try to implement > (for Batteries or in a separate project) string lists and then use the power > of Batteries' list module with many (really many... that's a real pleasure > to read its documentation) functions to work with. I think with a bit of > internal laziness, we could get a great immutable string type with many > functions to manipulate it. > > But I guess there are many cons to do so, otherwise it would have been > "standardized". This is actually a bad idea : strings and lists are used for different purpose. Some operations are common (takeWhile is probably a good example) but the concepts are differents (does "tail" as a primitive on a string really match your usage pattern ?) and using lists entails bad performance caracteristics (hence the need for byteString, etc., the lack of standardization among the different Haskell libraries using strings, and the general confusion resulting). In the (rare imho) cases where you exactly want a list of characters, you can use (string -> char list) (or string -> char Enum.t) conversion functions. In the immutable string land, we have much better representations (that is, implementations that match the common use case of strings better), such as ropes. Ropes provide (amortized ?) constant-time concatenation, wich I think is the killer feature. I very rarely use mutable strings, whereas in my experience string concatenation is a very common operation (more than random access). I expect performance benefits for most real programs. In some cases, byte arrays are still useful (as John Harrop noted), but I think they are anecdotical and I'm ready to explicitely convert my strings to a mutable format before performing any mutation. ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Caml-list] Strings 2009-04-04 9:26 ` Alp Mestan 2009-04-04 10:55 ` blue storm @ 2009-04-04 21:51 ` Goswin von Brederlow 2009-04-04 23:35 ` Yaron Minsky 2009-04-05 2:55 ` Jon Harrop 1 sibling, 2 replies; 47+ messages in thread From: Goswin von Brederlow @ 2009-04-04 21:51 UTC (permalink / raw) To: Alp Mestan; +Cc: David Rajchenbach-Teller, caml-list Alp Mestan <alp@mestan.fr> writes: > I think providing both capabilities is the best solution. Phantom types solvethis beautifully with not truntime penalty. GO BATTERIES! > However, let's study Haskell's strings. > They simply are a list of characters. This let the ability to use heavily > list-related functions (take, takeWhile, drop, dropWhile, map, etc.). On the The beauty of ocaml strings is that they are really compact. An ocaml string on 32bit is 5-8 bytes longer than the contained string and 9-16 bytes on 64bit. On the other hand a list of characters is 8/16 times as large. Not to mention that you have no random access. > other hand, OCaml's standard library lacks of many functions for strings ! I > think this is too much imperative oriented. Maybe we could try to implement > (for Batteries or in a separate project) string lists and then use the power > of Batteries' list module with many (really many... that's a real pleasure to > read its documentation) functions to work with. I think with a bit of internal > laziness, we could get a great immutable string type with many functions to > manipulate it. > But I guess there are many cons to do so, otherwise it would have been > "standardized". Mutable/Immutable can really nicely done with phantom types and is independent of the data structure used. It works for strings, lists, arrays, sets, trees, ... and I think all standard modules should have it. The official standard lib is rather lacking there but that is why there is Batteries. The more I hear/see of it the more I like it. As for char lists for strings you can always convert strings to char lists when that is a better representation. I've done so myself on a number of ocasions. But just as often I used strings as char arrays. Doing e.g. String.blit with char lists would be ugly. MfG Goswin ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Caml-list] Strings 2009-04-04 21:51 ` Goswin von Brederlow @ 2009-04-04 23:35 ` Yaron Minsky 2009-04-05 9:36 ` David Rajchenbach-Teller 2009-04-05 2:55 ` Jon Harrop 1 sibling, 1 reply; 47+ messages in thread From: Yaron Minsky @ 2009-04-04 23:35 UTC (permalink / raw) To: caml-list [-- Attachment #1: Type: text/plain, Size: 2140 bytes --] On Sat, Apr 4, 2009 at 5:51 PM, Goswin von Brederlow <goswin-v-b@web.de>wrote: > Mutable/Immutable can really nicely done with phantom types and is > independent of the data structure used. It works for strings, lists, > arrays, sets, trees, ... and I think all standard modules should have > it. The official standard lib is rather lacking there but that is why > there is Batteries. The more I hear/see of it the more I like it. On this note, there's a small variation on this idea that we've experimented with at Jane Street that I think is worth mentioning. When people do this kind of thing, they usually have two phantom tags, "immutable" and "mutable", but, there is another natural one to add: "readonly". A mutable value is, as one would expect, a value that can be modified; and an immutable value is one that can not be modified. A readonly value is one that can not be modified, but that might change nonetheless because somewhere else in the program there is a mutable handle to the same underlying value. You implement this by making the interface for immutable and readonly values identical, except that a readonly value can be created from a mutable value without copying. Mutable and immutable are typically the most important types of access control to implement, but readonly can be useful in special cases, where you have a value that you really want to mutate, but that you want to control where in the program the mutation can happen. There's one extra tradeoff worth mentioning with using phantom types for controlling mutability, which is that values made immutable in this way are not quite as good as values that are immutable at the lowest level. The reason is that ordinary immutable values are understood by the type system as such, and this goes into the compiler's understanding of the value restriction and variance. This means there are cases where, say, a phantom-immutable list won't work, but an ordinary immutable list will. (For those interested in a short and elementary tutorial on how to implement access control with phantom types, look here: http://ocaml.janestreet.com/?q=node/11) y [-- Attachment #2: Type: text/html, Size: 2552 bytes --] ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Caml-list] Strings 2009-04-04 23:35 ` Yaron Minsky @ 2009-04-05 9:36 ` David Rajchenbach-Teller 2009-04-05 10:08 ` Alp Mestan 2009-04-05 21:40 ` Goswin von Brederlow 0 siblings, 2 replies; 47+ messages in thread From: David Rajchenbach-Teller @ 2009-04-05 9:36 UTC (permalink / raw) To: yminsky; +Cc: caml-list On Sat, 2009-04-04 at 19:35 -0400, Yaron Minsky wrote: > On Sat, Apr 4, 2009 at 5:51 PM, Goswin von Brederlow > <goswin-v-b@web.de> wrote: > Mutable/Immutable can really nicely done with phantom types > and is > independent of the data structure used. It works for strings, > lists, > arrays, sets, trees, ... and I think all standard modules > should have > it. The official standard lib is rather lacking there but that > is why > there is Batteries. The more I hear/see of it the more I like > it. > > On this note, there's a small variation on this idea that we've > experimented with at Jane Street that I think is worth mentioning. > When people do this kind of thing, they usually have two phantom tags, > "immutable" and "mutable", but, there is another natural one to add: > "readonly". Actually, thats exactly what we have for strings and arrays in Batteries. Cheers, David > ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Caml-list] Strings 2009-04-05 9:36 ` David Rajchenbach-Teller @ 2009-04-05 10:08 ` Alp Mestan 2009-04-05 21:41 ` Goswin von Brederlow 2009-04-05 21:40 ` Goswin von Brederlow 1 sibling, 1 reply; 47+ messages in thread From: Alp Mestan @ 2009-04-05 10:08 UTC (permalink / raw) To: David Rajchenbach-Teller, caml-list [-- Attachment #1: Type: text/plain, Size: 73 bytes --] By the way, what would be the use case of lazy strings ? -- Alp Mestan [-- Attachment #2: Type: text/html, Size: 98 bytes --] ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Caml-list] Strings 2009-04-05 10:08 ` Alp Mestan @ 2009-04-05 21:41 ` Goswin von Brederlow 0 siblings, 0 replies; 47+ messages in thread From: Goswin von Brederlow @ 2009-04-05 21:41 UTC (permalink / raw) To: Alp Mestan; +Cc: David Rajchenbach-Teller, caml-list Alp Mestan <alp@mestan.fr> writes: > By the way, what would be the use case of lazy strings ? Strings larger than available memory or where only some small part of the string are used at any one execution. MfG Goswin ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Caml-list] Strings 2009-04-05 9:36 ` David Rajchenbach-Teller 2009-04-05 10:08 ` Alp Mestan @ 2009-04-05 21:40 ` Goswin von Brederlow 1 sibling, 0 replies; 47+ messages in thread From: Goswin von Brederlow @ 2009-04-05 21:40 UTC (permalink / raw) To: David Rajchenbach-Teller; +Cc: yminsky, caml-list David Rajchenbach-Teller <David.Teller@ens-lyon.org> writes: > On Sat, 2009-04-04 at 19:35 -0400, Yaron Minsky wrote: >> On Sat, Apr 4, 2009 at 5:51 PM, Goswin von Brederlow >> <goswin-v-b@web.de> wrote: >> Mutable/Immutable can really nicely done with phantom types >> and is >> independent of the data structure used. It works for strings, >> lists, >> arrays, sets, trees, ... and I think all standard modules >> should have >> it. The official standard lib is rather lacking there but that >> is why >> there is Batteries. The more I hear/see of it the more I like >> it. >> >> On this note, there's a small variation on this idea that we've >> experimented with at Jane Street that I think is worth mentioning. >> When people do this kind of thing, they usually have two phantom tags, >> "immutable" and "mutable", but, there is another natural one to add: >> "readonly". > > Actually, thats exactly what we have for strings and arrays in > Batteries. > > Cheers, > David Where do you have an immutable string? One where you are sure that nobody can hold a [> `Write] reference to the string. I only saw the read and write phantom flags. MfG Goswin ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Caml-list] Strings 2009-04-04 21:51 ` Goswin von Brederlow 2009-04-04 23:35 ` Yaron Minsky @ 2009-04-05 2:55 ` Jon Harrop 2009-04-05 4:22 ` Edgar Friendly 2009-04-05 6:57 ` Goswin von Brederlow 1 sibling, 2 replies; 47+ messages in thread From: Jon Harrop @ 2009-04-05 2:55 UTC (permalink / raw) To: caml-list On Saturday 04 April 2009 22:51:50 Goswin von Brederlow wrote: > The beauty of ocaml strings is that they are really compact. An ocaml > string on 32bit is 5-8 bytes longer than the contained string and 9-16 > bytes on 64bit. The ugliness is that 16Mb limit. I assume those limits have been removed in batteries?! > As for char lists for strings you can always convert strings to char > lists when that is a better representation. I would rather use a view pattern to create substrings efficiently. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Caml-list] Strings 2009-04-05 2:55 ` Jon Harrop @ 2009-04-05 4:22 ` Edgar Friendly 2009-04-05 7:03 ` Goswin von Brederlow 2009-04-05 6:57 ` Goswin von Brederlow 1 sibling, 1 reply; 47+ messages in thread From: Edgar Friendly @ 2009-04-05 4:22 UTC (permalink / raw) To: Jon Harrop; +Cc: caml-list Jon Harrop wrote: > On Saturday 04 April 2009 22:51:50 Goswin von Brederlow wrote: >> The beauty of ocaml strings is that they are really compact. An ocaml >> string on 32bit is 5-8 bytes longer than the contained string and 9-16 >> bytes on 64bit. > > The ugliness is that 16Mb limit. I assume those limits have been removed in > batteries?! > Our implementation of ropes (well, really Mauricio Fernandez's) has a limit of ~700MB in 32-bit and 220GB in 64-bit. Users that need more than that will have to recompile batteries with a larger 'Rope.max-height' constant. E. ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Caml-list] Strings 2009-04-05 4:22 ` Edgar Friendly @ 2009-04-05 7:03 ` Goswin von Brederlow 0 siblings, 0 replies; 47+ messages in thread From: Goswin von Brederlow @ 2009-04-05 7:03 UTC (permalink / raw) To: Edgar Friendly; +Cc: Jon Harrop, caml-list Edgar Friendly <thelema314@gmail.com> writes: > Jon Harrop wrote: >> On Saturday 04 April 2009 22:51:50 Goswin von Brederlow wrote: >>> The beauty of ocaml strings is that they are really compact. An ocaml >>> string on 32bit is 5-8 bytes longer than the contained string and 9-16 >>> bytes on 64bit. >> >> The ugliness is that 16Mb limit. I assume those limits have been removed in >> batteries?! >> > Our implementation of ropes (well, really Mauricio Fernandez's) has a > limit of ~700MB in 32-bit and 220GB in 64-bit. Users that need more > than that will have to recompile batteries with a larger > 'Rope.max-height' constant. > > E. # Sys.max_string_length;; - : int = 144115188075855863 Why is the limit for rRope 220GB in 64bit? MfG Goswin ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Caml-list] Strings 2009-04-05 2:55 ` Jon Harrop 2009-04-05 4:22 ` Edgar Friendly @ 2009-04-05 6:57 ` Goswin von Brederlow 2009-04-05 7:11 ` Jon Harrop 1 sibling, 1 reply; 47+ messages in thread From: Goswin von Brederlow @ 2009-04-05 6:57 UTC (permalink / raw) To: Jon Harrop; +Cc: caml-list Jon Harrop <jon@ffconsultancy.com> writes: > On Saturday 04 April 2009 22:51:50 Goswin von Brederlow wrote: >> The beauty of ocaml strings is that they are really compact. An ocaml >> string on 32bit is 5-8 bytes longer than the contained string and 9-16 >> bytes on 64bit. > > The ugliness is that 16Mb limit. I assume those limits have been removed in > batteries?! As that is a limitation of the GC memory structure of a string there is nothing batteries can do there. Not for the basic type string. It is also just 32bit that is so severly limited. MfG Goswin ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Caml-list] Strings 2009-04-05 6:57 ` Goswin von Brederlow @ 2009-04-05 7:11 ` Jon Harrop 0 siblings, 0 replies; 47+ messages in thread From: Jon Harrop @ 2009-04-05 7:11 UTC (permalink / raw) To: caml-list On Sunday 05 April 2009 07:57:56 Goswin von Brederlow wrote: > Jon Harrop <jon@ffconsultancy.com> writes: > > On Saturday 04 April 2009 22:51:50 Goswin von Brederlow wrote: > >> The beauty of ocaml strings is that they are really compact. An ocaml > >> string on 32bit is 5-8 bytes longer than the contained string and 9-16 > >> bytes on 64bit. > > > > The ugliness is that 16Mb limit. I assume those limits have been removed > > in batteries?! > > As that is a limitation of the GC memory structure of a string there > is nothing batteries can do there. Not for the basic type string. It > is also just 32bit that is so severly limited. Batteries could replace all uses of the built-in string type with another type, such as a byte big array, just as they have done to implement ropes. I don't think they could address the array limitation though, because that is polymorphic. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Caml-list] Strings 2009-04-04 9:14 ` David Rajchenbach-Teller 2009-04-04 9:26 ` Alp Mestan @ 2009-04-04 10:11 ` Jon Harrop 2009-04-04 21:39 ` Goswin von Brederlow 2009-04-05 7:14 ` Romain Beauxis 3 siblings, 0 replies; 47+ messages in thread From: Jon Harrop @ 2009-04-04 10:11 UTC (permalink / raw) To: caml-list On Saturday 04 April 2009 10:14:34 David Rajchenbach-Teller wrote: > Note that Batteries provides > * regular OCaml strings > * strings with capabilities (i.e. strings which, depending on their > type, can be read-only/write-only/read-write) -- sometimes faster than > regular strings, never slower > * immutable Unicode ropes. Wow! -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Caml-list] Strings 2009-04-04 9:14 ` David Rajchenbach-Teller 2009-04-04 9:26 ` Alp Mestan 2009-04-04 10:11 ` Jon Harrop @ 2009-04-04 21:39 ` Goswin von Brederlow 2009-04-05 7:14 ` Romain Beauxis 3 siblings, 0 replies; 47+ messages in thread From: Goswin von Brederlow @ 2009-04-04 21:39 UTC (permalink / raw) To: David Rajchenbach-Teller; +Cc: Jon Harrop, caml-list David Rajchenbach-Teller <David.Teller@ens-lyon.org> writes: > I personally can't remember the last time I've needed mutable strings in > OCaml. Only for byte arrays or a buffering module where I String.blit to the string itself instead of into a new string. Saves a mutable. MfG Goswin ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Caml-list] Strings 2009-04-04 9:14 ` David Rajchenbach-Teller ` (2 preceding siblings ...) 2009-04-04 21:39 ` Goswin von Brederlow @ 2009-04-05 7:14 ` Romain Beauxis 2009-04-05 9:34 ` David Rajchenbach-Teller 2009-04-05 21:37 ` Goswin von Brederlow 3 siblings, 2 replies; 47+ messages in thread From: Romain Beauxis @ 2009-04-05 7:14 UTC (permalink / raw) To: caml-list Le Saturday 04 April 2009 11:14:34 David Rajchenbach-Teller, vous avez écrit : > Note that Batteries provides > * regular OCaml strings > * strings with capabilities (i.e. strings which, depending on their > type, can be read-only/write-only/read-write) -- sometimes faster than > regular strings, never slower > * immutable Unicode ropes. Impressive ! I am taking this opportunity to ask wether there exists a String-compatible API that would in fact manipulate lazy strings, i.e. strings which content is read on-the-fly ? Romain ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Caml-list] Strings 2009-04-05 7:14 ` Romain Beauxis @ 2009-04-05 9:34 ` David Rajchenbach-Teller 2009-04-05 21:37 ` Goswin von Brederlow 1 sibling, 0 replies; 47+ messages in thread From: David Rajchenbach-Teller @ 2009-04-05 9:34 UTC (permalink / raw) To: Romain Beauxis; +Cc: caml-list Er... not yet :) On Sun, 2009-04-05 at 09:14 +0200, Romain Beauxis wrote: > Impressive ! > > I am taking this opportunity to ask wether there exists a String-compatible > API that would in fact manipulate lazy strings, i.e. strings which content is > read on-the-fly ? > > > Romain ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Caml-list] Strings 2009-04-05 7:14 ` Romain Beauxis 2009-04-05 9:34 ` David Rajchenbach-Teller @ 2009-04-05 21:37 ` Goswin von Brederlow 1 sibling, 0 replies; 47+ messages in thread From: Goswin von Brederlow @ 2009-04-05 21:37 UTC (permalink / raw) To: Romain Beauxis; +Cc: caml-list Romain Beauxis <toots@rastageeks.org> writes: > Le Saturday 04 April 2009 11:14:34 David Rajchenbach-Teller, vous avez écrit : >> Note that Batteries provides >> * regular OCaml strings >> * strings with capabilities (i.e. strings which, depending on their >> type, can be read-only/write-only/read-write) -- sometimes faster than >> regular strings, never slower >> * immutable Unicode ropes. > > Impressive ! > > I am taking this opportunity to ask wether there exists a String-compatible > API that would in fact manipulate lazy strings, i.e. strings which content is > read on-the-fly ? > > > Romain mmap? MfG Goswin ^ permalink raw reply [flat|nested] 47+ messages in thread
end of thread, other threads:[~2009-04-18 12:31 UTC | newest] Thread overview: 47+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2009-04-03 11:56 Strings Jon Harrop 2009-04-03 12:25 ` [Caml-list] Strings Paolo Donadeo 2009-04-03 14:18 ` Ashish Agarwal 2009-04-03 14:46 ` Jon Harrop 2009-04-03 15:03 ` Daniel Bünzli 2009-04-03 16:52 ` Martin Jambon 2009-04-03 17:50 ` Daniel Bünzli 2009-04-03 19:46 ` Paolo Donadeo 2009-04-03 20:41 ` Harrison, John R 2009-04-04 10:11 ` Jon Harrop 2009-04-04 11:12 ` David Teller 2009-04-04 11:40 ` Jon Harrop 2009-04-04 12:34 ` David Rajchenbach-Teller 2009-04-18 12:31 ` Arkady Andrukonis 2009-04-04 10:13 ` Jon Harrop 2009-04-03 21:44 ` Goswin von Brederlow 2009-04-04 9:10 ` David Rajchenbach-Teller 2009-04-05 10:06 ` Strings Zheng Li 2009-04-06 9:20 ` Strings David Rajchenbach-Teller 2009-04-06 10:07 ` Strings Goswin von Brederlow 2009-04-06 11:03 ` Strings Zheng Li 2009-04-04 17:11 ` [Caml-list] Strings Kuba Ober 2009-04-04 17:26 ` Jon Harrop 2009-04-05 20:54 ` Richard Jones 2009-04-05 23:40 ` Daniel Bünzli 2009-04-03 18:24 ` Florian Hars 2009-04-03 20:34 ` Arnaud Spiwack 2009-04-04 10:20 ` Jon Harrop 2009-04-04 9:14 ` David Rajchenbach-Teller 2009-04-04 9:26 ` Alp Mestan 2009-04-04 10:55 ` blue storm 2009-04-04 21:51 ` Goswin von Brederlow 2009-04-04 23:35 ` Yaron Minsky 2009-04-05 9:36 ` David Rajchenbach-Teller 2009-04-05 10:08 ` Alp Mestan 2009-04-05 21:41 ` Goswin von Brederlow 2009-04-05 21:40 ` Goswin von Brederlow 2009-04-05 2:55 ` Jon Harrop 2009-04-05 4:22 ` Edgar Friendly 2009-04-05 7:03 ` Goswin von Brederlow 2009-04-05 6:57 ` Goswin von Brederlow 2009-04-05 7:11 ` Jon Harrop 2009-04-04 10:11 ` Jon Harrop 2009-04-04 21:39 ` Goswin von Brederlow 2009-04-05 7:14 ` Romain Beauxis 2009-04-05 9:34 ` David Rajchenbach-Teller 2009-04-05 21:37 ` Goswin von Brederlow
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox