From mboxrd@z Thu Jan 1 00:00:00 1970 Received: (from majordomo@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id TAA10977; Thu, 29 Apr 2004 19:31:55 +0200 (MET DST) X-Authentication-Warning: pauillac.inria.fr: majordomo set sender to owner-caml-list@pauillac.inria.fr using -f Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id TAA10989 for ; Thu, 29 Apr 2004 19:31:54 +0200 (MET DST) X-SPAM-Warning: Sending machine is listed in blackholes.five-ten-sg.com Received: from wetware.com (wetware.wetware.com [199.108.16.1]) by concorde.inria.fr (8.12.10/8.12.10) with ESMTP id i3THVrSH015491 for ; Thu, 29 Apr 2004 19:31:53 +0200 Received: from [208.177.152.17] (helo=[10.0.1.5]) by wetware.com with esmtp (Exim 4.20) id 1BJFO4-00071O-Ll for caml-list@inria.fr; Thu, 29 Apr 2004 10:31:52 -0700 Mime-Version: 1.0 (Apple Message framework v613) In-Reply-To: <20040430.003122.28779093.yoriyuki@mbg.ocn.ne.jp> References: <20040429130335.GA11323@excelhustler.com> <20040429.224036.75426712.yoriyuki@mbg.ocn.ne.jp> <20040429140240.GA12640@excelhustler.com> <20040430.003122.28779093.yoriyuki@mbg.ocn.ne.jp> Content-Type: text/plain; charset=WINDOWS-1252; format=flowed Message-Id: <19A2CD82-9A03-11D8-B697-000A958FF2FE@wetware.com> Content-Transfer-Encoding: quoted-printable From: james woodyatt Subject: Re: [Caml-list] Re: Common IO structure Date: Thu, 29 Apr 2004 10:31:51 -0700 To: caml-list Trade X-Mailer: Apple Mail (2.613) X-Miltered: at concorde with ID 40913C09.000 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Loop: caml-list@inria.fr X-Spam: no; 0.00; woodyatt:01 jhw:01 wetware:01 caml-list:01 yamagata:01 yoriyuki:01 stateful:01 eol:01 stateful:01 non-blocking:01 non-blocking:01 abort:01 glue:01 endline:01 api:01 Sender: owner-caml-list@pauillac.inria.fr Precedence: bulk On 29 Apr 2004, at 08:31, Yamagata Yoriyuki wrote: > Encoding could be stateful, so there would be no single representation > of EOL. (*) Ok, this is very unlikely case currently, but I think=20 > there > is an interesting encoding for Unicode which is fully stateful. So, > readlines() needs to fully aware of the encoding. This transcoding I/O channel under discussion is required to contain=20 internal state for other reasons. With non-blocking I/O, an underlying=20= transport may present only those octets that are ready for reading,=20 which may leave a codepoint incomplete at the end of the currently=20 received octets. Even without non-blocking I/O, a read can be=20 interrupted by a system signal event and still return less than the=20 number of octets requested. It is not sufficient to defer signal=20 processing until after the read completes=97 sometimes (but not always),=20= a signal explicitly means to abort reading immediately. > My proposal is mainly for sharing common channel types among > libraries, so that a user can pass a channel from a libraries to > anonther withoug writing a glue code. Since parsing endline, or > loading the whole file into the string mainly occurs in the endpoint > of IO, I do not think standardizing them are necessary for this > purpose. > > I do not think standardizing the endpoint API is important, because I > think that in the end, we will use only one library as the endpoint of > IO. Most of us. Some of us have other concerns that I don't see anyone=20 else trying to address. At some point, probably soon, I will be=20 writing a wrapper around OpenSSL. I need non-blocking I/O. I need to=20= parse XML documents of unbounded length, which means using a SAX-like=20 parser (I have that now). I need to be able to parse an arbitrary=20 number of XML documents simultaneously. In potentially any of the=20 legal Unicode transfer encodings. And I need to be responsive to=20 events in near real-time. I have the "control inversion" nightmare from hell. That's why I have=20= forced myself to learn functional programming techniques. An I/O library that I can use is simply not going to be something that=20= can satisfy Richard's requirement that he be able to slurp a whole file=20= into an application data structure with a single line of code. So I'm=20= writing my own. Richard will be appalled by how it works. So I'm watching this discussion with a certain bemused detachment: I=20 wonder what new and improved API will be coming from this that I will=20 still find inadequate for my tasks. > (*) IIRC, RFC defines the endianness of UTF-16 is swapped in the > middle of the stream, when "BOM" 0xfffe appears. This is quite true. Happens all the time, too. --=20 j h woodyatt markets are only free to the people who own them.= ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners