From: Basile STARYNKEVITCH <basile@starynkevitch.net>
To: cashin@cs.uga.edu
Cc: caml-list@inria.fr
Subject: Re: [Caml-list] Unix.lseek versus Pervasives.pos
Date: Wed, 19 Mar 2003 20:08:08 +0100
Message-ID: <15992.49176.43893.768644@hector.lesours>
In-Reply-To: <877kavryp3.fsf@cs.uga.edu>
>>>>> "cashin" == cashin <cashin@cs.uga.edu> writes:
cashin> Sorry if this shows up as a duplicate. Basile
cashin> STARYNKEVITCH <basile@starynkevitch.net> writes:
Basile>> You apparently forgot to flush the channel.
OK, I made a stupid mistake (flushing is only for writes!), but my
intuition was right in that buffering has to be taken into account.
cashin> Flushes are for writes, but even when using a test program
cashin> that just reads, zero is returned when it appears that it
cashin> shouldn't return zero. Compare the short ocaml program
cashin> below to the comparable C version.
OK, but the problem is the same: the OCaml I/O subsystem manages its own
internal buffering. Channels are not Unix file descriptors but buffered
wrappers around them. See the source code (in particular
ocaml/byterun/io.c and io.h) for details. In particular, a channel is
implemented (in io.h) as
struct channel {
  int fd;                     /* Unix file descriptor */
  file_offset offset;         /* Absolute position of fd in the file */
  char * end;                 /* Physical end of the buffer */
  char * curr;                /* Current position in the buffer */
  char * max;                 /* Logical end of the buffer (for input) */
  void * mutex;               /* Placeholder for mutex (for systhreads) */
  struct channel * next;      /* Linear chaining of channels (flush_all) */
  int revealed;               /* For Cash only */
  int old_revealed;           /* For Cash only */
  int refcount;               /* For flush_all and for Cash */
  char buff[IO_BUFFER_SIZE];  /* The buffer itself */
};
where IO_BUFFER_SIZE is usually 4096 bytes.
The equivalent in C would be to mix lseek with a <stdio.h> FILE, which
gives the same kind of mess:
/* file main.c */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>   /* for memset */
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
  FILE *f = fopen("main.c", "r");
  char buf[1024];
  int fd;
  if (f == NULL) { perror("main.c"); return EXIT_FAILURE; }
  fd = fileno(f);
  memset(buf, '\0', sizeof(buf));
  fread(buf, 1, 10, f);   /* buffered read of 10 bytes */
  /* ask the kernel where the underlying descriptor really is */
  printf("after reading \"%s\" lseek returns %d\n",
         buf, (int) lseek(fd, 0, SEEK_CUR));
  fclose(f);
  return 0;
}
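When I compile and run the above file with tcc (www.tinycc.org) I get
after reading " /* file " lseek returns 483
which is as messy as I expected: only 10 bytes were requested, but lseek
reports 483, presumably because stdio filled its buffer with much more
of the file than the 10 bytes handed back to the program.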
In short: never mix Unix.read (or other Unix I/O) with Pervasives.*
channel operations on the same file descriptor.
As usual with advice, this is "don't do what I did" advice; shame on
me :-( I must admit that I once opened a channel and then did only
Unix.read operations on it, but I documented this code (open-source code
in the Poesia monitor) with the comment
(** IMPORTANT NOTICE: here outputxchannel_t-s are only used for their
Unix file descriptor; no output takes actually place on the output
channel; all output is thru Unix.write *)
and later
(** the reply channel from filter to monitor [don't use the
Pervasives.channel; using Unix] *)
The (bad) reasons for mixing channels and Unix file descriptors (besides
perhaps a design bug) are that I use nonblocking Unix I/O and that I
want precise control over the actual read and write system calls, so I
don't want extra buffering.
--
Basile STARYNKEVITCH http://starynkevitch.net/Basile/
email: basile<at>starynkevitch<dot>net
aliases: basile<at>tunes<dot>org = bstarynk<at>nerim<dot>net
8, rue de la Faïencerie, 92340 Bourg La Reine, France