From: "Gerd Stolpmann" <info@gerd-stolpmann.de>
To: "Enrico Tassi" <enrico.tassi@inria.fr>
Cc: caml-list@inria.fr
Subject: Re: [Caml-list] unmarshaling large data from string on 32 bits
Date: Thu, 5 Feb 2015 00:51:24 +0100
Message-ID: <6395dd48f0abe859ff44f5095631d32c.squirrel@gps.dynxs.de>
In-Reply-To: <20150204164702.GA14942@birba.invalid>
What about this: you change the protocol so that there is a single
character, say an 'X', before any marshalled value. The 'X' is something
you can use for non-blocking reads. So (if ch is the input channel):
    let fd = Unix.descr_of_in_channel ch in
    Unix.set_nonblock fd;
    let x = input_char ch in   (* raises Sys_blocked_io if no data is available *)
    assert (x = 'X');
    Unix.clear_nonblock fd;
    let v = Marshal.from_channel ch
This will also work when there are several messages in the input buffer,
as input_char then simply succeeds. If you get a Sys_blocked_io, you can
even revert to using select() because you know that the buffer is empty
then.
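
For concreteness, here is a minimal sketch of both ends of such a protocol. It assumes OCaml 4.02 or later (for "match ... with exception") and a descriptor on which Unix.set_nonblock works, which on Windows means sockets only. The names send and try_receive are made up for illustration, and End_of_file is deliberately not handled:

```ocaml
(* Sender: prefix every marshalled value with the marker byte 'X'. *)
let send oc v =
  output_char oc 'X';
  Marshal.to_channel oc v [];
  flush oc

(* Receiver: [Some v] if a message is available, either in the channel
   buffer or in the OS buffer; [None] if reading would block. *)
let try_receive ic =
  let fd = Unix.descr_of_in_channel ic in
  Unix.set_nonblock fd;
  match input_char ic with
  | exception Sys_blocked_io ->
      (* Channel buffer and OS buffer are both empty: safe to select(). *)
      Unix.clear_nonblock fd;
      None
  | c ->
      (* Back to blocking mode before reading the (possibly large) value. *)
      Unix.clear_nonblock fd;
      assert (c = 'X');
      Some (Marshal.from_channel ic)
```

The caller would loop on try_receive until it returns None and only then go back to select(), so a message sitting in the channel buffer is never forgotten.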
Gerd
> On Mon, Feb 02, 2015 at 01:00:53PM +0100, Gabriel Scherer wrote:
>> If you don't mind going through a temporary file,
>> Marshal.{to,from}_channel should work fine.
>
> Thanks for the suggestion, if Windows is as smart as Linux then a
> tmpfile should work fine. If not, well, better than nothing.
>
>> You should consider opening a problem report to OCaml upstream (
>> http://caml.inria.fr/mantis/ ) explaining the use-case and asking for
>> a large-string-safe API (eg. taking and returning lists of strings).
>
> The chain of workarounds that leads here is long and ugly :-/
>
> 1. I have a problem with threads on Windows and (rarely) on Linux.
> The model is simple, Coq sits between 1 user interface and many
> (usually only 1) worker processes. Coq's main thread talks to the
> UI via a socket and does blocking calls; worker manager
> threads (1 per worker) do the same with their respective workers.
> At some point all threads are blocked reading. Then
> a worker process writes data but no thread is woken up.
> On Linux I need at least 2 worker manager threads to see the problem,
> on Windows 1 is enough. All that using the channels API and Marshal.
>
> OK, I say, let's go back to good old Unix.select and read only when
> some data is there.
>
> 2. The Unix module lets you get the fd number associated with the channel
> and you can use Unix.select with it. And you can still use the channels
> API to Marshal.from_channel. Looks good but I still have a problem. I have
> LARGE and small messages. The small ones fit, largely, in the
> channels buffer. Result: you have 2 "values" in the buffers of the
> OS. Select tells you that you can read. You Marshal.from_channel.
> Both values are moved in the channel buffer, but clearly
> "input_value" reads only the first one. You select again, but this
> time the OS buffers are empty. So you wait until next message
> arrives to discover the one forgotten in the channel buffer.
>
> I can't bet all my money on the correctness of this diagnosis,
> but that seemed the cause at the time. Artificially inflating
> messages was working, but this is not what you want. There is no
> API, at least in 3.12, to peek at a channel and see if there is
> data (and if so, don't call select). I tried with non blocking
> channels, but I could not succeed using input_value there (I don't
> recall if input_value is always blocking or something else went
> wrong).
>
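
The scenario in point 2 can be reproduced with a pipe. This is only a sketch, assuming a Unix-like system and two messages small enough to fit together in the channel buffer; the function name lost_in_buffer_demo is made up:

```ocaml
(* Returns (readable_before, readable_after, first, second). *)
let lost_in_buffer_demo () =
  let (r, w) = Unix.pipe () in
  let ic = Unix.in_channel_of_descr r in
  let oc = Unix.out_channel_of_descr w in
  (* Two small messages arrive back to back in the OS buffer. *)
  Marshal.to_channel oc "first" [];
  Marshal.to_channel oc "second" [];
  flush oc;
  (* Poll the descriptor without waiting (timeout 0). *)
  let readable () =
    let (ready, _, _) = Unix.select [r] [] [] 0.0 in
    ready <> []
  in
  let before = readable () in
  (* Reading the first value refills the channel buffer, which drains
     the second message from the OS buffer as well. *)
  let (first : string) = Marshal.from_channel ic in
  let after = readable () in
  (* The second value is still there, but only the channel knows it. *)
  let (second : string) = Marshal.from_channel ic in
  close_in ic;
  close_out oc;
  (before, after, first, second)
```

Here select() reports the descriptor as readable before the first read and as not readable after it, although "second" is still pending in the channel buffer.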
> OK, I say, let's not use the channels and do good old Unix select and
> read. Unfortunately the size of buffers, strings, is limited and the
> LARGE messages I have do not fit.
>
> So yes, Marshal.from_string_list would be an option here.
>
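
To show where the size limit bites, here is a rough sketch of reading one marshalled value with plain Unix.read, using the header to find the length. It assumes OCaml 4.02 or later (Bytes, Marshal.from_bytes); really_read and read_value are illustrative names. Note that Marshal.from_string_list above is a proposed API, not an existing function:

```ocaml
(* Read exactly [len] bytes from [fd] into [buf], starting at [ofs]. *)
let rec really_read fd buf ofs len =
  if len > 0 then begin
    let n = Unix.read fd buf ofs len in
    if n = 0 then raise End_of_file;
    really_read fd buf (ofs + n) (len - n)
  end

(* Read one marshalled value directly from a file descriptor. *)
let read_value fd =
  let header = Bytes.create Marshal.header_size in
  really_read fd header 0 Marshal.header_size;
  (* The header says how many bytes of data follow it. *)
  let data_len = Marshal.data_size header 0 in
  (* This allocation is the weak spot: Bytes.create raises
     Invalid_argument when the size exceeds Sys.max_string_length,
     which is about 16 MB on 32-bit systems. *)
  let buf = Bytes.create (Marshal.header_size + data_len) in
  Bytes.blit header 0 buf 0 Marshal.header_size;
  really_read fd buf Marshal.header_size data_len;
  Marshal.from_bytes buf 0
```

Since nothing is read beyond the end of the value, select() on fd stays accurate for the next message. The whole value must still fit in a single buffer, which is exactly what fails for the LARGE messages on 32 bits.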
> I still have around a simple example that locks up on Windows,
> I'll open a bug for that.
>
> Best,
> --
> Enrico Tassi
>
--
------------------------------------------------------------
Gerd Stolpmann, Darmstadt, Germany gerd@gerd-stolpmann.de
My OCaml site: http://www.camlcity.org
Contact details: http://www.camlcity.org/contact.html
Company homepage: http://www.gerd-stolpmann.de
------------------------------------------------------------