From: David Allsopp <dra-news@metastack.com>
To: Paul Steckler <steck@stecksoft.com>,
"caml-list@yquem.inria.fr" <caml-list@yquem.inria.fr>
Subject: RE: [Caml-list] Windows filenames and Unicode
Date: Wed, 29 Sep 2010 06:23:04 +0000
Message-ID: <E51C5B015DBD1348A1D85763337FB6D92AEAD1@Remus.metastack.local>
In-Reply-To: <AANLkTikXYCdGHBzQ0G4mRbrWcA245K5oOp1CZay-OYoT@mail.gmail.com>
Paul Steckler wrote:
> In Windows, NTFS filenames are specified in Unicode (UTF-16). Am I right
> in thinking that OCaml file primitives, like open_in, readdir, etc. cannot
> handle NTFS filenames containing characters with codepoints greater than
> 255?
The WinAPI "wide" functions use UTF-16, so you can of course fake UTF-16 on top of normal OCaml strings, but I think you'll hit a brick wall: the I/O primitives are built on the underlying C library functions, which at the end of the day call the ANSI versions of the Windows API functions, not the Unicode ones.
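As a rough illustration of "faking" UTF-16 on top of an ordinary OCaml string, here is a minimal sketch that packs Unicode code points into UTF-16LE bytes. The function name utf16le_of_code_points is hypothetical and not part of the standard library, and the sketch only builds the bytes; it does not make open_in or readdir accept them.

```ocaml
(* Hypothetical helper: encode Unicode code points as UTF-16LE bytes
   stored in a plain OCaml string (which is just a byte sequence). *)
let utf16le_of_code_points (cps : int list) : string =
  let buf = Buffer.create 16 in
  (* Append one 16-bit code unit, low byte first (little-endian). *)
  let add_unit u =
    Buffer.add_char buf (Char.chr (u land 0xFF));
    Buffer.add_char buf (Char.chr ((u lsr 8) land 0xFF))
  in
  List.iter
    (fun cp ->
      if cp < 0 || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF) then
        invalid_arg "utf16le_of_code_points: not a Unicode scalar value"
      else if cp < 0x10000 then
        (* Code points in the Basic Multilingual Plane take one unit. *)
        add_unit cp
      else begin
        (* Code points above U+FFFF are split into a surrogate pair. *)
        let v = cp - 0x10000 in
        add_unit (0xD800 lor (v lsr 10));
        add_unit (0xDC00 lor (v land 0x3FF))
      end)
    cps;
  Buffer.contents buf
```

For example, utf16le_of_code_points [0x3A9] yields the two bytes A9 03. A string like this could be handed to a C stub that calls a "wide" API function, but the standard I/O primitives would still treat it as an 8-bit filename.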
> I'm aware of the Camomile library, which gives the ability to manipulate
> UTF-16 strings inside of OCaml. But it looks like crucial points of
> OCaml's I/O, like Sys.argv and file primitives are strictly limited to 8-
> bit characters.
>
> Is there a way around this limitation, other than rewriting the file I/O
> primitives?
One way (though not foolproof on Windows 7 and Windows 2008 R2, because 8.3 short-name generation can be disabled) would be to wrap the GetShortPathName Windows API function[1], which converts a pathname to its DOS 8.3 form, which will not contain Unicode characters. Another way might be to wrap the Unicode version of CreateFile (CreateFileW) and convert the result into a handle compatible with the standard library functions, but I reckon that could be tricky!
David
[1] http://msdn.microsoft.com/en-us/library/aa364989(v=VS.85).aspx
>
> -- Paul