From: Bob Atkey <bob.atkey@gmail.com>
To: caml-list@inria.fr, matthieu.dubuget@gmail.com
Subject: Re: [Caml-list] Looking for a windows ocaml UTF-16 encoded filename aware library
Date: Fri, 5 Feb 2016 11:09:58 +0000
Message-ID: <56B48306.9060604@gmail.com>
In-Reply-To: <56B47F51.5030001@gmail.com>
Hi Matthieu,
I wrote a little C binding to do pretty much what you are asking:
https://github.com/ContemplateLtd/filesystem-wrapper
My motivation was to be able to support long filenames (beyond the
260-character MAX_PATH limit) on Windows, but this entails using the
wide versions of the filesystem functions.
I based it on the patch that Alain posted a link to, but only supported
the operations that we needed (openfile, opendir, readdir, closedir and
is_directory). I also had to use an abstract type for pathnames to be
able to handle the bizarre way that Windows does long file names (you
have to prefix the absolute name with '\\?\', as far as I can tell).
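As a rough illustration of the prefixing step only: the sketch below is not taken from filesystem-wrapper, and the name add_long_path_prefix is hypothetical. It assumes a drive-letter absolute path and does not handle UNC paths. Forward slashes are rewritten because Windows does not normalise paths that carry the '\\?\' prefix.

```c
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper (not from filesystem-wrapper): returns a newly
   allocated copy of an absolute path such as L"C:\\dir\\file" with the
   long-path prefix prepended, unless the prefix is already present.
   Forward slashes are rewritten as backslashes.
   Assumption: drive-letter paths only; UNC paths are not handled.
   Returns NULL on allocation failure; the caller frees the result. */
wchar_t *add_long_path_prefix(const wchar_t *abs_path)
{
    static const wchar_t prefix[] = L"\\\\?\\";   /* the 4 characters \\?\ */
    const size_t prefix_len = 4;
    size_t len = wcslen(abs_path);
    int has_prefix = wcsncmp(abs_path, prefix, prefix_len) == 0;
    size_t out_len = len + (has_prefix ? 0 : prefix_len);

    wchar_t *out = malloc((out_len + 1) * sizeof *out);
    if (out == NULL)
        return NULL;

    out[0] = L'\0';
    if (!has_prefix)
        wcscpy(out, prefix);
    wcscat(out, abs_path);

    /* Skip the prefix itself, then normalise the separators. */
    for (wchar_t *p = out + prefix_len; *p != L'\0'; p++)
        if (*p == L'/')
            *p = L'\\';
    return out;
}
```

On Windows the result would be what gets passed to the wide (W-suffixed) filesystem functions; hiding it behind an abstract type on the OCaml side keeps prefixed and unprefixed names from being mixed up.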
There is a little Makefile that assumes you are cross-compiling from
Linux with the Debian-packaged cross compiler.
I am completely inexpert in Windows programming, so there are almost
certainly bugs in it. It has been reasonably well tested with long
filenames (we were doing static analysis of Java .class files, some of
which are auto-generated from XML Schemas and can have very long names),
but I haven't tested it much on non-ASCII names. It converts back and
forth between UTF-16 for Windows and UTF-8 for OCaml.
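For readers unfamiliar with the conversion, here is a minimal portable sketch of the UTF-16 to UTF-8 direction. It is an illustration only, not the code in filesystem-wrapper, and the function name utf16_to_utf8 is hypothetical. On Windows itself the same job can be done with WideCharToMultiByte using CP_UTF8.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encodes n UTF-16 code units from in as UTF-8
   into out (capacity cap bytes). Returns the number of bytes written,
   or (size_t)-1 if the input contains an unpaired surrogate or out is
   too small. The output is not NUL-terminated. */
size_t utf16_to_utf8(const uint16_t *in, size_t n,
                     unsigned char *out, size_t cap)
{
    size_t o = 0;
    for (size_t i = 0; i < n; i++) {
        uint32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            /* High surrogate: must be followed by a low surrogate. */
            if (i + 1 >= n || in[i + 1] < 0xDC00 || in[i + 1] > 0xDFFF)
                return (size_t)-1;
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            i++;
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            return (size_t)-1;          /* stray low surrogate */
        }

        /* Number of UTF-8 bytes this code point needs. */
        size_t need = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
        if (cap - o < need)
            return (size_t)-1;

        if (need == 1) {
            out[o++] = (unsigned char)cp;
        } else if (need == 2) {
            out[o++] = (unsigned char)(0xC0 | (cp >> 6));
            out[o++] = (unsigned char)(0x80 | (cp & 0x3F));
        } else if (need == 3) {
            out[o++] = (unsigned char)(0xE0 | (cp >> 12));
            out[o++] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (unsigned char)(0x80 | (cp & 0x3F));
        } else {
            out[o++] = (unsigned char)(0xF0 | (cp >> 18));
            out[o++] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
            out[o++] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (unsigned char)(0x80 | (cp & 0x3F));
        }
    }
    return o;
}
```

This version is strict and rejects unpaired surrogates. Windows does not validate filenames as well-formed UTF-16, so a real binding has to decide what to do with such names rather than simply failing.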
As Alain says, the full solution would be to fix OCaml itself.
Bob
On 05/02/16 10:54, Matthieu Dubuget wrote:
> Hello,
>
> I'm currently analysing an NTFS file-tree with a Windows OCaml native application.
>
> This application is using:
> - Unix.{opendir,readdir,closedir}
> - and Unix.LargeFile.lstat
>
> The unix library of the OCaml distribution uses the ANSI variants of the system functions. This works fine until it reaches files or directories whose UTF-16 encoded names cannot be converted into the code page in use.
>
> I'm about to write a small library to solve this problem: it would mimic the corresponding code from the OCaml unix library, but use the WIDE variants of the Microsoft system functions in the C stubs instead of the ANSI variants.
>
> Before going on: do you know of any library that already does this that I could use?
>
> Thanks for any link.
>
> Salutations
>
Thread overview: 5+ messages
2016-02-05 10:54 Matthieu Dubuget
2016-02-05 11:01 ` Alain Frisch
2016-02-05 11:09 ` Bob Atkey [this message]
2016-02-05 15:14 ` Matthieu Dubuget
2016-02-09 11:10 ` Adrien Nader