Mailing list for all users of the OCaml language and system.
From: Alain Frisch <alain.frisch@lexifi.com>
To: matthieu.dubuget@gmail.com, Caml-list <caml-list@inria.fr>
Subject: Re: [Caml-list] Looking for a windows ocaml UTF-16 encoded filename aware library
Date: Fri, 5 Feb 2016 12:01:54 +0100
Message-ID: <56B48122.30303@lexifi.com>
In-Reply-To: <56B47F51.5030001@gmail.com>

Hello,

The real solution is to fix OCaml so that it can interact properly with 
arbitrary filenames under Windows. See:

https://github.com/ocaml/ocaml/pull/153
http://caml.inria.fr/mantis/view.php?id=3771

The basic idea is that filenames are represented by OCaml strings
holding the UTF-8 encoding of the actual filename.  To reduce code
breakage, a fallback interprets strings that are not valid UTF-8
sequences using the current code page.  But this is still a rather
intrusive change, since filenames returned by readdir are always UTF-8
encoded, which can break existing code.  (One could imagine providing
two variants of readdir to smooth the migration path.)
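
To make the fallback rule concrete, here is a rough sketch in plain
OCaml. It is not the code from the PR. The names interpret_filename and
latin1_to_utf8 are made up for illustration, and Latin-1 stands in for
"the current code page" (the real thing would ask Windows to do the
conversion). It assumes OCaml >= 4.14 for String.is_valid_utf_8.

```ocaml
(* Illustrative sketch only; not the implementation from the PR. *)

(* How an incoming filename string was interpreted. *)
type interpretation =
  | Utf8 of string       (* already valid UTF-8, used as-is *)
  | Code_page of string  (* invalid UTF-8, re-encoded to UTF-8 *)

(* Assumption: Latin-1 plays the role of the current code page.
   Each byte maps to the Unicode code point with the same value. *)
let latin1_to_utf8 s =
  let b = Buffer.create (String.length s * 2) in
  String.iter
    (fun c -> Buffer.add_utf_8_uchar b (Uchar.of_int (Char.code c)))
    s;
  Buffer.contents b

(* The fallback rule: valid UTF-8 wins, anything else is treated as
   code-page encoded. *)
let interpret_filename s =
  if String.is_valid_utf_8 s then Utf8 s
  else Code_page (latin1_to_utf8 s)
```

One consequence of this rule is that a code-page string which happens
to be valid UTF-8 is always read as UTF-8, so the fallback reduces
breakage but cannot remove it entirely.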

Any help reviewing and testing the PR above would be very much appreciated!

Alain


On 05/02/2016 11:54, Matthieu Dubuget wrote:
> Hello,
>
> I'm currently analysing an NTFS file tree with a native Windows OCaml application.
>
> This application is using:
> - Unix.{opendir,readdir,closedir}
> - and Unix.LargeFile.lstat
>
> The Unix library of the OCaml distribution uses the ANSI variants of the system functions. This works fine until it reaches a file or directory whose UTF-16 encoded name cannot be converted to the code page in use.
>
> I'm about to write a small library to solve this problem: it would mimic the corresponding code from the OCaml Unix library, but use the wide variants of the Microsoft system functions in the C stubs instead of the ANSI variants.
>
> Before going on: do you know of any library that already does this and that I could use?
>
> Thanks for any link.
>
> Salutations
>
