From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail2-relais-roc.national.inria.fr (mail2-relais-roc.national.inria.fr [192.134.164.83]) by sympa.inria.fr (Postfix) with ESMTPS id F2B827FD94 for ; Fri, 5 Feb 2016 12:10:03 +0100 (CET) IronPort-PHdr: 9a23:j7u1ehRozEAFDkJfEWqqjkphodpsv+yvbD5Q0YIujvd0So/mwa64YhWN2/xhgRfzUJnB7Loc0qyN4/+mAzZLuMfJmUtBWaIPfidNsd8RkQ0kDZzNImzAB9muURYHGt9fXkRu5XCxPBsdMs//Y1rPvi/6tmZKSV3BPAZ4bt74BpTVx5zukbvipNuJOU4R1XKUWvBbElaflU3prM4YgI9veO4a6yDihT92QdlQ3n5iPlmJnhzxtY+a9Z9n9DlM6bp6r5YTGfayQ6NtRrVdCHEiMnspzMztrxjKCwWVojMZW3kKkhtFHk7J8RvnUZrtmiT/v+t5niKdOJ7YV7cxDB6l5e9CVBzlmW9TPTkztmjLicFhpK1eqROl4Rd4xtiHM8muKPNic/aFLpshTm1bU5MJWg== Authentication-Results: mail2-smtp-roc.national.inria.fr; spf=None smtp.pra=bob.atkey@gmail.com; spf=Pass smtp.mailfrom=bob.atkey@gmail.com; spf=None smtp.helo=postmaster@mail-wm0-f53.google.com Received-SPF: None (mail2-smtp-roc.national.inria.fr: no sender authenticity information available from domain of bob.atkey@gmail.com) identity=pra; client-ip=74.125.82.53; receiver=mail2-smtp-roc.national.inria.fr; envelope-from="bob.atkey@gmail.com"; x-sender="bob.atkey@gmail.com"; x-conformance=sidf_compatible Received-SPF: Pass (mail2-smtp-roc.national.inria.fr: domain of bob.atkey@gmail.com designates 74.125.82.53 as permitted sender) identity=mailfrom; client-ip=74.125.82.53; receiver=mail2-smtp-roc.national.inria.fr; envelope-from="bob.atkey@gmail.com"; x-sender="bob.atkey@gmail.com"; x-conformance=sidf_compatible; x-record-type="v=spf1" Received-SPF: None (mail2-smtp-roc.national.inria.fr: no sender authenticity information available from domain of postmaster@mail-wm0-f53.google.com) identity=helo; client-ip=74.125.82.53; receiver=mail2-smtp-roc.national.inria.fr; envelope-from="bob.atkey@gmail.com"; x-sender="postmaster@mail-wm0-f53.google.com"; x-conformance=sidf_compatible X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: A0DHAwDGgrRWlDVSfUpehAxthC+ELJ8rhxGKX4FmIYQ8gTACgTg6EgEBAQEBAQEBEAEBAQEHCwsJHzCCLYIVAQEEEhEVCAEbHAIDDAYFCw0CAgUWCwICCQMCAQIBEREBBQEOAQ0HDAgBARcHh2MBAxIOoxWBMT4xizSBaYJXhQ4KGScNUYN2AQEBAQEBAQMBAQEBAQEBAQERAQEECgRthReEN4cygToFjSeJToVMiASJEw6FUo0CL4ENJwuCJh4cgTRqiXwBAQE X-IPAS-Result: A0DHAwDGgrRWlDVSfUpehAxthC+ELJ8rhxGKX4FmIYQ8gTACgTg6EgEBAQEBAQEBEAEBAQEHCwsJHzCCLYIVAQEEEhEVCAEbHAIDDAYFCw0CAgUWCwICCQMCAQIBEREBBQEOAQ0HDAgBARcHh2MBAxIOoxWBMT4xizSBaYJXhQ4KGScNUYN2AQEBAQEBAQMBAQEBAQEBAQERAQEECgRthReEN4cygToFjSeJToVMiASJEw6FUo0CL4ENJwuCJh4cgTRqiXwBAQE X-IronPort-AV: E=Sophos;i="5.22,399,1449529200"; d="scan'208";a="201477465" Received: from mail-wm0-f53.google.com ([74.125.82.53]) by mail2-smtp-roc.national.inria.fr with ESMTP/TLS/AES128-GCM-SHA256; 05 Feb 2016 12:10:01 +0100 Received: by mail-wm0-f53.google.com with SMTP id r129so21706576wmr.0 for ; Fri, 05 Feb 2016 03:10:01 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=subject:to:references:from:message-id:date:user-agent:mime-version :in-reply-to:content-type:content-transfer-encoding; bh=AjWP8weTjPyBqXpdqRatn5PTX5yl+NbIVe4D8148tA8=; b=iIIuyv0Frk1OkhOtwkY1jjBQ1UsW+GAth7PHnLQUHMleV1kv3X7UVTGNqJ/RUmmO9m WbMRFhfNMrm+KzBAZd3y5hE+331Mh8jz6ZzoFbcHzSiFm2+sRLAU/0bhRCLtA36vaUpJ 6cVYP7PMqmD0O/gpNOZmEFbzJGpuw0jSE+9YYKfiJ2YoFFgPDLpIaU28quc3F9K/Dlym 3PL+UK+fw5QIcGC7ecOm8n1RjsYUduZ6/cZle3QZsoI8qADjJtYdbghRGAAPz5IyvOUO BCjrPIFwpmzibEZBRUgSRWMW/sYaFqRi8BUoiNB/w/3l+FdASNE575mo2KWJpWrR5Snp J8gw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:subject:to:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-type :content-transfer-encoding; bh=AjWP8weTjPyBqXpdqRatn5PTX5yl+NbIVe4D8148tA8=; b=ccXNzuRkaPQ+onggyv2qwyqWsx+DZAUH2HF0kHuNWO5fegnW3Mwb9tb+av3hSEgzat t+XVA+U9h1IJ/OiAKywXITPOEdvwNn64uTwq/abxS+ycbbM3aXI0mHMTc9p2YmG+UOBA kq4G048h/Othwh5AAwdhMp9ygISWXet6uNUwjqy6bT/eYc+qpUjkogrzVawE3Jc/1JDS E77j3casN0zX4nXuiRK1UNG1p/XYXdgTtvpzJygHx9APpL3ZPoSGdEybNtFHjOqT5GEJ wSXIzgcEZviAr3B9uACsdYypnWvKg7mHoDcZSyQt3peXnRuuzY8TcyuqlBluxvcuJDmv pXdA== X-Gm-Message-State: AG10YOTHNYwsUykXa7yMlmBAWIYP3MdtmBE+Bh7YHoFVaeaRVBR4fpZPBUI6rq2bHJ6/rQ== X-Received: by 10.28.50.137 with SMTP id y131mr38095332wmy.102.1454670601290; Fri, 05 Feb 2016 03:10:01 -0800 (PST) Received: from [130.159.227.187] ([130.159.227.187]) by smtp.googlemail.com with ESMTPSA id l7sm15445628wjx.14.2016.02.05.03.10.00 (version=TLSv1/SSLv3 cipher=OTHER); Fri, 05 Feb 2016 03:10:00 -0800 (PST) To: caml-list@inria.fr, matthieu.dubuget@gmail.com References: <56B47F51.5030001@gmail.com> From: Bob Atkey Message-ID: <56B48306.9060604@gmail.com> Date: Fri, 5 Feb 2016 11:09:58 +0000 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.5.1 MIME-Version: 1.0 In-Reply-To: <56B47F51.5030001@gmail.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: [Caml-list] Looking for a windows ocaml UTF-16 encoded filename aware library Hi Matthieu, I wrote a little C binding to do pretty much what you are asking: https://github.com/ContemplateLtd/filesystem-wrapper My motivation was to be able to support long filenames (> 240 chars) on Windows, but this entails using the wide versions of the filesystem functions. I based it on the patch that Alain posted a link to, but only supported the operations that we needed (openfile, opendir, readdir, closedir and is_directory). I also had to use an abstract type for pathnames to be able to handle the bizarre way that windows does long file names (you have to prefix the absolute name with '\\?\', as far as I can tell). There is a little Makefile that assumes you are cross-compiling from Linux with the Debian-packaged cross compiler. I am completely inexpert in Windows programming, so there are almost certainly bugs in it. It has been reasonably well tested with long filenames (we were doing static analysis of Java .class files, some of which are auto generated from XML Schemas and can have very long names), but I haven't tested it much on non-ASCII names. It converts back and forth between UTF-16 for Windows to UTF-8 for OCaml. As Alain says, the full solution would be to fix OCaml itself. Bob On 05/02/16 10:54, Matthieu Dubuget wrote: > Hello, > > I'm currently analysing a NTFS file-tree with a windows OCaml native application. > > This application is using: > - Unix.{opendir,readdir,closedir} > - and Unix.LargeFile.lstat > > The unix library of OCaml distribution is using ANSI variants of system functions. This is working fine until files or directories whose UTF-16 encoded name cannot be converted into the code page in use are reached. > > I'm about to write a small library to solve this problem: it would mimic the corresponding code from OCaml unix library, but using WIDE variants of microsoft system functions in the C stub instead of ANSI variants. > > Before going on: do you know of any library that already do this I could use? > > Thanks for any link. > > Salutations >