From: Anton Lavrik
To: caml-list@inria.fr
Date: Sat, 10 Jul 2010 17:51:29 -0500
Subject: Announce: the Piqi project
Message-ID: <20100710225129.GA22586@alavrik>

It is my pleasure to announce the Piqi project.

Piqi is a set of languages and tools for working with structured data. It includes:

- High-level data representation and data definition languages.
- Binary encoding for compact and portable data representation.
- Tools for validating, pretty-printing and converting data between different formats.
- Mappings to various programming languages, allowing programs to serialize and deserialize data in a portable manner.
- Open-source implementation licensed under the terms of Apache v2.0.

Piqi implements native mapping for OCaml. Mappings for other languages (C++, Python, Java, Go, etc.) are supported through Google Protocol Buffers.

Here are some details.


1. Data representation language (Piq)

Piq is a text-based language that allows humans to conveniently read, write and edit structured data.

Piq has a concise and powerful syntax:

- No syntax noise compared to XML.
- Reasonable amount of parentheses compared to S-expressions.
- Comments.

Piq supports the following data literals:

- Unicode strings, booleans, integer and floating point numbers (including "infinity" and "NaN");
- binary strings (byte arrays);
- verbatim text.


2. Data definition language (Piqi)

Piqi is a data definition language for Piq and its encodings. It stands for "Piq Interfaces".

Piqi supports definition of records, variants (similar to OCaml's polymorphic variants), lists and type aliases on top of a rich set of built-in data types.

Piqi modules can reuse definitions from other modules using OCaml-like imports and includes.

Piqi type definitions are extensible, which allows data schema evolution while maintaining backward and forward compatibility.


3. Piqi-OCaml mapping

Piqi maps Piqi modules to OCaml modules and Piqi type definitions to OCaml type definitions using `piqic` -- Piq interface compiler.

The `piqic ocaml` command takes a Piqi module and produces a `.ml` file containing:

- an OCaml module which corresponds to the source Piqi module;
- OCaml type definitions;
- functions for serializing and deserializing OCaml values.

Piqi types are naturally mapped to OCaml types:

- bool, string, float are mapped to the corresponding OCaml types
- various integer types are mapped to OCaml's int, int32 or int64
- binaries are mapped to strings
- variants are mapped to OCaml's polymorphic variants
- records are mapped to OCaml's records, where each OCaml record is put in its own recursive sub-module.
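
As a rough illustration of the last two points, here is a hand-written sketch of what records in recursive sub-modules and polymorphic variants can look like in OCaml. It is not output of `piqic ocaml`; the names `Person`, `Company`, `contact` and `contact_name` are made up for this example, and the code actually generated by Piqi may be organized differently.

```ocaml
(* Hand-written sketch, NOT piqic output; all names are hypothetical. *)

(* Each record lives in its own sub-module.  The modules are declared with
   `module rec`, so the record types can refer to each other.  Since the
   signatures contain only types, each module can be defined as itself. *)
module rec Person : sig
  type t = { name : string; age : int; employer : Company.t option }
end = Person

and Company : sig
  type t = { name : string; staff : Person.t list }
end = Company

(* A variant expressed as an OCaml polymorphic variant. *)
type contact = [ `person of Person.t | `company of Company.t ]

(* Labels are qualified by their module, so the two `name` fields
   do not clash even though both records are in scope. *)
let contact_name : contact -> string = function
  | `person p -> p.Person.name
  | `company c -> c.Company.name

let acme = { Company.name = "Acme"; Company.staff = [] }

let alice =
  { Person.name = "Alice"; Person.age = 30; Person.employer = Some acme }
```

With these definitions, ``contact_name (`person alice)`` evaluates to `"Alice"` and ``contact_name (`company acme)`` to `"Acme"`, although both records have a field called `name`.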

Placing each record in its own sub-module provides a workaround for OCaml's flat record label namespace, and it is actually very convenient when used with special Camlp4 macros.

Piqi allows adding custom type mappings. For example, it is easy to add support for OCaml's nativeint, bigint or any other OCaml type (not sure about parametric types, though).

To get a feeling for how it all looks, here are a few examples:

- Piqi type specification written in Piqi (i.e. self-specification):

  http://github.com/alavrik/piqi/blob/master/piqi.org/piqtype.piqi

- This is how the above specification is mapped to OCaml (NOTE: this is a hand-written spec which is used during the bootstrapping stage; the real one can be produced by running `piqic ocaml`):

  http://github.com/alavrik/piqi/blob/master/piqicc/boot/piqtype.ml.m4

- Piqi self-specification (and Piqi intermediate language) represented as OCaml data structures:

  http://github.com/alavrik/piqi/blob/master/piqicc/boot/piqdefs.ml


4. Google Protocol Buffers compatibility and mapping

Piqi is type- and binary-compatible with Protocol Buffers:

- Piqi modules and types (defined in `.piqi` files) can be converted to Google Protocol Buffers type specifications (`.proto` files), and the other way around.
- Piqi uses the same binary encoding as the one used by Protocol Buffers.


5. Some implementation details

One interesting property of Piqi is that the Piqi language implementation takes its own high-level specification written in Piqi and parses the language into an OCaml intermediate representation without any hand-written parsing rules.

This mechanism makes the Piqi language easy to extend. When adding new features, there is no need to design new syntax elements, update parsing code or transform an AST into the intermediate language. Also, new extensions are typically transparent to the core Piqi implementation.


6. Project status

Piqi is a late-stage prototype.
The implementation is fully functional, and most of its functionality is exercised inside the project itself because of Piqi's bootstrapped architecture.

The OCaml mapping -- both the generated code and the runtime library -- hasn't been optimized for performance. There's plenty of room for optimization; it just hasn't been a priority.

Piqi has been tested with OCaml 3.11 on Debian Linux: Squeeze on i386 and Lenny on amd64. I haven't tried building it on Windows yet.


7. Acknowledgements

I would like to express my great appreciation to the OCaml language authors and the OCaml community. It would be extremely hard, if possible at all, to design and implement Piqi as a hobby project using any other programming languages and tools I'm aware of.

The Piqi implementation relies on the following excellent software components:

- OCaml and Camlp4 -- http://caml.inria.fr
- ulex -- Unicode lexer generator, http://alain.frisch.fr/soft.html
- easy-format -- Pretty-printing library, http://martin.jambon.free.fr/easy-format.html
- OCamlMakefile -- Automated compilation of complex OCaml projects, http://www.ocaml.info/home/ocaml_sources.html
- open-in -- Camlp4 syntax extension, http://alain.frisch.fr/soft.html


Piqi source code is available on github: http://github.com/alavrik/piqi

For more information, examples and documentation, please visit http://piqi.org

I will be happy to answer OCaml-related questions in this thread. For all other questions, suggestions and comments, there is the Piqi Google group: http://groups.google.com/group/piqi

Your feedback is highly appreciated!

-- 
Anton