From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.0 required=5.0 tests=none autolearn=disabled version=3.1.3 Received: from discorde.inria.fr (discorde.inria.fr [192.93.2.38]) by yquem.inria.fr (Postfix) with ESMTP id 64029BC68 for ; Wed, 15 Nov 2006 14:50:45 +0100 (CET) Received: from imag.imag.fr (imag.imag.fr [129.88.30.1]) by discorde.inria.fr (8.13.6/8.13.6) with ESMTP id kAFDoin9010293 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=FAIL) for ; Wed, 15 Nov 2006 14:50:45 +0100 Received: from [147.171.255.214] (mac-clerc.imag.fr [147.171.255.214]) by imag.imag.fr (8.13.6/8.13.6) with ESMTP id kAFDoRl5006926 for ; Wed, 15 Nov 2006 14:50:27 +0100 (CET) Resent-Message-Id: <162EDBC2-E057-4477-AA5F-6E5349AA5D54@free.fr> Mime-Version: 1.0 (Apple Message framework v752.2) Content-Type: text/plain; charset=ISO-8859-1; delsp=yes; format=flowed Resent-Date: Wed, 15 Nov 2006 14:52:50 +0100 Message-Id: Content-Transfer-Encoding: quoted-printable Resent-To: caml-list@yquem.inria.fr From: Xavier Clerc Subject: Re: [Caml-list] Bytecode object files structure Resent-From: Xavier Clerc Date: Wed, 15 Nov 2006 14:41:28 +0100 To: Pierre-Etienne Meunier X-Mailer: Apple Mail (2.752.2) X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-1.6 (imag.imag.fr [129.88.30.1]); Wed, 15 Nov 2006 14:50:27 +0100 (CET) X-IMAG-MailScanner-Information: Please contact IMAG DMI for more information X-IMAG-MailScanner: Found to be clean X-IMAG-MailScanner-SpamCheck: X-IMAG-MailScanner-From: xcforum@free.fr X-j-chkmail-Score: MSGID : 455B1B34.001 on discorde : j-chkmail score : X : 0/20 1 X-Miltered: at discorde with ID 455B1B34.001 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Spam: no; 0.00; bytecode:01 compiler:01 bytecode:01 compiler:01 ocaml:01 runtime:01 cmo:01 cmi:01 ocaml:01 marshalling:01 extern:01 byterun:01 extern:01 compilation:01 cmo:01 Le 13 nov. 06 =E0 16:50, Pierre-Etienne Meunier a =E9crit : > Hello, > > I'd like to write an assembler, to be able to understand how the vm =20= > really > works. I've to work on this for a school project (a compiler, I =20 > want it to > output caml bytecode object files). If you are working on a compiler that should output files to be =20 executed by the ocaml runtime, it does not seem necessary to handle =20 cmo/cmi files as the format of bytecode file should be sufficient to =20 code your compiler. Unless you have to link with ocaml modules. > I've understood that the data part, after the code itself, was =20 > generated using > output_value (I didn't know this function before). This fonction is used by the Marshal module. It transforms any non-=20 abstract value into a chain of bytes. The format of marshalling can be understood from the extern_rec =20 function of the byterun/extern.c file. > What I don't get now are > the cu_reloc, cu_primitives and cu_imports fields of the =20 > compilation_unit > type. You should remember that cmo files are parts that will be put =20 together (linked) in order to create a bytecode file. Given this context : - cu_imports lists the name of imported (used) modules the = current =20 cmo should be linked with in order to produce a bytecode file (the =20 digest of the imported modules is also kept to ensure that you link =20 with the same version you compiled against) ; - cu_primitives lists the primitives declared by the current = module =20 (each 'external f : type1 -> type2 =3D "primitive" ' will result in a =20= "primitive" entry of this list), needed to ensure that all required C =20= primitives are provided ; - cu_reloc : as each module is compiled independently, it can =20= declare some elements (e.g. global variables) and use them using a 0-=20 based index ; thus, when you link several modules together, you have =20 to relocate this information to ensure that the first module uses =20 indexes from 0 to n, the second module uses indexes from n+1 to n+m =20 and so on ... Hope this helps, Xavier Clerc PS : I am working on some documents describing marshalling format, =20 bytecode files as well as instruction opcodes. I will hopefully release them before xmas but don't hold your breath =20 as I don't have much spare time these days. In the meantime, you can contact me off-list for any related question. > > If you can help on this, > Thanks > P.E. Meunier > > On Monday 13 November 2006 11:53, you wrote: >> Hello, >> >> As I read a substancial part of the ocaml source code, I may help you >> understanding file formats. >> Could you be more precise about what you are particularly interested >> in : >> - file type : bytecode file, cmo file, cmi file ? >> - code or data section of these files ? >> >> May I also ask you what you are trying to do using these elements ? >> >> >> Cordially, >> >> Xavier Clerc >> >> Le 12 nov. 06 =E0 15:42, Pierre-Etienne Meunier a =E9crit : >>> Hi, >>> >>> I'm trying to decrypt .cmo files produced by simple programs, =20 >>> such as >>> 1+1;; >>> or >>> print_string "string";; >>> or >>> List.length [1;2;3;4;5];; >>> >>> According to the source of Ocaml, there's something called the >>> "cmo_magic_number", systematically written at the beginning of >>> all .cmo >>> files. Does it have a real function for executing the programs, or >>> is it just >>> a way to make sure the file contains ocaml bytecode ? >>> >>> Then, there's the address of what seems to be the last bytecode >>> instruction. >>> Then, the bytecode instructions, as documented in opcodes.ml. >>> >>> After that, I can't understand anything : there vaguely seems to be >>> some >>> information related to linking or so... What is the precise >>> structure of this >>> part ? Is there some kind of a bytecode assembler ? >>> >>> Thanks, >>> P.E. Meunier (pierreetienne.meunier@ens-lyon.fr) >>> >>> _______________________________________________ >>> Caml-list mailing list. Subscription management: >>> http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list >>> Archives: http://caml.inria.fr >>> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners >>> Bug reports: http://caml.inria.fr/bin/caml-bugs >