From: Xavier Clerc <xcforum@free.fr>
To: Pierre-Etienne Meunier <pierreetienne.meunier@ens-lyon.fr>
Subject: Re: [Caml-list] Bytecode object files structure
Date: Wed, 15 Nov 2006 14:41:28 +0100
Message-ID: <F47BC584-F785-4C62-9F12-B544AF865927@free.fr>
On 13 Nov. 2006, at 16:50, Pierre-Etienne Meunier wrote:
> Hello,
>
> I'd like to write an assembler, to be able to understand how the vm
> really works. I've to work on this for a school project (a compiler,
> I want it to output caml bytecode object files).
If you are working on a compiler whose output should be executed by
the ocaml runtime, it does not seem necessary to handle cmo/cmi files:
the bytecode file format should be sufficient for your compiler,
unless you have to link with ocaml modules.
> I've understood that the data part, after the code itself, was
> generated using output_value (I didn't know this function before).
This function belongs with the Marshal module: output_value is
equivalent to Marshal.to_channel with an empty list of flags. It
transforms any non-abstract value into a sequence of bytes.
The marshalling format can be understood from the extern_rec
function in the byterun/extern.c file.
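As a minimal sketch of what marshalling does (the value v and the temporary file are only illustrations, not anything taken from the compiler), a value can be turned into bytes and read back like this:

```ocaml
(* An arbitrary structured value used for illustration. *)
let v = (42, "hello", [1.5; 2.5])

(* Marshal the value into a byte buffer, with an empty list of flags. *)
let buf = Marshal.to_bytes v []

(* The buffer is a fixed-size header followed by the data part. *)
let () = assert (Marshal.total_size buf 0 = Bytes.length buf)

(* Unmarshalling is not type-checked: the annotation is a promise
   made by the caller. *)
let (v' : int * string * float list) = Marshal.from_bytes buf 0
let () = assert (v' = v)

(* The same round trip through a file, using output_value and
   input_value. *)
let roundtrip_via_file () =
  let file = Filename.temp_file "marshal" ".bin" in
  let oc = open_out_bin file in
  output_value oc v;
  close_out oc;
  let ic = open_in_bin file in
  let (w : int * string * float list) = input_value ic in
  close_in ic;
  Sys.remove file;
  w

let () = assert (roundtrip_via_file () = v)
```

Since the reader has to know the expected type in advance, a tool that decodes the data part of a cmo file needs the type definitions from the compiler sources.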
> What I don't get now are the cu_reloc, cu_primitives and cu_imports
> fields of the compilation_unit type.
You should remember that cmo files are parts that will be put
together (linked) in order to create a bytecode file.
Given this context:
- cu_imports lists the names of the imported (used) modules that the
current cmo should be linked with in order to produce a bytecode file
(the digest of each imported module is also kept, to ensure that you
link with the same version you compiled against);
- cu_primitives lists the primitives declared by the current module
(each 'external f : type1 -> type2 = "primitive"' will result in a
"primitive" entry in this list); it is needed to ensure that all
required C primitives are provided;
- cu_reloc: as each module is compiled independently, it can declare
some elements (e.g. global variables) and refer to them using a
0-based index; thus, when you link several modules together, you have
to relocate this information to ensure that the first module uses
indexes from 0 to n, the second module uses indexes from n+1 to n+m,
and so on.
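To make the roles of these three fields more concrete, here is a toy model of linking. Everything in it is hypothetical (the toy_unit record, its field names, the integer "bytecode"), and it is much simpler than the compiler's real compilation_unit type; it only illustrates the digest check, the primitive check and the index shifting described above.

```ocaml
(* Toy stand-in for a compiled unit; not the compiler's real type. *)
type toy_unit = {
  name : string;
  imports : (string * Digest.t) list; (* module name, interface digest *)
  primitives : string list;           (* C primitives declared via external *)
  globals : int;                      (* number of global slots used *)
  code : int array;                   (* toy "bytecode" *)
  reloc : int list;                   (* offsets in code holding a global index *)
}

(* Fail if an import's digest differs from the one recorded in [known]. *)
let check_imports known u =
  List.iter
    (fun (m, d) ->
      match List.assoc_opt m known with
      | Some d' when d' <> d ->
          failwith (u.name ^ ": inconsistent digest for " ^ m)
      | _ -> ())
    u.imports

(* Fail if a declared primitive is not in the [available] list. *)
let check_primitives available u =
  List.iter
    (fun p ->
      if not (List.mem p available) then
        failwith (u.name ^ ": missing primitive " ^ p))
    u.primitives

(* Concatenate the code of all units, shifting each unit's 0-based
   global indexes by the number of globals used by earlier units. *)
let link units =
  let _, chunks =
    List.fold_left
      (fun (base, acc) u ->
        let code = Array.copy u.code in
        List.iter (fun off -> code.(off) <- code.(off) + base) u.reloc;
        (base + u.globals, code :: acc))
      (0, []) units
  in
  Array.concat (List.rev chunks)

(* Two toy units: A uses globals 0 and 1, B uses its own global 0. *)
let a = { name = "A"; imports = []; primitives = []; globals = 2;
          code = [| 10; 0; 11; 1 |]; reloc = [1; 3] }
let b = { name = "B"; imports = [("A", Digest.string "a.cmi")];
          primitives = ["caml_ml_output"]; globals = 1;
          code = [| 10; 0 |]; reloc = [1] }

(* After linking, B's global 0 has become global 2. *)
let linked = link [a; b]
let () = assert (linked = [| 10; 0; 11; 1; 10; 2 |])
```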
Hope this helps,
Xavier Clerc
PS: I am working on some documents describing the marshalling format,
bytecode files, and instruction opcodes.
I will hopefully release them before Christmas, but don't hold your
breath as I don't have much spare time these days.
In the meantime, you can contact me off-list with any related question.
>
> If you can help on this,
> Thanks
> P.E. Meunier
>
> On Monday 13 November 2006 11:53, you wrote:
>> Hello,
>>
>> As I have read a substantial part of the ocaml source code, I may
>> be able to help you understand the file formats.
>> Could you be more precise about what you are particularly interested
>> in:
>> - file type: bytecode file, cmo file, cmi file?
>> - code or data section of these files?
>>
>> May I also ask what you are trying to do with these elements?
>>
>>
>> Cordially,
>>
>> Xavier Clerc
>>
>> On 12 Nov. 2006, at 15:42, Pierre-Etienne Meunier wrote:
>>> Hi,
>>>
>>> I'm trying to decrypt .cmo files produced by simple programs,
>>> such as
>>> 1+1;;
>>> or
>>> print_string "string";;
>>> or
>>> List.length [1;2;3;4;5];;
>>>
>>> According to the source of Ocaml, there's something called the
>>> "cmo_magic_number", systematically written at the beginning of
>>> all .cmo files. Does it have a real function for executing the
>>> programs, or is it just a way to make sure the file contains ocaml
>>> bytecode?
>>>
>>> Then, there's the address of what seems to be the last bytecode
>>> instruction.
>>> Then, the bytecode instructions, as documented in opcodes.ml.
>>>
>>> After that, I can't understand anything: there vaguely seems to be
>>> some information related to linking or so... What is the precise
>>> structure of this part? Is there some kind of a bytecode assembler?
>>>
>>> Thanks,
>>> P.E. Meunier (pierreetienne.meunier@ens-lyon.fr)