From mboxrd@z Thu Jan 1 00:00:00 1970 Received: (from weis@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id XAA00529 for caml-red; Wed, 9 Aug 2000 23:57:41 +0200 (MET DST) Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id TAA29913 for ; Wed, 9 Aug 2000 19:45:22 +0200 (MET DST) Received: from pauillac.inria.fr (pauillac.inria.fr [128.93.11.35]) by concorde.inria.fr (8.10.0/8.10.0) with ESMTP id e79HjJD17388; Wed, 9 Aug 2000 19:45:20 +0200 (MET DST) Received: (from xleroy@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id TAA29499; Wed, 9 Aug 2000 19:45:19 +0200 (MET DST) Message-ID: <20000809194519.27296@pauillac.inria.fr> Date: Wed, 9 Aug 2000 19:45:19 +0200 From: Xavier Leroy To: Georges Mariano , Markus Mottl , "caml-list@inria.fr" Subject: Re: tiny toplevel References: <85256935.0059D0CD.00@D51MTA04.pok.ibm.com> <20000809132941.A16537@miss.wu-wien.ac.at> <39914EAB.6506B2F5@inrets.fr> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Mailer: Mutt 0.89.1 In-Reply-To: <39914EAB.6506B2F5@inrets.fr>; from Georges Mariano on Wed, Aug 09, 2000 at 12:29:31PM +0000 Sender: weis@pauillac.inria.fr > Statitics are good but your conclusion is wrong because > who said that "embedded" interpreters are "standard" interpreters ?? Well, in your initial message, it wasn't clear what you meant by "embedded interpreter": 1- This can mean an interpreter for a scripting language embedded in a larger application, e.g. TCL, Perl or PHP in Apache, Basic in MS Office, etc. 2- This can also mean a bytecode interpreter and runtime system running in an embedded computer. The requirements are quite different in both cases. For instance, scripts are generally presented as source code, so, for OCaml, 1- corresponds roughly to the OCaml toplevel (ocaml). On the other hand, embedded computers generally run binary code or object code, so 2- corresponds only the the OCaml bytecode interpreter and runtime system (ocamlrun). > Obviously this is not the case, and taking Java as an example is > also wrong because "embedded JAVA" is not JAVA but somthing close > to JavaCard (in the SmartCards **specific** context), > so different constraints, specifications, and language > > Suppose that you are able to define a JAVA language subset > wich is small enough to be embedded in, say, smartcards, > but in the same time, you're not able to define the same subset for Ocaml > (recall, it's a supposition!! :-) > => you can't have OScard (Ocaml for Smart Cards :-) > despite the comparison we made on "initial" interpreters... I'm glad you mentioned smartcarts and JavaCard, because I have extensive, first-hand experience with these. There's a lot to be said here, but I'll try to focus on a few key points. First, the main issue with smart cards (and to a lesser extent with other embedded systems) isn't code size, but data size. Smart cards usually come with a lot more ROM (for storing the bytecode interpreter and runtime system) than EEPROM (nonvolatile memory for storing bytecode programs and persistent data), and ridiculously small amounts of RAM (128 bytes to 1 Kbyte) for storing the stack and short-lived data. It's no big deal to fit a JavaCard virtual machine in the ROM space (been there, done that), and it wouldn't be hard to fit a Caml virtual machine either. The real issues are compactness of bytecode, heap management, and stack management. For instance, JavaCard allocates all its objects in EEPROM, and doesn't have GC *at all*; this means applications allocate few objects, and only at program initialization time; the program should then run in constant space, updating in-place the preallocated data. The small RAM size also means that the stack must stay very small, prohibiting any kind of recursive programming. So, even though JavaCard looks like a pretty large subset of Java, the programming style of JavaCard developers is radically different from that of Java programmers, and resembles a lot traditional embedded C style (i.e. no dynamic allocation at all). In my opinion, the smart card market would have been a lot better with some kind of safe C, or Pascal, or Forth, rather than Java; but they needed a bytecode interpreter and type safety, so of course they turned to Java. Now, could you do the same with Caml? Almost -- maybe with a different implementation strategy for closures, so that a function definition doesn't systematically allocate if it is closed. But the subset of the language in which you'd have to program would be ugly and most of the advantages of Caml would be lost. The situation is quite different for other embedded devices, e.g. mobile phones, debit card machines at gas stations, electronic fuel injection systems, etc. These are quite respectable computers, with 16 or 32-bit processors (Intel 8086, Motorola 68000, Motorola PPC), and memory in the 100k-1M range. In the early 90's, Caml Light ran on machines no more powerful than this (640k PC, 1M Mac Plus, not to mention the 64k partitions of the Palm Pilot in the late 90's). Granted, OCaml has moved away from this kind of platform and towards modern desktop machines. For instance, the OCaml bytecode is 3.5 times bigger than the Caml Light bytecode, in exchange for faster interpretation. Going the other way would not be a big problem; it's just that there is no need yet. > If I understand P. Weis, one thing is to remove Object Programming from > OCaml, then you have something close to CamlLight toplevel, ok. In defense of OCaml's objects, they add very little complexity to the bytecode interpreter and runtime system: *one* bytecode instruction, and a bytecode library. They do add a lot of complexity to the typechecker and compiler, though. > In the context of an embedded system you may remove I/O filesystem > functions ?? (I don't know exactly what is an embedded system...) > and what else ?? You'd probably remove the filesystem functions, yes, and replace them with I/O functions tailored to what's connected to the embedded system (e.g. serial port on a smart card; small keyboard and small screen on a debit card machine; etc). Also, you could remove everything that has to do with dynamically loading the bytecode from a file; the bytecode is usually (but not always) already there in your memory space. A lot of portability crutf can go away, because you know exactly the machine. E.g. don't worry about malloc() returning non-contiguous memory chunks; just use your memory allocator. Etc, etc. On the other hand, there are things that the OS does for you on a desktop computer that you may have to do by hand: interrupt handling, talking directly to the peripherals, dealing with unreliable EEPROM writes, etc. It's quite a different world from programming desktop systems... - Xavier Leroy