From mboxrd@z Thu Jan  1 00:00:00 1970
Received: (from weis@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id XAA00529 for caml-red; Wed, 9 Aug 2000 23:57:41 +0200 (MET DST)
Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id TAA29913 for <caml-list@pauillac.inria.fr>; Wed, 9 Aug 2000 19:45:22 +0200 (MET DST)
Received: from pauillac.inria.fr (pauillac.inria.fr [128.93.11.35])
	by concorde.inria.fr (8.10.0/8.10.0) with ESMTP id e79HjJD17388;
	Wed, 9 Aug 2000 19:45:20 +0200 (MET DST)
Received: (from xleroy@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id TAA29499; Wed, 9 Aug 2000 19:45:19 +0200 (MET DST)
Message-ID: <20000809194519.27296@pauillac.inria.fr>
Date: Wed, 9 Aug 2000 19:45:19 +0200
From: Xavier Leroy <Xavier.Leroy@inria.fr>
To: Georges Mariano <georges.mariano@inrets.fr>,
        Markus Mottl <mottl@miss.wu-wien.ac.at>,
        "caml-list@inria.fr" <caml-list@inria.fr>
Subject: Re: tiny toplevel
References: <85256935.0059D0CD.00@D51MTA04.pok.ibm.com> <20000809132941.A16537@miss.wu-wien.ac.at> <39914EAB.6506B2F5@inrets.fr>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Mutt 0.89.1
In-Reply-To: <39914EAB.6506B2F5@inrets.fr>; from Georges Mariano on Wed, Aug 09, 2000 at 12:29:31PM +0000
Sender: weis@pauillac.inria.fr

> Statitics are good but your conclusion is wrong because
> who said that "embedded" interpreters are "standard" interpreters ??

Well, in your initial message, it wasn't clear what you meant by
"embedded interpreter":

1- This can mean an interpreter for a scripting language embedded in a
larger application, e.g. TCL, Perl or PHP in Apache, Basic in MS
Office, etc.

2- This can also mean a bytecode interpreter and runtime system running
in an embedded computer.

The requirements are quite different in both cases.  For instance,
scripts are generally presented as source code, so, for OCaml, 1-
corresponds roughly to the OCaml toplevel (ocaml).  On the other hand,
embedded computers generally run binary code or object code, so 2-
corresponds only the the OCaml bytecode interpreter and runtime system
(ocamlrun).

> Obviously this is not the case, and taking Java as an example is
> also wrong because "embedded JAVA" is not JAVA but somthing close
> to JavaCard (in the SmartCards **specific** context), 
> so different constraints, specifications, and language
>  
> Suppose that you are able to define a JAVA language subset
> wich is small enough to be embedded in, say, smartcards,
> but in the same time, you're not able to define the same subset for Ocaml
> (recall, it's a supposition!! :-)
> => you can't have OScard (Ocaml for Smart Cards :-)
> despite the comparison we made on "initial" interpreters...

I'm glad you mentioned smartcarts and JavaCard, because I have
extensive, first-hand experience with these.  There's a lot to be said
here, but I'll try to focus on a few key points.

First, the main issue with smart cards (and to a lesser extent with
other embedded systems) isn't code size, but data size.  Smart cards
usually come with a lot more ROM (for storing the bytecode interpreter
and runtime system) than EEPROM (nonvolatile memory for storing
bytecode programs and persistent data), and ridiculously small amounts
of RAM (128 bytes to 1 Kbyte) for storing the stack and short-lived data.

It's no big deal to fit a JavaCard virtual machine in the ROM space
(been there, done that), and it wouldn't be hard to fit a Caml virtual
machine either.  The real issues are compactness of bytecode, heap
management, and stack management.  For instance, JavaCard allocates
all its objects in EEPROM, and doesn't have GC *at all*; this means
applications allocate few objects, and only at program initialization
time; the program should then run in constant space, updating in-place
the preallocated data.  The small RAM size also means that the stack
must stay very small, prohibiting any kind of recursive programming.

So, even though JavaCard looks like a pretty large subset of Java, the
programming style of JavaCard developers is radically different from
that of Java programmers, and resembles a lot traditional embedded C style
(i.e. no dynamic allocation at all).  In my opinion, the smart
card market would have been a lot better with some kind of safe C, or
Pascal, or Forth, rather than Java; but they needed a bytecode
interpreter and type safety, so of course they turned to Java.

Now, could you do the same with Caml?  Almost -- maybe with a
different implementation strategy for closures, so that a function
definition doesn't systematically allocate if it is closed.  But the
subset of the language in which you'd have to program would be ugly
and most of the advantages of Caml would be lost.

The situation is quite different for other embedded devices,
e.g. mobile phones, debit card machines at gas stations, electronic
fuel injection systems, etc.  These are quite respectable computers,
with 16 or 32-bit processors (Intel 8086, Motorola 68000, Motorola
PPC), and memory in the 100k-1M range.  In the early 90's, Caml Light
ran on machines no more powerful than this (640k PC, 1M Mac Plus, not
to mention the 64k partitions of the Palm Pilot in the late 90's).

Granted, OCaml has moved away from this kind of platform and towards
modern desktop machines.  For instance, the OCaml bytecode is 3.5
times bigger than the Caml Light bytecode, in exchange for faster
interpretation.  Going the other way would not be a big problem; it's
just that there is no need yet.

> If I understand P. Weis, one thing is to remove Object Programming from
> OCaml, then you have something close to CamlLight toplevel, ok.

In defense of OCaml's objects, they add very little complexity to the
bytecode interpreter and runtime system: *one* bytecode instruction,
and a bytecode library.  They do add a lot of complexity to the
typechecker and compiler, though.

> In the context of an embedded system you may remove I/O filesystem
> functions ?? (I don't know exactly what is an embedded system...)
> and what else ??

You'd probably remove the filesystem functions, yes, and replace them
with I/O functions tailored to what's connected to the embedded system
(e.g. serial port on a smart card; small keyboard and small screen on
a debit card machine; etc).  Also, you could remove everything that
has to do with dynamically loading the bytecode from a file; the
bytecode is usually (but not always) already there in your memory
space.  A lot of portability crutf can go away, because you know
exactly the machine.  E.g. don't worry about malloc() returning
non-contiguous memory chunks; just use your memory allocator.  Etc, etc.

On the other hand, there are things that the OS does for you on a
desktop computer that you may have to do by hand: interrupt handling,
talking directly to the peripherals, dealing with unreliable EEPROM
writes, etc.

It's quite a different world from programming desktop systems...

- Xavier Leroy