From mboxrd@z Thu Jan 1 00:00:00 1970 Received: (from majordomo@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id PAA06248; Tue, 5 Feb 2002 15:19:30 +0100 (MET) X-Authentication-Warning: pauillac.inria.fr: majordomo set sender to owner-caml-list@pauillac.inria.fr using -f Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id PAA07160 for ; Tue, 5 Feb 2002 15:19:29 +0100 (MET) Received: from HUET-GERA1P (huet-gera2p.inria.fr [128.93.20.113]) by concorde.inria.fr (8.11.1/8.11.1) with SMTP id g15EJSH21401 for ; Tue, 5 Feb 2002 15:19:28 +0100 (MET) Message-Id: <200202051419.g15EJSH21401@concorde.inria.fr> X-Sender: huet@pop-rocq.inria.fr X-Mailer: QUALCOMM Windows Eudora Pro Version 4.0.2 Date: Tue, 05 Feb 2002 15:18:57 +0100 To: caml-list@inria.fr From: Gerard Huet Subject: [Caml-list] Syntax Mime-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable Sender: owner-caml-list@pauillac.inria.fr Precedence: bulk At 13:05 05/02/02 +0100, Daniel de Rauglaudre wrote:=20 >Hi,=20 >=20 >On Tue, Feb 05, 2002 at 12:53:39PM +0100, Remi VANICAT wrote:=20 >=20 >> So the fact that a library or a tool is written in revised or=20 >> standard library doesn't change anything for those who use it. The=20 >> only problem is for those who want to change the source, but it's=20 >> not so hard to learn this syntax if you want to use it.=20 >=20 >Exactly. Yessir.=20 It is trivial to switch to revised syntax. It will take you one week to=20 get used to True/False vs true/false, to use square brackets in [x :: l],=20 and to write (list int) instead of (int list). I never missed the double=20 semicolon, and many problems will ill-parenthesised matches just went into= =20 oblivion. What I missed most is that you can't anymore step through the=20 toplevel with mouse cut-and-paste, because the inner let's are not accepted= =20 at toplevel, you have to write the godamn "value" keyword instead. Until recently there was no point in pushing the revised syntax, because it= =20 was very hard to teach the language when the system's printer used another= =20 syntax than what you typed in. Now that there is smooth integration of=20 camlp4 with ocaml with the 3.04, there is no excuse not to use the much=20 superior revised syntax, in my opinion. Which does not mean that there=20 should be a big concerted effort to switch all our code from one syntax to= =20 the next.=20 I am currently developing a computational linguistics platform in ocaml, now around 12000 loc in about 50 modules, mostly in revised syntax, some others= =20 borrowed from other sources or mechanically produced and still in vanila=20 syntax, so what. The makefile does what is needed, and I never have any=20 problem reading other people's programs from libraries or the hump, etc.=20 The crucial point is that we need good tutorials, reference manuals, and books=20 in the revised syntax before being serious about "standardizing" in= something=20 else than the usual syntax. Once this material exists, then we can talk.=20 Let me tell you about an experience I did recently, in using Ocaml as a=20 publication language. In the first version of my paper, I said "we shall=20 use as algorithmic meta-language OCaml under so-called revised syntax".=20 Now one referee got very interested in the code, tried it on his machine with ocaml 3.02, missed my remark about revised syntax, and said "it is too bad these are not real programs which people can directly use". Another referee got completely turned off by the code and said "remove all proselytism about this weird Ocaml stuff, and use some pidgin algorithmic language". Typical dilemna.=20 What I ended up was writing a short presentation of an algorithmic=20 meta-language which I called "Pidgin ML", without any proselytism about=20 existing programming languages; it is only in the evaluation part of the=20 paper that I reveal that Pidgin ML can be compiled and executed as such by OCaml+Camlp4. So everybody is happy ! Just as a short plug for revised syntax, here is my introduction: ____________________________________ We shall use as {\sl meta language} for the description of our algorithms=20 a pidgin version of the functional language ML. Readers familiar with ML=20 may skip this section, which gives a crash overview of its syntax and=20 semantics. The core language has types, values, and exceptions.=20 Thus, \verb:1: is a value of predefined type \verb:int:, whereas=20 \verb:"CL": is a \verb:string:.=20 Pairs of values inhabit the corresponding product type. Thus:=20 \verb|(1,"CL") : (nat * string)|.=20 Recursive type declarations create new types,=20 whose values are inductively built from the associated constructors.=20 Thus the Boolean type could be declared as a sum by:=20 \verb:type bool =3D [True | False];:\\=20 Parametric types give rise to polymorphism.=20 Thus if \verb:x: is of type \verb:t: and \verb:l: is of type=20 \verb:(list t):, we construct the list adding \verb:x: to \verb:l:=20 as \verb|[x :: l]|. The empty list is \verb:[]:, of (polymorphic) type=20 \verb:(list 'a):. Although the language is strongly typed, explicit type=20 specification is rarely needed from the designer, since principal types=20 may be inferred mechanically. The language is functional in the sense that functions are first class=20 objects. Thus the doubling integer function may be written as=20 \verb:fun x -> x+x:, and it has type \verb:int -> int:. It may be associated= =20 to the name \verb:double: by declaring: \verb:value double =3D fun x ->= x+x;:\\=20 Equivalently we could write: \verb:value double x =3D x+x;:\\=20 Its application to value \verb:n: is written as \verb:(double n): or even=20 \verb:double n: when there is no ambiguity. Application associates to the=20 left, and thus \verb:f x y: stands for \verb:((f x) y):.=20 Recursive functional values are declared with the keyword \verb:rec:.=20 Thus we may define the factorial function as:\\=20 \verb:value rec fact n =3D n*(fact (n-1));:\\=20 Functions may be defined by pattern matching. Thus the first projection of= =20 pairs could be defined by:\\=20 \verb:value fst =3D fun [ (x,y) -> x ];:\\=20 or equivalently (since there is only one pattern in this case) by:\\=20 \verb:value fst (x,y) =3D x;:\\=20 Pattern-matching is also usable in \verb:match: expressions which generalise= =20 case analysis,=20 such as: \verb:match l with [ [] -> True | _ -> False ]:, which=20 tests whether list \verb:l: is empty, using underscore as catch-all=20 pattern. Evaluation is strict, which means that \verb:x: is evaluated before=20 \verb:f: in the evaluation of \verb:(f x):. The \verb:let: expressions=20 permit to sequentialise computation, and to share sub-computations. Thus=20 \verb:let x =3D fact 10 in x+x: will compute \verb:fact 10: first,=20 and only once.=20 An equivalent postfix \verb:where: notation may be used as well. Thus=20 the conditional expression \verb:if b then e1 else e2: is equivalent to:=20 \verb:choose b where choose =3D fun [ True -> e1 | False -> e2]:. Exceptions are declared with the type of their parameters, like in:=20 \verb:exception Failure of string;: An exceptional value may be raised, like in:=20 \verb:raise (Failure "div 0"): and handled by a \verb:try: switching on=20 exception patterns, such as: \verb:try expression with [ Failure s -> ... ]:. Other imperative constructs may be used, such as=20 references, mutable arrays, while loops and I/O commands,=20 but we shall seldom need them. Sequences of instructions are=20 evaluated in left to right regime in \verb:do: expressions, such as:=20 \verb:do {e1; ... en}:.=20 ML is a {\sl modular} language, in the sense that sequences of type, value= =20 and exception declarations may be packed in a structural unit called a=20 \verb:module:, amenable to separate treatment.=20 Modules have types themselves, called {\sl signatures}. Parametric=20 modules are called {\sl functors}. The algorithms presented in this paper=20 will use in essential ways=20 this modularity structure, but the syntax ought to be self-evident. Readers uninterested in computational details may think of ML=20 definitions as recursive equations over inductively defined algebras. Most= =20 of them are simple primitive recursive functionals.=20 ___________________________________________________ At this point, note that Pidgin ML has no objects (not even records!) nor labels. And later on I spill the beans: Pidgin ML definitions may actually be directly executed as Objective Caml=20 programs \cite{ocaml}, under the so-called revised syntax \cite{camlp4}.=20 >> By the way, is there any caml-mode for Emacs and the revised syntax ?=20 >=20 >I don't think so but there is a request for that (G=E9rard Huet asked=20 >me yesterday). I use a very old caml-light-or-what emacs mode and I=20 >am too lazy to look at emacs-lisp to create my mode. This is an important point. I myself use a slighly hacked version of Tuareg= =20 as an interim solution. I shall have a look at otags, which, being built=20 with camlp4 support, ought to be parametrizable (?). I suggest this syntax problem should be seriously considered, but as a=20 long-term effort, encompassing development tools and documentation and=20 training material. This is not a battle that can be won by one round of=20 email flame. Gerard ------------------- Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr