From mboxrd@z Thu Jan  1 00:00:00 1970
Received: (from weis@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id MAA01821 for caml-red; Mon, 26 Jun 2000 12:29:29 +0200 (MET DST)
Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id TAA23724 for <caml-list@pauillac.inria.fr>; Sat, 24 Jun 2000 19:58:35 +0200 (MET DST)
Received: from isil.localdomain (adsl-151-201-234-16.bellatlantic.net [151.201.234.16])
	by nez-perce.inria.fr (8.10.0/8.10.0) with ESMTP id e5OHwU117394
	for <caml-list@inria.fr>; Sat, 24 Jun 2000 19:58:34 +0200 (MET DST)
Received: by isil.localdomain (Postfix, from userid 1000)
	id 43A10369DD; Sat, 24 Jun 2000 14:03:11 -0400 (EDT)
To: Markus Mottl <mottl@miss.wu-wien.ac.at>
Cc: caml-list@inria.fr
Subject: Re: Labels and operators
References: <87bt0y71f9.fsf@isil.localdomain>
	<20000624164420.B5164@miss.wu-wien.ac.at>
From: John Prevost <prevost@maya.com>
Date: 24 Jun 2000 14:03:10 -0400
In-Reply-To: Markus Mottl's message of "Sat, 24 Jun 2000 16:44:20 +0200"
Message-ID: <87zoob7y8x.fsf@isil.localdomain>
User-Agent: Gnus/5.0807 (Gnus v5.8.7) Emacs/20.7
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Sender: weis@pauillac.inria.fr

>>>>> "mm" == Markus Mottl <mottl@miss.wu-wien.ac.at> writes:

    mm> Well, sometimes we are really struck by the only
    mm> "two-dimensional" way in which we can write our sources
    mm> (top-down + left-right). If we could write into the depth,
    mm> there would be an elegant solution for adding arguments to
    mm> infix operators...

    >> The only solution I can think of is something like:
    mm> [snip]
    >> # "foo" =~ re "f";;
    >> - : bool = true
    >> # "foo" =~ re "f" ~pos:1;;
    >> - : bool = false
    >> 
    >> Which, well, works, but seems kind of nasty.

    mm> I normally try to avoid new operators, but if I wanted to have
    mm> a somewhat "powered up" version of "=~", your version here
    mm> would look fine to me - just read every piece aloud:

    mm>   "foo" =~ re "f" ~pos:1
    mm>   "foo" - is matched by - regular expression - "f" - at
    mm>   position one

    mm> This is pretty close to the human way of expressing
    mm> things. (Larry Wall, the linguist, would be proud of you! ;-)

Honestly, linguistics is a hobby of mine, and Larry (whom I've met)
doesn't impress me much.  ;>

    >> since the labelled args could not in any way shape or form be
    >> thought to go with either expr1 or expr2.  This would lead to
    >> things like:
    >> 
    >> # "foo" =~ ~pos:1 "f";;
    >> - : bool = false
    >> 
    >> being possible.  Don't know whether it's a great idea, though.

    mm> I prefer your first version: "subject", "verb" and "object"
    mm> are close together, the additional modifiers only follow
    mm> afterwards. To my knowledge, most natural languages would
    mm> order expressions like this.

This is a good point, which I'd forgotten.  Natural languages have the
habit of ordering things so that the main arguments of a word
(whatever sort of word that may be) are always closer to the word than
any optional hangers on.  Essentially, this is to avoid problems like:

I gave the mayor of the northernmost city of the northernmost province
of the northernmost nation a token.

If you draw a picture of the structure of this sentence, you end up
with a big deep tree in between the verb "gave" and "a token", which
is the primary argument of that verb.  If you shift things around a
little, you don't get the problem:

I gave a token to the mayor of ...

So your assertion that most natural languages would order expressions
this way is well founded, unless you happen to know that R->L
languages are actually statistically more common than L->R languages
in the world, and therefore we should be writing:

~pos:1 "f" re =~ "foo";

;>

Anyway, the current syntax is enough.  =~ re ... also means that you
could have =~ other things, which might be useful.  And, if I really
cared, camlp4 could give me the full horror of things like:

"foo" =~ /f/i

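For concreteness, here is a minimal sketch of how the "=~ re ..." form
could be typed in plain OCaml, with no syntax extension.  It is only an
illustration under stated assumptions: the type name "matcher" and the
function "prefix" are hypothetical, and "re" here is a stand-in that
treats its pattern as a literal substring rather than a real regular
expression, so that the example needs no external library.

```ocaml
(* A matcher is just a predicate on the subject string. *)
type matcher = string -> bool

(* Stand-in for a regexp constructor (assumption: [pat] is matched as
   a literal substring, searched for from index [pos], default 0). *)
let re ?(pos = 0) (pat : string) : matcher =
  fun subject ->
    let n = String.length subject and m = String.length pat in
    let rec search i =
      i + m <= n && (String.sub subject i m = pat || search (i + 1))
    in
    pos >= 0 && search pos

(* A second, hypothetical kind of right-hand side for [=~]. *)
let prefix (p : string) : matcher =
  fun subject ->
    String.length p <= String.length subject
    && String.sub subject 0 (String.length p) = p

(* The operator itself only applies the matcher to the subject. *)
let ( =~ ) (subject : string) (m : matcher) : bool = m subject

let () =
  assert ("foo" =~ re "f");
  assert (not ("foo" =~ re "f" ~pos:1));
  assert ("foo" =~ prefix "fo")
```

Function application binds tighter than "=~", so the labelled argument
attaches to "re" and not to the operator.  Because "=~" only expects a
matcher on its right, anything that builds one can appear there.
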
John.