From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mail1-relais-roc.national.inria.fr (mail1-relais-roc.national.inria.fr [192.134.164.82]) by walapai.inria.fr (8.13.6/8.13.6) with ESMTP id p92F9q8X021788 for ; Sun, 2 Oct 2011 17:09:56 +0200 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AssAAG1+iE7RVdy2mGdsb2JhbABBpX6CIAgiAQEBAQEICQ0HFCWBUwEBAQECARICExkBJRQDAQsBBQULDS4hARIBBQEcBhMih1mbAQqOKYNSiSsCBAaHGwSTYIoegnY9gUiCQQ X-IronPort-AV: E=Sophos;i="4.68,477,1312149600"; d="scan'208";a="122469594" Received: from mail-vx0-f182.google.com ([209.85.220.182]) by mail1-smtp-roc.national.inria.fr with ESMTP/TLS/RC4-SHA; 02 Oct 2011 17:09:56 +0200 Received: by vcbf13 with SMTP id f13so4182754vcb.27 for ; Sun, 02 Oct 2011 08:09:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:content-type :content-transfer-encoding; bh=phAcuP7GaBQTUMs9enMpbNAcIVb0Z1FlejV/5uSRE7E=; b=gBPNotBjOMF99VdR9iE1GSnLu4TrsPMuqSzFFNJBS67u+8exF8IOnY7XfQOoNf7uXn Tf5NAaS+jZwNtGEyrc66732VE+DbQACCtxJZU5Q+SULolDUE85BzUwAhuBcTufTu3z0c O9ysbN/ZZygsw6E3Gy7Kr9PH9ls1MQJeF+VE8= MIME-Version: 1.0 Received: by 10.220.3.18 with SMTP id 18mr3921860vcl.151.1317568195379; Sun, 02 Oct 2011 08:09:55 -0700 (PDT) Sender: eposse@gmail.com Received: by 10.220.195.77 with HTTP; Sun, 2 Oct 2011 08:09:55 -0700 (PDT) In-Reply-To: References: Date: Sun, 2 Oct 2011 11:09:55 -0400 X-Google-Sender-Auth: DD9gajJ9-YYFnyP0DZ2Cu_CRFUs Message-ID: From: Ernesto Posse To: caml-list Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by walapai.inria.fr id p92F9q8X021788 Subject: Re: [Caml-list] How to simplify an arithmetic expression ? > On Sun, Oct 2, 2011 at 1:51 PM, Diego Olivier Fernandez Pons > wrote: >>     OCaml list, >> It's easy to encapsulate a couple of arithmetic simplifications into a >> function that applies them bottom up to an expression represented as a tree >> let rec simplify = function >>     | Plus (e1, e2) -> >>         match (simplify e1, simplify e2) with >>              | (Constant 0, _) -> e2 >> With a couple well known tricks like pushing constants to the left side and >> so on... >> How can I however guarantee that >>     1. My final expression reaches a kind of minimal normal form >>     2. My set of simplifications is optimal in the sense it doesn't traverse >> subtrees without need >> Here is my current simplifier and I have no way of telling if it really >> simplifies the expressions as much as possible and if it does useless passes >> or not >> type expr = >>     | Constant of float >>     | Plus of expr * expr >>     | Minus of expr * expr >>     | Times of expr * expr >>     | Variable of string >> let rec normalForm = function >>     | Minus (e1, e2) -> normalForm (Plus (normalForm e1, Times (Constant >> (-1.0), normalForm e2))) >>     | Plus (e1, e2) -> >>         begin >>         match (normalForm e1, normalForm e2) with >>             | (Constant a, Constant b) -> Constant (a +. b) >>             | (Constant 0.0, e) -> normalForm e >>             | (e, Constant b) -> normalForm (Plus (Constant b, normalForm >> e)) >>             | (Constant a, Plus (Constant b, e)) -> Plus (Constant (a +. b), >> normalForm e) >>             | (Plus (Constant a, e1), Plus (Constant b, e2)) -> Plus >> (Constant (a +. b), normalForm (Plus (normalForm e1, normalForm e2))) >>             | (Variable a, Variable b) when a = b -> Times (Constant 2.0, >> Variable a) >>             | (Times (Constant n, Variable b), Variable a) when a = b -> >> Times (Constant (n +. 1.0), Variable a) >>             | (Variable a, Times (Constant n, Variable b)) when a = b -> >> Times (Constant (n +. 1.0), Variable a) >>             | (Times (Constant n, Variable a), Times (Constant m, Variable >> b)) when a = b -> Times (Constant (n +. m), Variable a) >>             | other -> Plus (e1, e2) >>         end >>     | Times (e1, e2) -> >>         begin >>         match (normalForm e1, normalForm e2) with >>             | (Constant a, Constant b) -> Constant (a *. b) >>             | (Constant 0.0, e) -> Constant 0.0 >>             | (Constant 1.0, e) -> e >>             | (e, Constant b) -> normalForm (Times (Constant b, normalForm >> e)) >>             | (Constant a, Times (Constant b, e)) -> Times (Constant (a *. >> b), e) >>             | other -> Times (e1, e2) >>          end >>     | x -> x >> let (++) = fun x y -> Plus (x, y) >> let ( ** ) = fun x y -> Times (x, y) >> let ( --) = fun x y -> Minus (x, y) >> let f = function fl -> Constant fl >> let x = Variable "x" >> let h = fun i -> f i ** x -- f i ** f i ** x ++ (f 3.0 ++ f i) >> let e = List.fold_left (fun t i -> Plus (t, h i)) (f 0.0) [1.0; 2.0; 3.0; >> 4.0; 5.0] >> normalForm e >> >>         Diego Olivier >> On Sun, Oct 2, 2011 at 10:08 AM, Gabriel Scherer wrote: > In my experience, the OCaml code doing recursive call and pattern > matching is a relatively bad way to reason about such rewrite systems. Why? In general, reduction in (pure) functional languages is rewriting. Diego's function does not seem to have side-effects, so why would this be a bad way? As I understand it, this style is very much one of the reasons why ML was designed this way. If we are talking about optimization, then yes, there may be better ways of doing this, but if we are talking about correctness, readability, and reasoning, then I don't see why this style would be considered bad. > Your questions are extremely pertinent, and relatively difficult to > answer in general. > > For a start, I think your code indeed repeats useless traversals. This > can be seen syntactically by the nesting of two normalForm calls, such > as: > >  | (e, Constant b) -> normalForm (Plus (Constant b, normalForm e)) > > You reduce e to a normal form, then repeat the reduction on some > expression containing e. The outer call will surely re-traverse (the > normal form of) e, which is useless here. Another problem might be that the "rewriting" rules do not seem to be exhaustive although I haven't checked very carefully. > One approach I like for such simplifications is the "normalization by > evaluation" approach. The idea is to define a different representation > of normal forms of your system as "semantic values"; I mean a > representation that has a meaning in itself and not just "what's left > after this arbitrary transformation"; in your case, that could be > multivariate polynomials (defined as an independent datatype). Then > you express your normalization algorithm as an evaluation of your > expression into semantic values; you can reify them back into the > expression datatype, and if you did everything right you get normal > forms (in particular, normalizing a reified value will return exactly > this value). The main difficulty is to understand what are the normal > forms you're looking for; then the code is relatively easy and can be > made efficient. I wonder if such an approach (normalization by evaluation) might be a bit of an overkill. In general, whenever you have an algebraic structure with normal forms, normal forms can be obtained by equational reasoning: using the algebra's laws as rewriting rules. So in principle at least, shouldn't Diego's problem be solvable this way, without the need for a special semantic domain for normal forms? When would the normalization by evaluation approach be preferable? Can you show a small example? -- Ernesto Posse Modelling and Analysis in Software Engineering School of Computing Queen's University - Kingston, Ontario, Canada