Re: [Caml-list] How to simplify an arithmetic expression ?

Mailing list for all users of the OCaml language and system.
 help / color / mirror / Atom feed

From: Ernesto Posse <eposse@cs.queensu.ca>
To: caml-list <caml-list@inria.fr>
Subject: Re: [Caml-list] How to simplify an arithmetic expression ?
Date: Sun, 2 Oct 2011 11:09:55 -0400	[thread overview]
Message-ID: <CABY2naxBqwtD2NVzz+kUVOix-FJ44-uYkjX9+hph4D3yJBUAuQ@mail.gmail.com> (raw)
In-Reply-To: <CAPFanBE6RX4y_VQuaTJx40DJ=TFY9y_4PeLkfJ3d4Py=+X9Ckw@mail.gmail.com>

> On Sun, Oct 2, 2011 at 1:51 PM, Diego Olivier Fernandez Pons
> <dofp.ocaml@gmail.com> wrote:
>>     OCaml list,
>> It's easy to encapsulate a couple of arithmetic simplifications into a
>> function that applies them bottom up to an expression represented as a tree
>> let rec simplify = function
>>     | Plus (e1, e2) ->
>>         match (simplify e1, simplify e2) with
>>              | (Constant 0, _) -> e2
>> With a couple well known tricks like pushing constants to the left side and
>> so on...
>> How can I however guarantee that
>>     1. My final expression reaches a kind of minimal normal form
>>     2. My set of simplifications is optimal in the sense it doesn't traverse
>> subtrees without need
>> Here is my current simplifier and I have no way of telling if it really
>> simplifies the expressions as much as possible and if it does useless passes
>> or not
>> type expr =
>>     | Constant of float
>>     | Plus of expr * expr
>>     | Minus of expr * expr
>>     | Times of expr * expr
>>     | Variable of string
>> let rec normalForm = function
>>     | Minus (e1, e2) -> normalForm (Plus (normalForm e1, Times (Constant
>> (-1.0), normalForm e2)))
>>     | Plus (e1, e2) ->
>>         begin
>>         match (normalForm e1, normalForm e2) with
>>             | (Constant a, Constant b) -> Constant (a +. b)
>>             | (Constant 0.0, e) -> normalForm e
>>             | (e, Constant b) -> normalForm (Plus (Constant b, normalForm
>> e))
>>             | (Constant a, Plus (Constant b, e)) -> Plus (Constant (a +. b),
>> normalForm e)
>>             | (Plus (Constant a, e1), Plus (Constant b, e2)) -> Plus
>> (Constant (a +. b), normalForm (Plus (normalForm e1, normalForm e2)))
>>             | (Variable a, Variable b) when a = b -> Times (Constant 2.0,
>> Variable a)
>>             | (Times (Constant n, Variable b), Variable a) when a = b ->
>> Times (Constant (n +. 1.0), Variable a)
>>             | (Variable a, Times (Constant n, Variable b)) when a = b ->
>> Times (Constant (n +. 1.0), Variable a)
>>             | (Times (Constant n, Variable a), Times (Constant m, Variable
>> b)) when a = b -> Times (Constant (n +. m), Variable a)
>>             | other -> Plus (e1, e2)
>>         end
>>     | Times (e1, e2) ->
>>         begin
>>         match (normalForm e1, normalForm e2) with
>>             | (Constant a, Constant b) -> Constant (a *. b)
>>             | (Constant 0.0, e) -> Constant 0.0
>>             | (Constant 1.0, e) -> e
>>             | (e, Constant b) -> normalForm (Times (Constant b, normalForm
>> e))
>>             | (Constant a, Times (Constant b, e)) -> Times (Constant (a *.
>> b), e)
>>             | other -> Times (e1, e2)
>>          end
>>     | x -> x
>> let (++) = fun x y -> Plus (x, y)
>> let ( ** ) = fun x y -> Times (x, y)
>> let ( --) = fun x y -> Minus (x, y)
>> let f = function fl -> Constant fl
>> let x = Variable "x"
>> let h = fun i -> f i ** x -- f i ** f i ** x ++ (f 3.0 ++ f i)
>> let e = List.fold_left (fun t i -> Plus (t, h i)) (f 0.0) [1.0; 2.0; 3.0;
>> 4.0; 5.0]
>> normalForm e
>>
>>         Diego Olivier
>>


On Sun, Oct 2, 2011 at 10:08 AM, Gabriel Scherer
<gabriel.scherer@gmail.com> wrote:
> In my experience, the OCaml code doing recursive call and pattern
> matching is a relatively bad way to reason about such rewrite systems.

Why? In general, reduction in (pure) functional languages is
rewriting. Diego's function does not seem to have side-effects, so why
would this be a bad way?

As I understand it, this style is very much one of the reasons why ML
was designed this way. If we are talking about optimization, then yes,
there may be better ways of doing this, but if we are talking about
correctness, readability, and reasoning, then I don't see why this
style would be considered bad.

> Your questions are extremely pertinent, and relatively difficult to
> answer in general.
>
> For a start, I think your code indeed repeats useless traversals. This
> can be seen syntactically by the nesting of two normalForm calls, such
> as:
>
>  | (e, Constant b) -> normalForm (Plus (Constant b, normalForm e))
>
> You reduce e to a normal form, then repeat the reduction on some
> expression containing e. The outer call will surely re-traverse (the
> normal form of) e, which is useless here.

Another problem might be that the "rewriting" rules do not seem to be
exhaustive although I haven't checked very carefully.


> One approach I like for such simplifications is the "normalization by
> evaluation" approach. The idea is to define a different representation
> of normal forms of your system as "semantic values"; I mean a
> representation that has a meaning in itself and not just "what's left
> after this arbitrary transformation"; in your case, that could be
> multivariate polynomials (defined as an independent datatype). Then
> you express your normalization algorithm as an evaluation of your
> expression into semantic values; you can reify them back into the
> expression datatype, and if you did everything right you get normal
> forms (in particular, normalizing a reified value will return exactly
> this value). The main difficulty is to understand what are the normal
> forms you're looking for; then the code is relatively easy and can be
> made efficient.

I wonder if such an approach (normalization by evaluation) might be a
bit of an overkill. In general, whenever you have an algebraic
structure with normal forms, normal forms can be obtained by
equational reasoning: using the algebra's laws as rewriting rules. So
in principle at least, shouldn't Diego's problem be solvable this way,
without the need for a special semantic domain for normal forms? When
would the normalization by evaluation approach be preferable? Can you
show a small example?




-- 
Ernesto Posse

Modelling and Analysis in Software Engineering
School of Computing
Queen's University - Kingston, Ontario, Canada

next prev parent reply	other threads:[~2011-10-02 15:09 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-10-02 11:51 Diego Olivier Fernandez Pons
2011-10-02 14:08 ` Gabriel Scherer
2011-10-02 15:09   ` Ernesto Posse [this message]
2011-10-02 16:32     ` Xavier Leroy
2011-10-02 16:48       ` Gabriel Scherer
2011-10-02 17:07         ` Gabriel Scherer
2011-10-05 13:17           ` Goswin von Brederlow
2011-10-05 21:31             ` Gabriel Scherer
2011-10-02 16:56 ` Xavier Clerc
2011-10-05 13:09 ` Goswin von Brederlow

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CABY2naxBqwtD2NVzz+kUVOix-FJ44-uYkjX9+hph4D3yJBUAuQ@mail.gmail.com \
    --to=eposse@cs.queensu.ca \
    --cc=caml-list@inria.fr \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox