From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <jon@ffconsultancy.com>
X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr
X-Spam-Level: 
X-Spam-Status: No, score=0.0 required=5.0 tests=AWL autolearn=disabled 
	version=3.1.3
Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39])
	by yquem.inria.fr (Postfix) with ESMTP id 69C17BC6B
	for <caml-list@yquem.inria.fr>; Thu, 28 Jun 2007 14:14:02 +0200 (CEST)
Received: from pih-relay04.plus.net (pih-relay04.plus.net [212.159.14.131])
	by concorde.inria.fr (8.13.6/8.13.6) with ESMTP id l5SCE1x1014122
	(version=TLSv1/SSLv3 cipher=AES256-SHA bits=256 verify=NO)
	for <caml-list@yquem.inria.fr>; Thu, 28 Jun 2007 14:14:02 +0200
Received: from [80.229.56.224] (helo=beast.local)
	 by pih-relay04.plus.net with esmtp (Exim) id 1I3ssp-0005QJ-1x
	for caml-list@yquem.inria.fr; Thu, 28 Jun 2007 13:13:59 +0100
From: Jon Harrop <jon@ffconsultancy.com>
Organization: Flying Frog Consultancy Ltd.
To: caml-list@yquem.inria.fr
Subject: Re: [Caml-list] The Implicit Accumulator: a design pattern using optional arguments
Date: Thu, 28 Jun 2007 13:08:13 +0100
User-Agent: KMail/1.9.7
References: <200706271314.35134.jon@ffconsultancy.com> <200706281232.01643.jon@ffconsultancy.com> <58DA3107-BFD2-4ADF-A903-2CB63C6D29C2@gmail.com>
In-Reply-To: <58DA3107-BFD2-4ADF-A903-2CB63C6D29C2@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Message-Id: <200706281308.13958.jon@ffconsultancy.com>
X-Miltered: at concorde with ID 4683A609.000 by Joe's j-chkmail (http://j-chkmail . ensmp . fr)!
X-Spam: no; 0.00; rax:01 rax:01 globl:01 globl:01 ocaml:01 ocaml:01 $1,:98 $1,:98 frog:98 wrote:01 caml-list:01 nums:01 nums:01 reorder:02 match:02 

On Thursday 28 June 2007 12:42:57 Joel Reymont wrote:
> Where does the 65% speed-up come from?

Good question.

> Just from using match?

Yes, or you can reorder the branches of the "if" statement, putting the common 
branch first.

My code gives:

camlTest__work_58:
.L101:
        cmpq    $1, %rbx
        je      .L100
        movq    %rbx, %rdi
        addq    $-2, %rdi
        leaq    -1(%rax, %rbx), %rax
        movq    %rdi, %rbx
        jmp     .L101
        .align  4
.L100:
        ret
        .text
        .align  16
        .globl  camlTest__sum_nums3_61
camlTest__sum_nums3_61:
.L102:
        movq    %rax, %rbx
        movq    $1, %rax
        jmp     camlTest__work_58
        .text
        .align  16
        .globl  camlTest__entry

So it branches out of the loop when todo=0 and does one branch per loop.

Both of Thomas' implementations give:

camlTest__work_60:
.L101:
        cmpq    $1, %rbx
        jne     .L100
        ret
        .align  4
.L100:
        movq    %rbx, %rdi
        addq    $-2, %rdi
        leaq    -1(%rax, %rbx), %rax
        movq    %rdi, %rbx
        jmp     .L101
        .text
        .align  16
        .globl  camlTest__sum_nums_58
camlTest__sum_nums_58:
.L102:
        movq    %rax, %rbx
        leaq    camlTest__2(%rip), %rax
        movq    $1, %rax
        jmp     camlTest__work_60
        .text
        .align  16
        .globl  camlTest__entry

which branches within the loop if todo<>0 and then back to the start of the 
loop. So this branches twice per loop.

PS: This has nothing to do with consing or continuations.
-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
The OCaml Journal
http://www.ffconsultancy.com/products/ocaml_journal/?e