From: Jeffrey Scofield <dynasticon@mac.com>
To: caml-list@inria.fr
Subject: Re: arm backend
Date: Fri, 01 May 2009 15:12:07 -0700
Message-ID: <m2eiv8tqvc.fsf@mac.com>
In-Reply-To: <aee06c9e0904301219x7c975305g9b437267ee3e844c@mail.gmail.com>
Nathaniel Gray <n8gray@gmail.com> writes:
> Speaking of which, has anybody built an ocaml cross compiler for the
> iphone that can work with native cocoa touch apps built with the
> official SDK? It's probably too late for my current project but in
> the future I'd love to use ocaml for my iPhone projects. I tried
> following the instructions here[1] with some necessary
> modifications[2] to get the assembler to work but my test app crashed
> as soon as it entered ocaml code. I don't know enough about the ARM
> platform to say why.
Yes, we have OCaml 3.10.2 cross compiling for iPhone OS 2.2.
We started from the instructions you mention:
> [1] http://web.yl.is.s.u-tokyo.ac.jp/~tosh/ocaml-on-iphone/
We made the same change to the .global pseudo-ops:
> [2] I had to change all '.global' to '.globl' in arm.s and
> arm/emit.mlp. I have no idea what that signifies.
(These are just variant spellings of the same pseudo-op
for declaring a global symbol. For whatever reason, the
Apple assembler seems to insist on .globl. Other
incarnations of gas seem to allow either spelling.)
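To make the spelling point concrete: arm.S is run through the C preprocessor, and the copy appended to this message handles the difference with `#define global globl` under SYS_macosx. The C sketch below mimics that rewrite, along with the G() and LBL() symbol-prefix macros from the non-ELF branch of the same file. STR, XSTR and the two function names are assumptions made up for the illustration, and the string concatenation only models what cpp would hand to the assembler.

```c
#include <stdio.h>

/* Two-level stringification so that macro arguments are expanded
   before they are quoted (illustration-only helpers). */
#define STR(x)  #x
#define XSTR(x) STR(x)

/* Copied from the non-ELF branch of the appended arm.S. */
#define CONCAT(a,b) a##b
#define G(x)   CONCAT(_,x)
#define LBL(x) CONCAT(L,x)

/* Copied from the SYS_macosx branch: cpp sees ".global" as the tokens
   "." and "global", so only the identifier after the dot is replaced. */
#define global globl

/* Models the line ".global G(caml_call_gc)" after preprocessing. */
static const char *directive_after_cpp(void)
{
    return "." XSTR(global) " " XSTR(G(caml_call_gc));
}

/* Models the local label LBL(invoke_gc) after preprocessing. */
static const char *local_label_after_cpp(void)
{
    return XSTR(LBL(invoke_gc));
}
```

Printing the two results with puts() from a small main shows the directive spelled the way the Apple assembler wants it and the global symbol carrying its leading underscore.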
There are at least two more problems, however. Presumably
these are due to differences between the iPhone ABI and the one that
the existing ARM port (the old one, I guess you could say) targets.
1. arm.S uses r10 as a scratch register, but it is not a scratch
register on iPhone. It has to be saved/restored when passing
between OCaml and the native iPhone code (I think of it as
ObjC code). Note, by the way, that gdb shows r10 by the
alternate name of sl. This is confusing at first.
2. arm.S assumes r9 can be used as a general purpose register,
but it is used on the iPhone to hold a global thread context.
Again, it has to be saved/restored (or at least that's what we
decided to do).
We saw crashes caused by both of these problems.
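As a toy model of these two problems (plain C, not ARM code, and not a claim about what the unpatched runtime does instruction by instruction), the sketch below walks one trip from native code into OCaml, out to a native function, and back. regs_t, enter_caml, native_callee, r9_seen_by_native and the constant values are all hypothetical names and numbers chosen for the illustration; touch_threadctx stands in for the caml_touch_threadctx word in the appended arm.S.

```c
#include <stdint.h>

/* Toy register file holding only the two registers discussed above. */
typedef struct {
    uint32_t r9;   /* iPhone: global thread context; arm.S: alloc_limit */
    uint32_t r10;  /* iPhone: must be preserved (gdb shows it as sl);
                      arm.S: scratch register */
} regs_t;

/* Stands in for the caml_touch_threadctx word in arm.S. */
static uint32_t touch_threadctx;

/* Hypothetical probe: what the native callee found in r9. */
static uint32_t r9_seen_by_native;

/* Stand-in for C/ObjC code called from OCaml. */
static void native_callee(const regs_t *r)
{
    r9_seen_by_native = r->r9;
}

/* One trip: native -> OCaml -> native call -> back to native.
   patched == 0 is a hypothetical runtime that takes no care of r9/r10. */
static void enter_caml(regs_t *r, int patched)
{
    regs_t saved = *r;              /* models pushing r9 and r10 on entry */
    if (patched)
        touch_threadctx = r->r9;    /* stash the thread context */

    r->r9  = 0x0A110C;              /* OCaml code uses r9 as alloc_limit */
    r->r10 = 0x5C7A7C;              /* ...and r10 as a temporary */

    if (patched)
        r->r9 = touch_threadctx;    /* put the context back before calling out */
    native_callee(r);

    if (patched)
        *r = saved;                 /* models popping r9 and r10 on exit */
}
```

With patched set to 1 the native callee sees the original r9 and both registers come back unchanged; with 0 the callee sees the allocation-limit value and r10 returns clobbered, which is the shape of the two crashes described above.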
I'm appending a new version of arm.S that works for us with
one OCaml thread. (Multiple threads will almost certainly
require more careful handling of r9.) It has the patches
from Toshiyuki Maeda mentioned above and a few of our
own to fix these two problems.
We have an application that has been working well for
a couple of months, so there's some evidence that these
changes are sufficient.
We also made a small fix to the ARM code generator
(beyond the patches from Toshiyuki Maeda). In essence,
it fixes the handling of unboxed floating-point return
values of external functions. Things mostly work without
this change; I'll save a description for a later post (if
anybody is interested).
Regards,
Jeff Scofield
Seattle
---------- 8< ---- cut here for arm.S ----- >8 -----
/***********************************************************************/
/* */
/* Objective Caml */
/* */
/* Xavier Leroy, projet Cristal, INRIA Rocquencourt */
/* */
/* Copyright 1998 Institut National de Recherche en Informatique et */
/* en Automatique. All rights reserved. This file is distributed */
/* under the terms of the GNU Library General Public License, with */
/* the special exception on linking described in file ../LICENSE. */
/* */
/***********************************************************************/
/* $Id: arm.S,v 1.15.18.1 2008/02/20 12:25:17 xleroy Exp $ */
/* Linux/BSD with ELF binaries and Solaris do not prefix identifiers with _.
Linux/BSD with a.out binaries and NextStep do.
Copied from asmrun/i386.S */
#if defined(SYS_solaris)
#define CONCAT(a,b) a/**/b
#else
#define CONCAT(a,b) a##b
#endif
#if defined(SYS_linux_elf) || defined(SYS_bsd_elf) \
|| defined(SYS_solaris) || defined(SYS_beos) || defined(SYS_gnu)
#define G(x) x
#define LBL(x) CONCAT(.L,x)
#else
#define G(x) CONCAT(_,x)
#define LBL(x) CONCAT(L,x)
#endif
#if defined(SYS_macosx)
#define global globl
#endif
/* Asm part of the runtime system, ARM processor */
#define trap_ptr r11
#define alloc_ptr r8
#define alloc_limit r9
#define sp r13
#define lr r14
#define pc r15
.text
/* Allocation functions and GC interface */
.global G(caml_call_gc)
G(caml_call_gc):
/* Record return address */
/* We can use r10 as a temp reg since it's not live here */
ldr r10, LBL(caml_last_return_address)
str lr, [r10, #0]
/* Branch to shared GC code */
bl LBL(invoke_gc)
/* Restart allocation sequence (4 instructions before) */
sub lr, lr, #16
mov pc, lr
.global G(caml_alloc1)
G(caml_alloc1):
ldr r10, [alloc_limit, #0]
sub alloc_ptr, alloc_ptr, #8
cmp alloc_ptr, r10
movcs pc, lr /* Return if alloc_ptr >= alloc_limit */
/* Record return address */
ldr r10, LBL(caml_last_return_address)
str lr, [r10, #0]
/* Invoke GC */
bl LBL(invoke_gc)
/* Try again */
b G(caml_alloc1)
.global G(caml_alloc2)
G(caml_alloc2):
ldr r10, [alloc_limit, #0]
sub alloc_ptr, alloc_ptr, #12
cmp alloc_ptr, r10
movcs pc, lr /* Return if alloc_ptr >= alloc_limit */
/* Record return address */
ldr r10, LBL(caml_last_return_address)
str lr, [r10, #0]
/* Invoke GC */
bl LBL(invoke_gc)
/* Try again */
b G(caml_alloc2)
.global G(caml_alloc3)
G(caml_alloc3):
ldr r10, [alloc_limit, #0]
sub alloc_ptr, alloc_ptr, #16
cmp alloc_ptr, r10
movcs pc, lr /* Return if alloc_ptr >= alloc_limit */
/* Record return address */
ldr r10, LBL(caml_last_return_address)
str lr, [r10, #0]
/* Invoke GC */
bl LBL(invoke_gc)
/* Try again */
b G(caml_alloc3)
.global G(caml_allocN)
G(caml_allocN):
str r12, [sp, #-4]!
ldr r12, [alloc_limit, #0]
sub alloc_ptr, alloc_ptr, r10
cmp alloc_ptr, r12
ldr r12, [sp], #4
movcs pc, lr /* Return if alloc_ptr >= alloc_limit */
/* Record return address and desired size */
ldr alloc_limit, LBL(caml_last_return_address)
str lr, [alloc_limit, #0]
ldr alloc_limit, LBL(Lcaml_requested_size)
str r10, [alloc_limit, #0]
/* Invoke GC */
bl LBL(invoke_gc)
/* Try again */
ldr r10, LBL(Lcaml_requested_size)
ldr r10, [r10, #0]
b G(caml_allocN)
/* Shared code to invoke the GC */
LBL(invoke_gc):
/* Record lowest stack address */
ldr r10, LBL(caml_bottom_of_stack)
str sp, [r10, #0]
/* Save integer registers and return address on stack */
stmfd sp!, {r0,r1,r2,r3,r4,r5,r6,r7,r10,r12,lr}
/* Store pointer to saved integer registers in caml_gc_regs */
ldr r10, LBL(caml_gc_regs)
str sp, [r10, #0]
/* Save non-callee-save float registers */
sub sp, sp, #64
fstd d0, [sp, #56]
fstd d1, [sp, #48]
fstd d2, [sp, #40]
fstd d3, [sp, #32]
fstd d4, [sp, #24]
fstd d5, [sp, #16]
fstd d6, [sp, #8]
fstd d7, [sp, #0]
/* Save current allocation pointer for debugging purposes */
ldr r10, LBL(caml_young_ptr)
str alloc_ptr, [r10, #0]
/* Save trap pointer in case an exception is raised during GC */
ldr r10, LBL(caml_exception_pointer)
str trap_ptr, [r10, #0]
/* Restore r9 for iPhoneOS */
ldr r9, LBL(Lcaml_touch_threadctx) /* iPhone */
ldr r9, [r9, #0] /* iPhone */
/* Call the garbage collector */
bl G(caml_garbage_collection)
/* Restore the registers from the stack */
fldd d7, [sp, #0]
fldd d6, [sp, #8]
fldd d5, [sp, #16]
fldd d4, [sp, #24]
fldd d3, [sp, #32]
fldd d2, [sp, #40]
fldd d1, [sp, #48]
fldd d0, [sp, #56]
add sp, sp, #64
ldmfd sp!, {r0,r1,r2,r3,r4,r5,r6,r7,r10,r12}
/* Reload return address */
ldr r10, LBL(caml_last_return_address)
ldr lr, [r10, #0]
/* Say that we are back into Caml code */
mov alloc_ptr, #0
str alloc_ptr, [r10, #0]
/* Reload new allocation pointer and allocation limit */
ldr r10, LBL(caml_young_ptr)
ldr alloc_ptr, [r10, #0]
ldr alloc_limit, LBL(caml_young_limit)
/* Return to caller */
ldmfd sp!, {pc}
/* Call a C function from Caml */
/* Function to call is in r10 */
.global G(caml_c_call)
G(caml_c_call):
/* Preserve return address in callee-save register r4 */
mov r4, lr
/* Record lowest stack address and return address */
ldr r5, LBL(caml_last_return_address)
ldr r6, LBL(caml_bottom_of_stack)
str lr, [r5, #0]
str sp, [r6, #0]
/* Make the exception handler and alloc ptr available to the C code */
ldr r6, LBL(caml_young_ptr)
ldr r7, LBL(caml_exception_pointer)
str alloc_ptr, [r6, #0]
str trap_ptr, [r7, #0]
ldr r9, LBL(Lcaml_touch_threadctx) /* iPhone */
ldr r9, [r9, #0] /* iPhone */
/* Call the function */
mov lr, pc
mov pc, r10
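/* Reload alloc ptr */
ldr alloc_ptr, [r6, #0] /* r6 still points to caml_young_ptr */
/* Reload alloc limit: r9 held the thread context during the call */
ldr alloc_limit, LBL(caml_young_limit)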
/* Say that we are back into Caml code */
mov r6, #0
str r6, [r5, #0] /* r5 still points to caml_last_return_address */
/* Return */
mov pc, r4
/* Start the Caml program */
.global G(caml_start_program)
G(caml_start_program):
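/* r10 must be preserved on iPhone: push it before using it as a temp */
stmfd sp!, {r10}
/* Remember the thread context that native code expects to find in r9 */
ldr r10, LBL(Lcaml_touch_threadctx) /* iPhone */
str r9, [r10, #0] /* iPhone */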
ldr r10, LBL(caml_program)
/* Code shared with caml_callback* */
/* Address of Caml code to call is in r10 */
/* Arguments to the Caml code are in r0...r3 */
LBL(jump_to_caml):
/* Save return address and callee-save registers */
stmfd sp!, {r4,r5,r6,r7,r8,r9,r11,lr}
sub sp, sp, #64
fstd d15, [sp, #56]
fstd d14, [sp, #48]
fstd d13, [sp, #40]
fstd d12, [sp, #32]
fstd d11, [sp, #24]
fstd d10, [sp, #16]
fstd d9, [sp, #8]
fstd d8, [sp, #0]
/* Setup a callback link on the stack */
sub sp, sp, #(4*3)
ldr r4, LBL(caml_bottom_of_stack)
ldr r4, [r4, #0]
str r4, [sp, #0]
ldr r4, LBL(caml_last_return_address)
ldr r4, [r4, #0]
str r4, [sp, #4]
ldr r4, LBL(caml_gc_regs)
ldr r4, [r4, #0]
str r4, [sp, #8]
/* Setup a trap frame to catch exceptions escaping the Caml code */
sub sp, sp, #(4*2)
ldr r4, LBL(caml_exception_pointer)
ldr r4, [r4, #0]
str r4, [sp, #0]
ldr r4, LBL(Ltrap_handler)
str r4, [sp, #4]
mov trap_ptr, sp
/* Reload allocation pointers */
ldr r4, LBL(caml_young_ptr)
ldr alloc_ptr, [r4, #0]
ldr alloc_limit, LBL(caml_young_limit)
/* We are back into Caml code */
ldr r4, LBL(caml_last_return_address)
mov r5, #0
str r5, [r4, #0]
/* Call the Caml code */
mov lr, pc
mov pc, r10
LBL(caml_retaddr):
/* Pop the trap frame, restoring caml_exception_pointer */
ldr r4, LBL(caml_exception_pointer)
ldr r5, [sp, #0]
str r5, [r4, #0]
add sp, sp, #(2 * 4)
/* Pop the callback link, restoring the global variables */
LBL(return_result):
ldr r4, LBL(caml_bottom_of_stack)
ldr r5, [sp, #0]
str r5, [r4, #0]
ldr r4, LBL(caml_last_return_address)
ldr r5, [sp, #4]
str r5, [r4, #0]
ldr r4, LBL(caml_gc_regs)
ldr r5, [sp, #8]
str r5, [r4, #0]
add sp, sp, #(4*3)
/* Update allocation pointer */
ldr r4, LBL(caml_young_ptr)
str alloc_ptr, [r4, #0]
/* Reload callee-save registers and return */
fldd d8, [sp, #0]
fldd d9, [sp, #8]
fldd d10, [sp, #16]
fldd d11, [sp, #24]
fldd d12, [sp, #32]
fldd d13, [sp, #40]
fldd d14, [sp, #48]
fldd d15, [sp, #56]
add sp, sp, #64
ldmfd sp!, {r4,r5,r6,r7,r8,r9,r11,lr}
ldmfd sp!, {r10} /* pop the r10 pushed on entry */
mov pc, lr
/* The trap handler */
LBL(trap_handler):
/* Save exception pointer */
ldr r4, LBL(caml_exception_pointer)
str trap_ptr, [r4, #0]
/* Encode exception bucket as an exception result */
orr r0, r0, #2
/* Return it */
b LBL(return_result)
/* Raise an exception from C */
.global G(caml_raise_exception)
G(caml_raise_exception):
/* Reload Caml allocation pointers */
ldr r1, LBL(caml_young_ptr)
ldr alloc_ptr, [r1, #0]
ldr alloc_limit, LBL(caml_young_limit)
/* Say we're back into Caml */
ldr r1, LBL(caml_last_return_address)
mov r2, #0
str r2, [r1, #0]
/* Cut stack at current trap handler */
ldr r1, LBL(caml_exception_pointer)
ldr sp, [r1, #0]
/* Pop previous handler and addr of trap, and jump to it */
ldmfd sp!, {trap_ptr, pc}
/* Callback from C to Caml */
.global G(caml_callback_exn)
G(caml_callback_exn):
/* Initial shuffling of arguments (r0 = closure, r1 = first arg) */
stmfd sp!, {r10}
mov r10, r0
mov r0, r1 /* r0 = first arg */
mov r1, r10 /* r1 = closure environment */
ldr r10, [r10, #0] /* code pointer */
b LBL(jump_to_caml)
.global G(caml_callback2_exn)
G(caml_callback2_exn):
/* Initial shuffling of arguments (r0 = closure, r1 = arg1, r2 = arg2) */
stmfd sp!, {r10}
mov r10, r0
mov r0, r1 /* r0 = first arg */
mov r1, r2 /* r1 = second arg */
mov r2, r10 /* r2 = closure environment */
ldr r10, LBL(caml_apply2)
b LBL(jump_to_caml)
.global G(caml_callback3_exn)
G(caml_callback3_exn):
/* Initial shuffling of arguments */
/* (r0 = closure, r1 = arg1, r2 = arg2, r3 = arg3) */
stmfd sp!, {r10}
mov r10, r0
mov r0, r1 /* r0 = first arg */
mov r1, r2 /* r1 = second arg */
mov r2, r3 /* r2 = third arg */
mov r3, r10 /* r3 = closure environment */
ldr r10, LBL(caml_apply3)
b LBL(jump_to_caml)
.global G(caml_ml_array_bound_error)
G(caml_ml_array_bound_error):
/* Load address of [caml_array_bound_error] in r10 */
ldr r10, LBL(caml_array_bound_error)
/* Call that function */
b G(caml_c_call)
/* Global references */
LBL(caml_last_return_address): .long G(caml_last_return_address)
LBL(caml_bottom_of_stack): .long G(caml_bottom_of_stack)
LBL(caml_gc_regs): .long G(caml_gc_regs)
LBL(caml_young_ptr): .long G(caml_young_ptr)
LBL(caml_young_limit): .long G(caml_young_limit)
LBL(caml_exception_pointer): .long G(caml_exception_pointer)
LBL(caml_program): .long G(caml_program)
LBL(Ltrap_handler): .long LBL(trap_handler)
LBL(caml_apply2): .long G(caml_apply2)
LBL(caml_apply3): .long G(caml_apply3)
LBL(Lcaml_requested_size): .long LBL(caml_requested_size)
LBL(caml_array_bound_error): .long G(caml_array_bound_error)
LBL(Lcaml_touch_threadctx): .long LBL(caml_touch_threadctx)
.data
LBL(caml_requested_size): .long 0
LBL(caml_touch_threadctx): .long 0
/* GC roots for callback */
.data
.global G(caml_system__frametable)
G(caml_system__frametable):
.long 1 /* one descriptor */
.long LBL(caml_retaddr) /* return address into callback */
.short -1 /* negative frame size => use callback link */
.short 0 /* no roots */
.align 2
#if defined(SYS_macosx)
.text
.global G(__stub__modsi3)
G(__stub__modsi3):
b LBL(__stub__modsi3)
.global G(__stub__divsi3)
G(__stub__divsi3):
b LBL(__stub__divsi3)
.section __TEXT,__picsymbolstub4,symbol_stubs,none,16
.align 2
LBL(__stub__modsi3):
.indirect_symbol G(__modsi3)
ldr ip, LBL(__stub__modsi3$slp)
LBL(__stub__modsi3$scv):
add ip, pc, ip
ldr pc, [ip, #0]
LBL(__stub__modsi3$slp):
.long LBL(__stub__modsi3$lazy_ptr) - (LBL(__stub__modsi3$scv) + 8)
.lazy_symbol_pointer
LBL(__stub__modsi3$lazy_ptr):
.indirect_symbol G(__modsi3)
.long dyld_stub_binding_helper
.section __TEXT,__picsymbolstub4,symbol_stubs,none,16
.align 2
LBL(__stub__divsi3):
.indirect_symbol G(__divsi3)
ldr ip, LBL(__stub__divsi3$slp)
LBL(__stub__divsi3$scv):
add ip, pc, ip
ldr pc, [ip, #0]
LBL(__stub__divsi3$slp):
.long LBL(__stub__divsi3$lazy_ptr) - (LBL(__stub__divsi3$scv) + 8)
.lazy_symbol_pointer
LBL(__stub__divsi3$lazy_ptr):
.indirect_symbol G(__divsi3)
.long dyld_stub_binding_helper
.subsections_via_symbols
#endif