From: rixed@happyleptic.org
To: caml-list@inria.fr
Subject: Segfault in ARM EABI for programm compiled with ocamlopt 3.12.0
Date: Wed, 24 Nov 2010 01:20:30 +0100
Message-ID: <20101124002030.GA9493@yeeloong>
For some time now I've been chasing a bug that hits a program of mine
when compiled on ARM with OCaml 3.12.0.
I initially thought my own C code was misbehaving, but the program keeps
crashing, although not as early, if I comment out all calls to the C
functions.
The segfaults happen frequently during the GC, in caml_oldify_one or
caml_oldify_mopup, but also in a few other places such as
camlList__rev_append or caml__apply2, among others. In caml_oldify_one,
for instance, the segfault always happens at the same location: the
assertion that sz is not 0 (and of course when you read the code it's
pretty clear that sz=0 corresponds to the "already forwarded" case that's
handled at the beginning of the function).
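For readers without the runtime source at hand, the check I am referring
to can be modelled roughly as below. This is a simplified sketch, not the
code from the runtime: the macro and function names are made up for
illustration, and I am assuming the usual header layout (block size in
the upper bits, then 2 colour bits and 8 tag bits).

```c
#include <stdint.h>

/* Hypothetical model of a block header: size in words in the upper
   bits, 2 colour bits, 8 tag bits (assumed layout). */
#define MAKE_HEADER(wosize, tag) (((uintptr_t)(wosize) << 10) | (uintptr_t)(tag))
#define WOSIZE_HD(hd)            ((hd) >> 10)
#define TAG_HD(hd)               ((unsigned)((hd) & 0xFF))

/* hp points at the header word; hp[1] is the first field of the block. */

/* A block that was already copied out of the minor heap has its
   header overwritten with 0. */
static int already_forwarded(const uintptr_t *hp)
{
    return hp[0] == 0;
}

/* Once forwarded, the first field holds the address of the copy. */
static uintptr_t forward_target(const uintptr_t *hp)
{
    return hp[1];
}
```

In this model a header of 0 decodes to a size of 0, which is why a zero
size is only expected on the "already forwarded" path.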
The pattern, then, is that a register (usually r0, r2 or r5) is
restored from the stack after a call to a function that might call the
GC (or after a call to the GC itself), then dereferenced. Inspecting the
stack with gdb, it's obvious that this very word was changed during the
call, so a value like 0, 3 or 1024 is read back into the register
instead of an mlvalue.
I haven't managed (yet) to reduce the program to a small test case, and
I am under the impression that all these components are required for the
bug to happen 'fast enough':
- threads
- floats
- calls to C functions (greatly reduces the time to wait before the crash)
I am also under the impression that the bug is affected by the new stack
alignment requirement. In one occurrence, calling or not calling a
function that does nothing, from within a function hit by the bug,
drastically reduced the probability of the bug, and the major difference
I saw was that in one version of the function the stack size was 16 bytes
and in the other 24 bytes (16+4, apparently for the address of a "module"
structure, aligned up to 24 bytes). I thus manually checked the
generated framesets but they were all right as far as I understand them.
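The 24 figure is consistent with rounding a 20-byte frame up to the
8-byte stack alignment that EABI asks for. A minimal sketch of that
arithmetic follows; align_up is a name made up for illustration, and I am
only assuming that the compiler rounds frame sizes this way.

```c
#include <stddef.h>

/* Round size up to the next multiple of align.
   align must be a power of two (8 for the EABI stack). */
static size_t align_up(size_t size, size_t align)
{
    return (size + align - 1) & ~(align - 1);
}
```

With this helper, align_up(16, 8) stays at 16 while align_up(16 + 4, 8)
gives 24, which matches the two frame sizes I observed.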
Now I'm a little desperate, since each recompile+test cycle takes about
20 minutes and the bug is so erratic; so if someone here is familiar with
the ARM architecture, and in particular with the difference between the
old and new ABI, please suggest what I should check, or offer any hint
whatsoever. I'd be very grateful, as this consumes a lot of my spare
time.
Also, I'm compiling ocaml with gcc 4.2.1 - do you think it may be a
problem with gcc not following the very same ABI?
Also I've run the testsuite but it did not reveal anything.