Mailing list for all users of the OCaml language and system.
From: Fabrice Le Fessant <fabrice@lefessant.net>
To: Alexey Rodriguez <mrchebas@gmail.com>
Cc: OCaml List <caml-list@inria.fr>
Subject: Re: [Caml-list] Re: Exception backtraces and stack overflows
Date: Tue, 17 Jul 2012 00:09:11 +0200
Message-ID: <CAHvkLrPRBORG6PFuCPwdMiSufjERuG_eG5Zp3-peJ6MXv5O3+w@mail.gmail.com>
In-Reply-To: <CACfYrzGwnho0tT_6DXXzsoXQSHbnVF=PC7+UTYsHLRH5Gm=4_Q@mail.gmail.com>

The problem with backtraces on SIGSEGV is that the stack trace has to
start from the last stack pointer the runtime can rely on, i.e. one with
a corresponding return address in the set of OCaml stack frames. The
only such pointer available in the SIGSEGV handler is stored in
caml_bottom_of_stack (with the PC in caml_last_return_address), and
these are updated only when you make a C call. Since nothing in your
program triggers a C call before the recursive call, this
pointer is never updated during the recursion, and the backtrace only
contains what was on the stack at the last C call.

In your program, you can check this by adding "let z = [x] in"
before the recursive call in "my_map". This allocates something, so
the GC will be triggered at some point and you will get the
full backtrace... at least from the point where the GC was last called
before the stack overflow.
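
A minimal sketch of that experiment follows. The original test program
is not part of this message, so the shape of "my_map" here is an
assumption, and "run" is a hypothetical driver added only for
illustration. Whether the extra allocation survives compilation may
depend on the compiler version and optimization settings.

```ocaml
(* Assumed shape of my_map: a non-tail-recursive map, so each list
   element consumes one stack frame. *)
let rec my_map f = function
  | [] -> []
  | x :: xs ->
    (* The allocation suggested in the message. It is otherwise unused;
       its only purpose is to give the GC a chance to run. *)
    let z = [x] in
    ignore z;
    let y = f x in
    y :: my_map f xs

(* Hypothetical driver: maps over a list of n integers with backtrace
   recording enabled and prints the backtrace if the stack overflows.
   Returns the length of the result, or -1 after a Stack_overflow. *)
let run n =
  Printexc.record_backtrace true;
  let input = Array.to_list (Array.init n (fun i -> i)) in
  try List.length (my_map succ input)
  with Stack_overflow ->
    Printexc.print_backtrace stderr;
    -1
```

Calling "run" with a size large enough to exhaust the stack, once with
and once without the "let z = [x] in" line, should make the difference
between the two backtraces visible.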

Another funny example is to replace the test in "inc" with:
  if n mod 100000 = 0 then print_char 'x';

Then, whatever you do, the backtrace will be restricted to:
Raised at file "pervasives.ml", line 363, characters 19-39
In fact, the stack overflow did not happen in that function (check
using gdb, the backtrace printed by ocaml is actually completely
wrong), but in "my_map": "caml_bottom_of_stack" and
"caml_last_return_address" point to "print_char", so this location is
printed, and then the scan of the stack immediately stops when it
discovers that the stack does not correspond to that (believing that
it's probably because -g was forgotten).
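
For concreteness, the modified "inc" might look like the sketch below.
The body of the original "inc" is not shown in this message, so
everything except the "if" line is an assumption.

```ocaml
(* Assumed shape of inc: the function applied to each list element.
   print_char goes through a C call in the runtime, which is what
   updates caml_bottom_of_stack and caml_last_return_address. *)
let inc n =
  if n mod 100000 = 0 then print_char 'x';
  n + 1
```

Because the last C call before the overflow then comes from inside
"print_char", that is the only location the handler can report.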

Maybe this behavior could be improved, at the cost of a more expensive
scan of the stack (as done in bytecode), done only in the case of a
stack overflow.
-Fabrice


On Mon, Jul 16, 2012 at 5:06 PM, Alexey Rodriguez <mrchebas@gmail.com> wrote:
> Hi again,
>
> A colleague suggested doing the following experiment: call List.map on a
> large list and throw an exception from deep down in the call chain.
>
> Now the backtrace I get contains 1022 entries for map, an entry for the
> raise site and some other entry. This matches the 1024 limit of
> BACKTRACE_BUFFER_SIZE. Since the limit has been reached, the backtrace is
> useless to diagnose the stack overflow. This matches my understanding of
> caml_stash_backtrace: all stack frames are inspected and reported as long as
> there is space in the trace buffer.
>
> So it seems there is something funny happening when a stack overflow is
> detected in the SIGSEGV handler:  there are only 3 trace entries whereas the
> stack contains over a hundred thousand frames. Is this intended behavior?
>
> If it is of any help I am including the test program. I am using Ocaml
> 3.12.0 on a x86-64 platform.
>
> Cheers,
>
> Alexey
>
> On Mon, Jul 16, 2012 at 3:51 PM, Alexey Rodriguez <mrchebas@gmail.com>
> wrote:
>>
>> Hi,
>>
>> I am having trouble understanding exception backtraces for stack
>> overflows.
>>
>> Sometimes the backtrace only contains entries for the function that filled
>> the stack with frames (you would see many backtrace entries pointing to
>> List.map if you were trying to map a very long list). Such traces are
>> useless to fix the stack overflow since you cannot use them to find the code
>> path that leads to List.map.
>>
>> In other situations, the backtrace contains the full path from the Ocaml
>> entry point to the recursive function that is blowing up the stack. In
>> these situations the backtrace appears to have "compressed" the hundreds of
>> thousands of frames that the recursive calls generated since there is only
>> one entry for List.map.
>>
>> Is there documentation that explains when you get one backtrace or the
>> other? I tried to understand the source code of caml_stash_backtrace and
>> there it seems that all the stack frames are captured (if the backtrace
>> buffer size allows). Casual inspection with gdb shows that
>> caml_stash_backtrace does not get the full stack at the moment of the fault.
>> Maybe the signal handler is skipping over the hundreds of thousands of
>> frames somehow? If someone can elucidate this mystery for me I'll be very
>> grateful!
>>
>> I can provide more details if needed, but probably someone on the list can
>> already help with this short description.
>>
>> Oh, one more question on backtraces. I see that when tracing is enabled,
>> caml_stash_backtrace is called whenever an exception is thrown. This might
>> be expensive as Not_found is raised by many functions in the standard
>> library. Is there a high overhead in leaving tracing enabled? This is useful
>> in production systems as very often it is not possible to have the original
>> inputs to trigger the bug in a debug build.
>>
>> Thanks!
>>
>> Alexey
>
>



-- 
Fabrice LE FESSANT
