From mboxrd@z Thu Jan 1 00:00:00 1970 Received: (from majordomo@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id IAA32265; Mon, 21 Oct 2002 08:58:14 +0200 (MET DST) X-Authentication-Warning: pauillac.inria.fr: majordomo set sender to owner-caml-list@pauillac.inria.fr using -f Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id IAA32582 for ; Mon, 21 Oct 2002 08:58:13 +0200 (MET DST) Received: from pauillac.inria.fr (pauillac.inria.fr [128.93.11.35]) by concorde.inria.fr (8.11.1/8.11.1) with ESMTP id g9L6wC522864; Mon, 21 Oct 2002 08:58:12 +0200 (MET DST) Received: (from weis@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id IAA32153; Mon, 21 Oct 2002 08:58:11 +0200 (MET DST) From: Pierre Weis Message-Id: <200210210658.IAA32153@pauillac.inria.fr> Subject: Re: [Caml-list] debugger losing contact with debuggee process In-Reply-To: from Lex Stein at "Oct 20, 102 08:48:39 am" To: stein@eecs.harvard.edu (Lex Stein) Date: Mon, 21 Oct 2002 08:58:11 +0200 (MET DST) Cc: pierre.weis@inria.fr, caml-list@inria.fr X-Mailer: ELM [version 2.4ME+ PL28 (25)] MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: owner-caml-list@pauillac.inria.fr Precedence: bulk > > The information that ocamldebug gave you is helpful : it provides the > > mean to go back way before the bus error (Time 290000), and ensures > > that the bus error will appear before Time 300000. To go (go) just before > > the bus error and ask for a backtrace then (bt) is just a matter of > > dichotomy and is very fast. Once your very near the bus error you can > > step use instruction stepping (s) and print (p) and next event (n) to > > understand what happens. > > Yes, but then I need to start up the program again, goto time 290000 and > step through the code until I hit the bus error. Then I need to note which > time it happened at, reload the program again, goto that time, and do a > backtrace. You need not to reload the program (as mentioned in your trace the debugger automatically reconnects to the nearest check point). > The stepping through the code part is of concern to me. There are 10000 > events between time 290000 and 300000. Am I really expect to press "n" > 10000 times (this is the worst case, but the expected number is 5000, > which is still a large number for an interactive human operation)? I think > not. Dichotomy is there to prevent you for this fastidious task. > So I jump forward to 295000, see if the core dump happens between > 290000 and 295000, and repeat. Is this the suggested approach? This is a > binary search to find the location of the fault. I like O(log n) > operations, but I like O(1) operations better. Loading up a core dump file > and doing a backtrace is an O(1) operation for me. Yes, but there is no such feature in the debugger. Sorry for that. > Another concern I have with the approach of goto time 290000 is that the > fault is caused by an external event (receiving an RPC) so it will not > always be at the same time. What would I do first, the goto or trigger the > external event ? Both would seem to be problematic. Yes, our debugger is tuned towards debugging of deterministic programs. The first thing you should do to use it is to tune your program to become deterministic first. Anyhow debugging nn deterministic programs is extremely challenging and we don't know how to do it in general. (Also our debugger is not able to help you for that.) > Your help is appreciated. You're welcome. Regards, Pierre Weis INRIA, Projet Cristal, Pierre.Weis@inria.fr, http://pauillac.inria.fr/~weis/ ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners