* [Caml-list] Segfault in a native code multi-threaded program @ 2001-08-16 8:08 David Mentre 2001-08-16 12:45 ` Vitaly Lugovsky [not found] ` <3B7C5A74.15141D95@maxtal.com.au> 0 siblings, 2 replies; 11+ messages in thread From: David Mentre @ 2001-08-16 8:08 UTC (permalink / raw) To: caml-list Hello dear Camlers, My multi-threaded program works well with the byte code compiler but (sometimes) produces a segfault with the native code compiler. How can I have more info to find the specific line of code that produces this segfault? I've tried to run gdb but even when the program works until its end, gdb is blocked. My environment: x86 Linux. Quoting the FAQ[1], they are several possible issues: > * when accessing out of range in a vector or string, when the compilers > does not generate bound checking (due to explicit user's request), No. Standard compilation with all checks. > * when attempting to perform an illegal floating point operation > (division by 0), on some processors running under some OSes (e.g. alpha > processor under True64 Unix), No. No floating point operations. > * when the program is looping and consumes the whole memory for the > execution stack, when the overflow check cannot detect this situation > (for instance, in case of a native code program running under > Unix), I think no. My program is running well in byte code. > * when using ``magic coercion'' from the Obj module, No. I don't use this feature. > * in case of erroneous usage (i.e. ill-typed usage) of marshalling > primitives output_value, input_value, etc, > Err, maybe. However, how to find where? And my program is working in byte code! > * when calling user's defined external functions (for instance written in > C) and when the interface is wrong (since effective types of primitives > are not the types declared to the Caml compiler). > No. No external functions. >Very often, you should use the Caml debugger to precisely find where in your >code the error is occurring, and then correct the program. I can't use the debugger as it does not work with multi-threaded programs. Best regards, david [1] http://caml.inria.fr/FAQ/FAQ_EXPERT-eng.html#bus_error -- David.Mentre@inria.fr Opinions expressed here are only mine. ------------------- Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Caml-list] Segfault in a native code multi-threaded program 2001-08-16 8:08 [Caml-list] Segfault in a native code multi-threaded program David Mentre @ 2001-08-16 12:45 ` Vitaly Lugovsky 2001-08-17 8:09 ` David Mentre [not found] ` <3B7C5A74.15141D95@maxtal.com.au> 1 sibling, 1 reply; 11+ messages in thread From: Vitaly Lugovsky @ 2001-08-16 12:45 UTC (permalink / raw) To: David Mentre; +Cc: caml-list On 16 Aug 2001, David Mentre wrote: > Hello dear Camlers, > > My multi-threaded program works well with the byte code compiler but > (sometimes) produces a segfault with the native code compiler. How can I > have more info to find the specific line of code that produces this > segfault? I've tried to run gdb but even when the program works until > its end, gdb is blocked. You can try gdb -c core <progname> after segfault even in multithreaded environment. And, sure, you can use assertions and debugging output (do not forget to flush it) to find out where the shit happens. ------------------- Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Caml-list] Segfault in a native code multi-threaded program 2001-08-16 12:45 ` Vitaly Lugovsky @ 2001-08-17 8:09 ` David Mentre 2001-08-17 16:54 ` Brian Rogoff 2001-08-18 23:15 ` John Max Skaller 0 siblings, 2 replies; 11+ messages in thread From: David Mentre @ 2001-08-17 8:09 UTC (permalink / raw) To: Vitaly Lugovsky; +Cc: caml-list Vitaly Lugovsky <vsl@ontil.ihep.su> writes: > You can try gdb -c core <progname> after segfault even in multithreaded > environment. Oh yes, I've forgotten about that. Thanks. > And, sure, you can use assertions and debugging output (do not forget to > flush it) to find out where the shit happens. True. However I wondered if they were more elaborated debugging techniques before relying on printf. My program now produces a segfault even in byte code mode. :( Probably a misuse of Marshal. I've typed all of its input/output uses but I've probably messed things between a marshal and its unmarshal counter part. <hint for next ocaml ;)> It would be very nice to be able to rely on Marshal as safely as on ocaml typing. Just to be sure that if I expect an int * string, I will effectively receive an int * string or raise an exception. It could probably be done using the same tricks as used in printf formatters. </hint for next ocaml ;)> Best regards, d. -- David.Mentre@inria.fr Opinions expressed here are only mine. ------------------- Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Caml-list] Segfault in a native code multi-threaded program 2001-08-17 8:09 ` David Mentre @ 2001-08-17 16:54 ` Brian Rogoff 2001-08-20 9:12 ` David Mentre 2001-08-20 16:54 ` Jun P. FURUSE 2001-08-18 23:15 ` John Max Skaller 1 sibling, 2 replies; 11+ messages in thread From: Brian Rogoff @ 2001-08-17 16:54 UTC (permalink / raw) To: caml-list On 17 Aug 2001, David Mentre wrote: > My program now produces a segfault even in byte code mode. :( > > Probably a misuse of Marshal. I've typed all of its input/output uses > but I've probably messed things between a marshal and its unmarshal > counter part. We use Marshal a lot too and I definitely rely on the rule of thumb "If OCaml dumps core it is a Marshal or C code issue". Where I get hit is not so much in having marshal/unmarshal get out of sync but in leaving around old versions of marshaled data files and trying to read those. The problem manifests itself quickly, but the core dump is unpleasant :-). > > <hint for next ocaml ;)> > > It would be very nice to be able to rely on Marshal as safely as on > ocaml typing. Just to be sure that if I expect an int * string, I will > effectively receive an int * string or raise an exception. It could > probably be done using the same tricks as used in printf formatters. > > </hint for next ocaml ;)> A more type safe marshaling framework is way up there on my list of desired enhancements. I think this will be part of the extensional polymorphism enhancements that you can see now in G'Caml. -- Brian ------------------- Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Caml-list] Segfault in a native code multi-threaded program 2001-08-17 16:54 ` Brian Rogoff @ 2001-08-20 9:12 ` David Mentre 2001-08-20 16:54 ` Jun P. FURUSE 1 sibling, 0 replies; 11+ messages in thread From: David Mentre @ 2001-08-20 9:12 UTC (permalink / raw) To: Brian Rogoff; +Cc: caml-list Brian Rogoff <bpr@best.com> writes: > > > <hint for next ocaml ;)> > > > It would be very nice to be able to rely on Marshal as safely as on [...] > > </hint for next ocaml ;)> > > A more type safe marshaling framework is way up there on my list of > desired enhancements. I think this will be part of the extensional > polymorphism enhancements that you can see now in G'Caml. In my opinion, they are two different kinds of marshaling that one could expect: the safe one and the correct one. Supose that I have : type t = A | B of int | C of float To be safe is just to ensure that if I marshal a B(3), then if I receive a B(2.6) it will raise an exception. In other words, the marshaled data structure should contain enough information to ensure that once unmarshaled, it won't trigger a segfault. That's all. Basically, it means to encode in some way the type constructor of sent data structure (in my example, something like (type t: first: nothing; second: int; third: float)), and to check those constructors at unmarshaling time. The correct encoding is of course to ensure that the A of my program is the same A of the .mli you are using to send to me a data structure. It is of course much more difficult. I have probably missed something on the difficulty of marshaling. I tried to look at the mailing list archive, but did not find out the relevant messages. Any pointer? Best regards, d. -- David.Mentre@inria.fr Opinions expressed here are only mine. ------------------- Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Caml-list] Segfault in a native code multi-threaded program 2001-08-17 16:54 ` Brian Rogoff 2001-08-20 9:12 ` David Mentre @ 2001-08-20 16:54 ` Jun P. FURUSE 1 sibling, 0 replies; 11+ messages in thread From: Jun P. FURUSE @ 2001-08-20 16:54 UTC (permalink / raw) To: bpr; +Cc: caml-list Hello, > > > <hint for next ocaml ;)> > > > It would be very nice to be able to rely on Marshal as safely as on > > ocaml typing. Just to be sure that if I expect an int * string, I will > > effectively receive an int * string or raise an exception. It could > > probably be done using the same tricks as used in printf formatters. > > > > </hint for next ocaml ;)> > > A more type safe marshaling framework is way up there on my list of > desired enhancements. I think this will be part of the extensional > polymorphism enhancements that you can see now in G'Caml. Yes. I did not announce this feature since it did not work in the last version. I quickly fixed it and replaced the source archive at http://pauillac.inria.fr/~furuse/generics/index.html As always, I have to warn that this is quite experimental and therefore may contain MANY BUGS... And it is based on O'Caml 2.02. This "safe value I/O" facility consists of two primitive functions, export_value and import_value. They can be replacements of the O'Caml's original value I/O functions, output_value and input_value: # export_value;; - : out_channel -> $a -> unit = <fun> # import_value;; - : in_channel -> $a = <fun> export_value primitive writes an ML value with its encoded type information to the channel. import_value read its value and type information and checks it matches with the current type context. If the type of output value is more general than the expected type, it permits the value importation. Otherwise, it raises an exception: # let oc = open_out_bin "foo.dat" in export_value oc (1,"hello");; - : unit = () It writes out the value (1,"hello") and the fingerprint of its type int * string. The import_value primitive compares this fingerprint with the expected type: # let ic = open_in_bin "foo.dat" in (import_value ic : int * string);; - : int * string = 1, "hello" # let ic = open_in_bin "foo.dat" in (import_value ic : int * float);; - : int * string = 1, "hello" Uncaught exception: Extern.Type_match_failure(...) Programs exchanging values need not to be same. Moreover, you hackers may be able to exchange values safely even between different types, if their type definition structures are isomorphic to each other except differences of their names. For example, these two type definitions have isomorphic: type 'a tree = Branch of 'a tree * 'a tree | Leaf of 'a type 'a arbre = Branche of 'a arbre * 'a arbre | Feuille of 'a The drawback is that your programs must carry the digests of data types (= fingerprints) at run time. This costs some amount of space, but it is usually small compared to the other part. Regards, ----------------------------------------------------------------------- Jun P. Furuse Jun.Furuse@inria.fr ------------------- Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Caml-list] Segfault in a native code multi-threaded program 2001-08-17 8:09 ` David Mentre 2001-08-17 16:54 ` Brian Rogoff @ 2001-08-18 23:15 ` John Max Skaller 2001-08-19 0:24 ` John Gerard Malecki 1 sibling, 1 reply; 11+ messages in thread From: John Max Skaller @ 2001-08-18 23:15 UTC (permalink / raw) To: David Mentre; +Cc: Vitaly Lugovsky, caml-list > <hint for next ocaml ;)> > > It would be very nice to be able to rely on Marshal as safely as on > ocaml typing. Just to be sure that if I expect an int * string, I will > effectively receive an int * string or raise an exception. It could > probably be done using the same tricks as used in printf formatters. > > </hint for next ocaml ;)> As I understand it, this is a non-trivial research problem. -- John (Max) Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia voice: 61-2-9660-0850 New generation programming language Felix http://felix.sourceforge.net Literate Programming tool Interscript http://Interscript.sourceforge.net ------------------- Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Caml-list] Segfault in a native code multi-threaded program 2001-08-18 23:15 ` John Max Skaller @ 2001-08-19 0:24 ` John Gerard Malecki 2001-08-19 18:18 ` [Caml-list] Ocaml ffi release 1.2.2 dsfox 0 siblings, 1 reply; 11+ messages in thread From: John Gerard Malecki @ 2001-08-19 0:24 UTC (permalink / raw) To: caml-list John Max Skaller wrote (2001-08-19T09:15:44+1000): > > > <hint for next ocaml ;)> > > > > It would be very nice to be able to rely on Marshal as safely as on > > ocaml typing. Just to be sure that if I expect an int * string, I will > > effectively receive an int * string or raise an exception. It could > > probably be done using the same tricks as used in printf formatters. > > > > </hint for next ocaml ;)> > > As I understand it, this is a non-trivial research problem. Yes it is but in the mean-time we are stuck with the problem. I'll describe the IMPERFECT solution i use. If anyone else has something better please do describe it. The idea is to use the already existing extract_crc program to get the digest of the data-structure to be written. There are obvious flaws including the fact that the value to be written must have a concrete signature and the programmer must ensure that all of the accessible signatures are included. Assuming that one didn't make any mistakes this should catch reading an "out of version" marshaled value. Here is the text from an email i wrote a while ago describing the same mechanism followed by all of the code to make a working example. Under some assumptions about the things that you are writing out extract_crc can provide some support. The flaw is that extract_crc is not "deep". If you create a single .mli file which only contains the signature of the value you are to write then you can, with some additional makefile complexity, automatically generate an extract_crc .ml from that .mli file. For example, here is the output from extract_crc of the cell.mli file let crc_unit_list = [ "Cell", "\032\036\180\144\173\052\208\140\081\102\211\172\198\229\098\218" ] Instead of simply doing an output value you could ALWAYS do an output value of crc_unit_list and then execute let safe_output_value safe oc data = output_value oc safe; output_value oc data let safe_input_value safe ic = let safe_input = input_value ic in if safe_input = safe then input_value ic else raise (Sys_error "safe_input_value") The majority of the work is in (makefile) procedures for automatically generating the crc_unit_list. One must be careful about the dependencies to make sure that the makefile can both bootstrap and always keep the crc_unit_list file up to date. At one time i thought of using the dynlink module to solve these problems but it turns out that there is no real advantage as the majority of the work is in the makefile. Here is a sample Makefile RESULT := a.out SOURCES := safety.ml test.ml all: safety.ml byte-code include OcamlMakefile EXTRACT_CRC := $(shell $(OCAMLC) -where)/extract_crc safety.ml: test.cmi $(EXTRACT_CRC) test > $@ check: all OCAMLRUNPARAM='b=1' ./$(RESULT) The file test.mli type t = (int * int) list and the file test.ml type t = (int * int) list let safe_output_value safe oc data = output_value oc safe; output_value oc data let safe_input_value safe ic = let safe_input = input_value ic in if safe_input = safe then input_value ic else raise (Sys_error "safe_input_value") let _ = let data = [ 0,0; 1,1 ] in let oc = open_out_bin "test.db" in safe_output_value Safety.crc_unit_list oc data; close_out oc let _ = let ic = open_in_bin "test.db" in let data = safe_input_value Safety.crc_unit_list ic in close_in ic; data -cheers ------------------- Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr ^ permalink raw reply [flat|nested] 11+ messages in thread
* [Caml-list] Ocaml ffi release 1.2.2 2001-08-19 0:24 ` John Gerard Malecki @ 2001-08-19 18:18 ` dsfox 0 siblings, 0 replies; 11+ messages in thread From: dsfox @ 2001-08-19 18:18 UTC (permalink / raw) To: caml-list I've released a new version of the foreign function generator. Future releases will be announced at the sourceforge project page, which can be reached through http://ocamlffi.sourceforge.net. ------------------- Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr ^ permalink raw reply [flat|nested] 11+ messages in thread
[parent not found: <3B7C5A74.15141D95@maxtal.com.au>]
[parent not found: <qtlitfm6ajh.fsf@pochi.inria.fr>]
[parent not found: <3B7F0C99.FEDEAA86@maxtal.com.au>]
[parent not found: <qtl3d6m4vkt.fsf@pochi.inria.fr>]
* Re: [Caml-list] Segfault in a native code program using pm variants [not found] ` <qtl3d6m4vkt.fsf@pochi.inria.fr> @ 2001-08-21 19:20 ` John Max Skaller 2001-08-23 8:54 ` Xavier Leroy 0 siblings, 1 reply; 11+ messages in thread From: John Max Skaller @ 2001-08-21 19:20 UTC (permalink / raw) To: David Mentre, caml-list I hope it's not catching. Just after reproducing David Mentre's segfault, now I'm getting one in my code too ;-) Data: Ocaml 3.02, Linux on PIII 550E, i86 native code compiler, compiled with pthreads enabled. I am using the bignum module, but no threads, no magic, no marshal, no special integer types, no floats, and no non-standard modules. The application is the Felix compiler. Parsing and desugaring passes have run (correctly, as far as I can tell) Noramlly, I'd suspect an infinite recursion, (the code is very recursive, and I actually _expected_ it) but it happens very quickly without disk thrashing (I'm running as root), and all attempts to trace the recursion using prints have failed (so far). I've just got this code to compile after a significant rewrite in which I also switched almost all variant usage to polymorphic variants. What I am seeing, however, is an incorrect match using polymorphic variants: here's the debugging output: typeofindex finds declaration proc print: int = "printf(\"%d\",$1);"; Found var :--->declaration proc print: int = "printf(\"%d\",$1);"; That 'found var' comes from and typeofindex (dfns:symbol_table_t) (counter:int ref) (freg:instantiation_registry_t) (exclude:int list) (excluded_dirs:int list) (index:int) : btypecode_t = ... let bt t:btypecode_t = print_endline ("TYPEOFINDEX binding type " ^ string_of_typecode t); flush stdout; let t' = bind_type dfns counter freg env (index::exclude) excluded_dirs sr t in print_endline "TYPE BOUND"; t' in ... print_endline ("typeofindex finds " ^ string_of_symdef entry id); flush stdout; match entry with .... | `SYMDEF_dcl (`DCL_var t) -> print_endline "Found var"; print_endline (":--->" ^ string_of_symdef entry id); flush stdout; bt t but the print routine looks like: and string_of_symdef entry name = match entry with ..... | `SYMDEF_dcl d -> "declaration " ^ string_of_dcl 0 name d .... and string_of_dcl level name s = let se e = string_of_expr e in match s with ... | `DCL_proc (args, code) -> spaces level ^ "proc " ^ name ^ ": " ^ (string_of_typecode (type_of_argtypes args)) ^ " = " ^ string_of_string code ^ ";" ... | `DCL_var (ty) -> spaces level ^ "var " ^ name ^ ": " ^ string_of_typecode ty ^ ";" The diagnostic listed is the last thing I see before the segfault. The next call, bt t should have executed: print_endline ("TYPEOFINDEX binding type " ^ string_of_typecode t); flush stdout; but hasn't. It would probably have crashed if it did, since the type of the variant being passed is wrong. (Did it crash forming the closure?) The routine, is, in fact, analysing the "proc print" entry, so the diagnostic output is correct, and it's the match in the 'typeofindex' routine that is failing. I've submitted a bug report, but bugs in Ocaml are rare enough for me to still believe it is possible that I'm doing something wrong -- but I have no idea what to try next. The source is too large for a bug report, and I haven't been able to reproduce the problem with a tiny test case involving nested PM variants. It's possible that my problem is related to Davids (and has nothing to do with PM variants, which David is not using). As far as I can remember, my code built fine using 3.01, but it has been rewritten in the last few weeks: and I installed 3.02 on Aug 4. -- John (Max) Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia voice: 61-2-9660-0850 New generation programming language Felix http://felix.sourceforge.net Literate Programming tool Interscript http://Interscript.sourceforge.net ------------------- Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Caml-list] Segfault in a native code program using pm variants 2001-08-21 19:20 ` [Caml-list] Segfault in a native code program using pm variants John Max Skaller @ 2001-08-23 8:54 ` Xavier Leroy 0 siblings, 0 replies; 11+ messages in thread From: Xavier Leroy @ 2001-08-23 8:54 UTC (permalink / raw) To: John Max Skaller; +Cc: David Mentre, caml-list John, David, and everyone else on this list, Please don't post this kind of message ("my OCaml program crashed, what's happening?") to caml-list. You're just annoying 500 subscribers who can't do anything about it. And even the Caml developers who are on the list can't do anything about it either as long as you don't provide a complete program that reproduces the crash. Here's how to proceed in this case: 0- Make sure that your program doesn't use unsafe features of OCaml: unchecked array and string accesses, Obj and Marshal functions, external C functions. If it does, look hard at those parts of your code and convince yourself that the bug is not there. 1- Package a program that reproduces the crash. It doesn't have to be small -- it's often hard to reduce a crash to a one-page fragment, and it's often not necessary. But make sure you include test inputs and commands that reproduce the crash. 2- Send a bug report to caml-bugs@inria.fr. If your package is large, better include a URL than attach it to your bug report, as our bug tracking system truncates large messages and doesn't handle attachments gracefully. Or, e-mail caml@inria.fr with your code attached. If you follow those three easy steps, you can rest assured that the Caml developers will investigate the issue quickly, and if it turns out to be a bug in OCaml, they'll fix it. If you don't, there is essentially nothing we can do to help you... Thanks for your attention, - Xavier Leroy ------------------- Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr ^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2001-08-23 8:54 UTC | newest] Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2001-08-16 8:08 [Caml-list] Segfault in a native code multi-threaded program David Mentre 2001-08-16 12:45 ` Vitaly Lugovsky 2001-08-17 8:09 ` David Mentre 2001-08-17 16:54 ` Brian Rogoff 2001-08-20 9:12 ` David Mentre 2001-08-20 16:54 ` Jun P. FURUSE 2001-08-18 23:15 ` John Max Skaller 2001-08-19 0:24 ` John Gerard Malecki 2001-08-19 18:18 ` [Caml-list] Ocaml ffi release 1.2.2 dsfox [not found] ` <3B7C5A74.15141D95@maxtal.com.au> [not found] ` <qtlitfm6ajh.fsf@pochi.inria.fr> [not found] ` <3B7F0C99.FEDEAA86@maxtal.com.au> [not found] ` <qtl3d6m4vkt.fsf@pochi.inria.fr> 2001-08-21 19:20 ` [Caml-list] Segfault in a native code program using pm variants John Max Skaller 2001-08-23 8:54 ` Xavier Leroy
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox