Mailing list for all users of the OCaml language and system.
* Re: automatic construction of mli files
@ 2000-07-27 19:04 Damien Doligez
  0 siblings, 0 replies; 16+ messages in thread
From: Damien Doligez @ 2000-07-27 19:04 UTC (permalink / raw)
  To: frouaix; +Cc: caml-list

>From: Francois Rouaix <frouaix@liquidmarket.com>

>This has to be one of the most cryptic comments ever made to this list.
>And a "rather complex issue" coming from Damien, the mind boggles, especially
>on this mysterious 8% figure.

>Care to give some details ?


OK.  It has to do with examining the roots at the beginning of each
minor collection.  The global variables are roots, but each global
variable is assigned only once, so we only need to examine it once
(after that, the value will be in the major heap, so it is not a root
for the minor collector).

We do it by remembering which modules have executed some
initialisation code since the last collection, and only examining
their globals.

The 8% figure comes from the speedup on Coq that we got when we
implemented the trick.
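
As a rough illustration of the bookkeeping described above, here is a toy
model in C. It is not the OCaml runtime's code: the struct, the flag and
the function names are all hypothetical, and the globals are plain ints
rather than heap pointers. It only shows the idea that a module whose
initialisation code has run since the last minor collection has its
globals examined once and is skipped afterwards.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only, NOT the OCaml runtime. All names are hypothetical. */

#define MAX_GLOBALS 4

struct module {
    const char *name;
    int globals[MAX_GLOBALS];   /* stand-ins for global roots */
    size_t nglobals;            /* how many entries are in use */
    int initialised_since_gc;   /* set when initialisation code has run */
};

/* Run a module's initialisation code: each global is assigned once. */
static void run_init(struct module *m)
{
    assert(m != NULL && m->nglobals <= MAX_GLOBALS);
    for (size_t i = 0; i < m->nglobals; i++)
        m->globals[i] = 1;
    m->initialised_since_gc = 1;
}

/* Minor collection: examine only the modules flagged since the last
   collection, then clear the flag. Returns the number of global roots
   that were examined. */
static size_t minor_collection(struct module *mods, size_t n)
{
    size_t scanned = 0;
    for (size_t i = 0; i < n; i++) {
        if (!mods[i].initialised_since_gc)
            continue;           /* already examined: not a root any more */
        scanned += mods[i].nglobals;
        mods[i].initialised_since_gc = 0;
    }
    return scanned;
}
```

With two modules of 3 and 2 globals, initialising the first and then
collecting twice examines 3 roots and then none; only a newly
initialised module adds work to the next collection.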


Actually, now that you force me to remember the complex part, I want
to take back my comment.  It's the fact that only the exported symbols
are roots, so using .mli files will speed up the first garbage
collections, and only by a small amount.  Using an empty .mli
file for your main module (the one that's linked last) will speed up
all garbage collections (because the initialisation of that module is
only complete when the program stops running), again by a very small
amount.

I have to apologize for not checking my facts before I posted to the
list.


Oh, and this applies only to programs compiled with ocamlopt.

-- Damien



* Re: automatic construction of mli files
@ 2000-07-26 12:58 Damien Doligez
  2000-07-27 17:46 ` Francois Rouaix
  0 siblings, 1 reply; 16+ messages in thread
From: Damien Doligez @ 2000-07-26 12:58 UTC (permalink / raw)
  To: caml-list

>From: Jean-Christophe Filliatre <filliatr@csl.sri.com>

>In the extreme situation where there is no real need for writing an
>interface, you can either simply not write one (this is not mandatory)
>or generate it from the code with "ocamlc -c -i".

There are two technical details you should all know concerning .mli
files:

1.  If you don't use .mli files, or if you generate them automatically
    from the corresponding .ml files, then you lose separate
    compilation: whenever you change a semicolon in foo.ml, all
    the files that depend on module Foo will have to be recompiled.
    This may or may not be a big problem depending on the size of your
    project.

2.  Due to rather complex implementation issues, if you don't use .mli
    files and let the compiler generate the .cmi from the .ml, then
    garbage collection will be slightly slower.  If you do it for all
    your files, you might lose as much as 8% on the speed of your
    program.

-- Damien



* automatic construction of mli files
@ 2000-07-24  5:34 Julian Assange
  2000-07-24 20:48 ` Olivier Andrieu
                   ` (6 more replies)
  0 siblings, 7 replies; 16+ messages in thread
From: Julian Assange @ 2000-07-24  5:34 UTC (permalink / raw)
  To: caml-list; +Cc: proff


.mli files are highly redundant. Almost without exception, all, or at
least the vast majority, of .mli information can be generated from the
underlying .ml implementation. We have programming languages to reduce
redundancy, not increase it. Keeping mli and ml files in sync is not
only a waste of time but error-prone and, from my survey, often not
performed correctly, particularly where consistency is not enforced by
the compiler (e.g. comments describing functions and types). While
exactly the same problem exists in a number of other
separate-compilation language implementations, we, as camlers, should
strive for something better.

The mli case parallels the hideous task of maintaining C extern
definitions in .h files (the C++ case is usually even worse). This is
always the first thing to go in any C project I work on. Instead I use
a small cpp / sed script to automagically generate this information
from the underlying C implementation file.  This is quite simple to
use and merely involves placing the token "EXPORT" before the
variable/function definition. There is some minor added complexity to
support the full range of C compile-time variable
instantiation. Appended to this email is the part of my C style-guide
that describes this approach. Having greater control over the language
and compiler proper we should be able to do better, but the general
approach seems sound and applicable to ocaml.

GENEXTERN EXPORT MACROS
-----------------------

Redundant code is bad code. Unprototyped code is bad code. Prototypes
and externs are redundant by their very nature, and it's depressing
that people put up with the soul-destroying action of manually
creating and updating (an exceptionally tedious and error-prone task) the
great swag of prototypes and externs that a C program of any size
needs for its various bits to communicate with each other. God didn't
give you a computer in order to further the evils of redundant
behaviour, but to eliminate it.

Examine the following conventional situation, where we have two C
files, each of which has variables and functions that the other calls --
I don't recommend this way of passing information about, but we need it
for the example :)

	== frazer.c ==

	bool CIA_support = TRUE;

	static int campaign_fund;
	static int frazer_dollars;
	static char *frazer_mental_state = "hopeful";

	void
	frazer(void)
	{
		frazer_dollars -= kerr(frazer_dollars);
		campaign_fund -= frazer_dollars/2;
		if (dismiss_government &&
		    strcasecmp(dismiss_action, "care-taker"))
			frazer_mental_state = "hot doggarty dog";
	}

	== kerr.c ==

	bool dismiss_government;
	char *dismiss_action;

	#ifndef HAVE_STRCASECMP
	int strcasecmp (char *s, char *s2)
	{
		do
		{
			char c1=tolower(*s);
			char c2=tolower(*s2);
			if (c1>c2)
				return 1;
			if (c1<c2)
				return -1;
		} while (*s++ && *s2++);
		return 0;
	}
	#endif

	int
	kerr(int offer)
	{
		if (offer>KER_MIN_ACTION)
		{
			dismiss_government = TRUE;
			if (offer>KER_MIN_ACTION * 2)
				dismiss_action = "care-taker";
			else
				dismiss_action = "dissolution";
			return (CIA_support? offer/2: offer);
		}
		return offer/8;
	}

Now, let's look at the prototypes we will need to support these
shenanigans:

	== frazer.h ==
	extern bool CIA_support;
	void frazer(void);

	== kerr.h ==
	extern bool dismiss_government;
	extern char *dismiss_action;
	#ifndef HAVE_STRCASECMP
	int strcasecmp (char *s, char *s2);
	#endif
	int kerr(int offer);

In the marutukku build system this becomes:


	== frazer.c ==

	EXPORT bool CIA_support = TRUE;

	static int campaign_fund;
	static int frazer_dollars;
	static char *frazer_mental_state = "hopeful";

	EXPORT void frazer(void)
	{
		frazer_dollars -= kerr(frazer_dollars);
		campaign_fund -= frazer_dollars/2;
		if (dismiss_government &&
		    strcasecmp(dismiss_action, "care-taker"))
			frazer_mental_state = "hot doggarty dog";
	}

	== kerr.c ==

	EXPORT bool dismiss_government;
	EXPORT char *dismiss_action;

	#ifndef HAVE_STRCASECMP
	EXPORT int strcasecmp (char *s, char *s2)
	{
		do
		{
			char c1=tolower(*s);
			char c2=tolower(*s2);
			if (c1>c2)
				return 1;
			if (c1<c2)
				return -1;
		} while (*s++ && *s2++);
		return 0;
	}
	#endif

	EXPORT int kerr(int offer)
	{
		if (offer>KER_MIN_ACTION)
		{
			dismiss_government = TRUE;
			if (offer>KER_MIN_ACTION * 2)
				dismiss_action = "care-taker";
			else
				dismiss_action = "dissolution";
			return (CIA_support? offer/2: offer);
		}
		return offer/8;
	}

EXPORT is merely the token genextern.sh uses for parsing cues,
although it's nice to see at a glance what is being referenced (or at
least, is meant to be referenced) from other .c files. Everything not
EXPORT'ed should be static. Can you see why?
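
A minimal sketch of how the token can coexist with the compiler,
assuming EXPORT is simply defined as an empty macro (the definition is
not shown here, so this is a guess), and with a made-up value for
KER_MIN_ACTION and a made-up static counter. One reading of the question
above: anything without EXPORT gets no entry in the generated externs,
so giving it internal linkage keeps other files from reaching it by
accident and avoids name clashes at link time.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: EXPORT expands to nothing, so the compiler never sees it.
   Only the extraction script cares about the token. */
#define EXPORT

#define KER_MIN_ACTION 100      /* hypothetical threshold for this sketch */

EXPORT bool dismiss_government; /* external linkage: other .c files may use it */
static int offers_seen;         /* internal linkage: private to this file */

EXPORT int kerr(int offer)
{
    assert(offer >= 0);         /* sketch-only sanity check */
    offers_seen++;
    if (offer > KER_MIN_ACTION) {
        dismiss_government = true;
        return offer / 2;
    }
    return offer / 8;
}
```

Another file can call kerr() and read dismiss_government through the
generated declarations, while a reference to offers_seen from outside
this file fails at link time.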

Now, let's look at what has happened to frazer.h and kerr.h:

	== frazer.h ==
	#include "frazer.ext"

	== kerr.h ==
	#include "kerr.ext"

frazer.ext and kerr.ext are automatically generated by the
following rule in mk/rules.mk.in:

	%.ext : %.c %.h $(top_srcdir)/config.h $(top_srcdir)/scripts/genextern.sh
		CPP="$(CPP)"; export CPP; sh $(top_srcdir)/scripts/genextern.sh $< \
		> $@.tmp $(DEFS) $(INCLUDES) $(CPPFLAGS) $(CFLAGS) \
		&& mv -f $@.tmp $@ || rm -f $@.tmp
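
genextern.sh itself is not reproduced here. As a rough idea of the core
transformation, the sketch below rewrites a single EXPORT line into an
extern declaration. It is written in C rather than cpp / sed, the helper
name make_extern is hypothetical, and it assumes the whole declarator
sits on one line after the EXPORT token, as in the listings above. It
does not run the preprocessor, so it knows nothing about #ifndef blocks.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Turn one "EXPORT ..." definition line into an extern declaration.
   Returns 1 and fills `out` on success; returns 0 if the line is not an
   EXPORT line or the result does not fit in `out`.
   Hypothetical helper for illustration, not the real genextern.sh. */
static int make_extern(const char *line, char *out, size_t outsz)
{
    static const char token[] = "EXPORT";
    const size_t toklen = sizeof token - 1;
    size_t len;
    int n;

    assert(line != NULL && out != NULL);

    /* Skip leading whitespace, then require the token and a separator. */
    while (isspace((unsigned char)*line))
        line++;
    if (strncmp(line, token, toklen) != 0 ||
        !isspace((unsigned char)line[toklen]))
        return 0;
    line += toklen;
    while (isspace((unsigned char)*line))
        line++;

    /* Keep the declarator only: stop at an initialiser, a body or ';'. */
    len = strcspn(line, "={;");
    while (len > 0 && isspace((unsigned char)line[len - 1]))
        len--;
    if (len == 0)
        return 0;

    n = snprintf(out, outsz, "extern %.*s;", (int)len, line);
    return n >= 0 && (size_t)n < outsz;
}
```

Fed the line "EXPORT bool CIA_support = TRUE;" it produces
"extern bool CIA_support;", and a line such as
"static int campaign_fund;" is ignored.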




