From mboxrd@z Thu Jan  1 00:00:00 1970
Received: (from weis@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id JAA22797 for caml-redistribution; Wed, 31 Mar 1999 09:12:19 +0200 (MET DST)
Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id IAA32732 for <caml-list@pauillac.inria.fr>; Wed, 31 Mar 1999 08:18:42 +0200 (MET DST)
Received: from zarya.maya.com (zarya.maya.com [192.70.254.128])
	by nez-perce.inria.fr (8.8.7/8.8.7) with ESMTP id IAA15921
	for <caml-list@inria.fr>; Wed, 31 Mar 1999 08:18:40 +0200 (MET DST)
Received: (from prevost@localhost)
	by zarya.maya.com (8.8.7/8.8.7) id BAA30109;
	Wed, 31 Mar 1999 01:19:52 -0500
Sender: weis
To: caml-list@inria.fr
Subject: Strange hugeness of .o, .cmo, and .cmi files
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
From: John Prevost <prevost@maya.com>
Date: 31 Mar 1999 01:19:50 -0500
Message-ID: <ya2soami3jt.fsf@zarya.maya.com>
User-Agent: Gnus/5.07008 (Pterodactyl Gnus v0.80) XEmacs/21.0(beta67) (20 minutes to Nikko)


I've been working on a curses library written entirely in O'Caml, and
recently copied my sources over a slow ppp link to my machine at home.
Quite by chance, I also got a copy of the binaries, and noticed the
sizes of the .cmo and .o files for one module in particular.

This module is not terribly complex (at least from my understanding),
but it is rather large in another sense.  I was wondering if anybody
could explain what's happening internally and how I could be more
efficient.

type entry = { term_names : string;
               string_table : string;
               booleans : bool array;
               numbers : int array;
               strings : int array }

type terminfo = {
    auto_left_margin : bool;
    auto_right_margin : bool;
[...461 lines similar, with some : int option and some : string options...]
}

[...small amount of code for reading terminfo files into entries...]

let process_entry e =
  let get_bool = get_bool e in
  let get_number = get_number e in
  let get_string = get_string e in

  { auto_left_margin               = get_bool 0;
    auto_right_margin              = get_bool 1;
[... another 461 lines ...]
  }

[...a few more little routines...]

Obviously, there are a lot of fields in the record type, and I assume that this is where the blowup is happening, but I'm not at all sure where.

Another module, with apparently comparable amounts of code, maybe a
bit more complex is tparm.ml.  Here's a comparison:

-rw-r--r--   1 prevost  users         721 Mar 31 01:07 tparm.cmi
-rw-r--r--   1 prevost  users        6400 Mar 31 01:00 tparm.cmo
-rw-r--r--   1 prevost  users         398 Mar 31 01:07 tparm.cmx
-rw-r--r--   1 prevost  users       15784 Mar 31 01:07 tparm.o

-rw-r--r--   1 prevost  users       59432 Mar 31 01:17 terminfo.cmi
-rw-r--r--   1 prevost  users       11396 Mar 31 01:00 terminfo.cmo
-rw-r--r--   1 prevost  users        1009 Mar 31 01:07 terminfo.cmx
-rw-r--r--   1 prevost  users      259076 Mar 31 01:07 terminfo.o

(compiling with -compact or without produces the same object size)

A bigger .cmi file was expected from the huge record, and I can deal
with, since it isn't used in every executable linked against it.

I'd expected a bigger object, but not 20x as big!

So, is there anything I can do to get around this, preferably without
needing to use -compact?  Would a different way of accessing fields be
better?  What if I went back to what I had before I decided a record
would be a good idea, which was using get_string entry some_field for
access?  (for example:

let auto_left_margin e = get_bool e 1

or

let auto_left_margin = get_bool 1

with a different formulation of get_bool.)


Since everything else uses the terminfo module, and since it gets
linked into the programs whole-hog, I'd very much like to cut down on
the space use.

Thanks,
John