From: Brian Hurt <bhurt@spnz.org>
To: John J Lee <jjl@pobox.com>
Cc: caml-list@inria.fr
Subject: Re: [Caml-list] Executable size?
Date: Wed, 12 Nov 2003 14:40:56 -0600 (CST)
Message-ID: <Pine.LNX.4.44.0311121406060.5009-100000@localhost.localdomain>
In-Reply-To: <Pine.LNX.4.58.0311121837540.2472@alice>
On Wed, 12 Nov 2003, John J Lee wrote:
> On Wed, 12 Nov 2003, Brian Hurt wrote:
>
> > On Wed, 12 Nov 2003, Richard Jones wrote:
> [...]
> > > This is not a criticism of OCaml, but the executables do tend to be
> > > quite large. This seems mainly down to the fact that OCaml links the
> > > runtime library in statically. There was previous discussion on this
> [...]
> > This isn't as bad as it sounds. A simplistic "hello world!" application
> > in Ocaml weighs in at 112K, versus 11K for the equivalent (dynamically
> > linked) C program- almost entirely either statically linked standard
> > libraries and infrastructure (garbage collections, etc.)- stuff that
> > doesn't expand with larger programs.
>
> OK. Is that 100K difference for "hello world" (which doesn't necessarily
> stay the same for larger programs, as you say below) simply a result of
> the fact that C has the "unfair" advantage of already having its runtime
> sitting on everyone's hard drive already?
Actually, I think Ocaml uses C's runtime libraries and builds on top of
them. For example, if I understand things correctly, Ocaml's printf is a
wrapper which calls C's printf. That is why I haven't bothered comparing
Ocaml's size against statically linked C programs. Ocaml is at least
nice enough to link only the libraries you are actually using (see the
print_string v. printf results).
In addition to a more complicated and complete standard library and
builtins, Ocaml also has garbage collection, which is non-trivial to
implement. I wouldn't be surprised if half or more of that 100K of
overhead is just the GC. Currying, exceptions, etc. also have small size
penalties.
On the other hand, I would argue that these features, while bloating
small programs, can actually shrink larger applications. That is exactly
the sort of thing small "benchmark" programs don't show. I don't know how
many times I've read or written C code like:
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

int copy_file(char * src, char * dst) {
    char * buf;
    FILE * inf;
    FILE * outf;
    int err;
    if ((src == NULL) || (dst == NULL)) {
        return EINVAL;
    }
    inf = fopen(src, "rb");
    if (inf == NULL) {
        return errno;
    }
    outf = fopen(dst, "wb");
    if (outf == NULL) {
        err = errno;    /* save errno first: fclose may overwrite it */
        fclose(inf);
        return err;
    }
    buf = (char *) malloc(4096);
    if (buf == NULL) {
        fclose(outf);
        fclose(inf);
        return ENOMEM;  /* ISO C does not require malloc to set errno */
    }
blah blah blah you get the idea
Versus the same code in Ocaml:
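let copyfile src dst =
  let inf = open_in_bin src in
  let outf = open_out_bin dst in
  (* Bytes.create: input needs a mutable buffer, and strings are
     immutable in current OCaml *)
  let buf = Bytes.create 4096 in
  let rec loop () =
    let c = input inf buf 0 4096 in
    if c > 0 then begin
      output outf buf 0 c;
      loop ()
    end
  in
  loop ();
  (* close_out also flushes whatever is still buffered *)
  close_out outf;
  close_in inf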
The Ocaml executable code for the copyfile function will be smaller than
the C version, simply because the Ocaml version takes advantage of various
features of the larger Ocaml library and infrastructure- especially (in
this case) exceptions and garbage collection.
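To make the exceptions point concrete: the copyfile above leaves its
channels open if something fails partway through. Here is a minimal
sketch of how exceptions can carry the error handling that the C version
spells out by hand at every step. It assumes OCaml 4.08 or later (for
Fun.protect); copy_file and try_copy are names made up for this
illustration, not anything from the standard library.

```ocaml
(* Copy src to dst in 4096-byte chunks. Any failure raises Sys_error;
   Fun.protect makes sure both channels are closed either way. *)
let copy_file src dst =
  let inf = open_in_bin src in
  Fun.protect ~finally:(fun () -> close_in_noerr inf) (fun () ->
    let outf = open_out_bin dst in
    Fun.protect ~finally:(fun () -> close_out_noerr outf) (fun () ->
      let buf = Bytes.create 4096 in
      let rec loop () =
        let n = input inf buf 0 (Bytes.length buf) in
        if n > 0 then begin
          output outf buf 0 n;
          loop ()
        end
      in
      loop ();
      (* Close explicitly on the success path so a failed flush is
         reported instead of being swallowed by close_out_noerr. *)
      close_out outf))

(* The caller decides, in one place, what to do about errors. *)
let try_copy src dst =
  match copy_file src dst with
  | () -> Ok ()
  | exception Sys_error msg -> Error msg
```

The error checks that the C version repeats after every call collapse
into the single exception branch in try_copy, and the buffer needs no
explicit free because the garbage collector reclaims it.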
>
>
> > A naive assumption would be that an Ocaml program is about 100K or so
> > larger than the equivalent C program. Not much, considering how easy it
> > is to get executables multiple megabytes in size.
>
> [...]
> > Ocaml gets a lot more code reuse, and thus can actually lead to smaller
> > executables.
>
> I don't understand what you mean by that (probably my fault). What do you
> mean by "code reuse" here? I usually understand that phrase to mean using
> code written by people other than me, but you seem to mean it in a
> different sense.
>
I was using it in the most literal sense- using code more than once, in
more than one way. In general, it's much better to have only one copy of
a function, used in two places, than two copies of the function. The
trick is that generally the two copies are not exactly identical- if
the functions are, for example, the length of a linked list, one function
might operate on a linked list of integers, another a linked list of
floats. Ocaml encourages you to program in a generic way- you actually
have to work at it to write a linked list length routine that *isn't*
generic. The naive implementation is generic, and so is the optimized
version.
Again, this generally isn't a problem in small programs, which easily fit
into your brain as a whole. Code reuse becomes more of a trick on moderate
to large programs, especially moderate to large programs with more than
one programmer. How many times have we reimplemented linked lists in C?
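As a rough sketch of the list-length example, here are a naive and a
tail-recursive length routine. The names length_tr and go are just
illustrative, and the standard library already provides List.length.

```ocaml
(* The naive length function. Nothing in it inspects the elements,
   so the compiler infers the fully generic type 'a list -> int. *)
let rec length = function
  | [] -> 0
  | _ :: rest -> 1 + length rest

(* The tail-recursive ("optimized") version is just as generic. *)
let length_tr l =
  let rec go acc = function
    | [] -> acc
    | _ :: rest -> go (acc + 1) rest
  in
  go 0 l

let () =
  (* One compiled copy of each function serves every element type. *)
  Printf.printf "%d %d %d\n"
    (length [1; 2; 3])
    (length [1.0; 2.5])
    (length_tr ["a"; "b"; "c"; "d"])
  (* prints: 3 2 4 *)
```

Getting a non-generic version takes extra work, such as annotating the
argument as (l : int list).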
>
> > Unless you have special constraints, the difference between C program
> > sizes and Ocaml program sizes are not enough to be worth worrying about.
>
> I don't really agree that the problem of distributing simple (few lines of
> code) applications in small executables is all that "special". Certainly
> there are *many* applications where you don't need that; equally, there
> are quite a few where you do need/want that.
I was thinking of special cases where a difference of 100K or 1M or so
is the difference between working and not working. If you are, for
example, trying to fit your program on a 512K ROM, Ocaml's overhead might
be a problem.
--
"Usenet is like a herd of performing elephants with diarrhea -- massive,
difficult to redirect, awe-inspiring, entertaining, and a source of
mind-boggling amounts of excrement when you least expect it."
- Gene Spafford
Brian