From: Kuba Ober <ober.14@osu.edu>
To: caml-list@yquem.inria.fr
Subject: Re: [Caml-list] Re: Why OCaml sucks
Date: Wed, 14 May 2008 00:40:03 -0400
Message-ID: <200805140040.04101.ober.14@osu.edu>
In-Reply-To: <4829A207.2030601@fischerventure.com>
On Tuesday 13 May 2008, Robert Fischer wrote:
> > The problem, as I understand it, is in writing parsers. Your standard
> > finite-automata-based regular expression library or lexical analyzer is
> > based, at its heart, on a table lookup: you have a 2D array whose size
> > is the number of input characters times the number of states. For ASCII
> > input, the number of possible input characters is small: 256 at most.
> > 256 input characters times hundreds of states isn't that big of a table;
> > we're looking at sizes in the tens of K, easily handled even in the bad
> > old days of 64K segments. Even going to UTF-16 ups the number of input
> > characters from 256 to 65,536, and now a moderately large state machine
> > (hundreds of states) weighs in at tens of megabytes of table space.
> > And, of course, if you try to handle the entire 31-bit full Unicode
> > code point space, welcome to really large tables :-).
> >
> > The solution, I think, is to change the implementation of your finite
> > automata to use some data structure smarter than a flat 2D array, but
> > that's me.
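
To put rough numbers on the table sizes quoted above, here is a small Python sketch. The figure of 200 states and the one-byte table entry are assumptions picked for illustration (the text only says "hundreds of states"), and dense_table_bytes is a made-up helper name.

```python
def dense_table_bytes(alphabet_size, num_states, bytes_per_entry=1):
    """Size of a flat 2D transition table: one entry per (state, input symbol)."""
    return alphabet_size * num_states * bytes_per_entry


# Assumed for illustration: 200 states, so a state index still fits in one byte.
NUM_STATES = 200

sizes = {
    "8-bit": dense_table_bytes(2**8, NUM_STATES),    # byte-oriented input
    "16-bit": dense_table_bytes(2**16, NUM_STATES),  # one 16-bit code unit per symbol
    "31-bit": dense_table_bytes(2**31, NUM_STATES),  # the full 31-bit space
}

for name, size in sizes.items():
    print(f"{name:>6} alphabet: {size:,} bytes")
```

Under these assumptions the flat table goes from about 50 KiB to about 12.5 MiB to about 400 GiB, which matches the orders of magnitude described above.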
>
> Yes. It is certainly possible to write slow code to solve this problem.
By "slow code" you could have meant one of two things:
1. Table lookup replaced wholesale by compares-and-jumps. The latter benefit
from branch prediction and speculative execution, so that is not slow anymore.
2. Table "compression", where a few compares-and-jumps remove
huge "unused" swaths of the table. By "unused" I mean the entries whose only
action is to "bomb out with an internal error".
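
A minimal Python sketch of the second option, under stated assumptions: each state keeps a short sorted list of inclusive code point ranges, and anything outside those ranges goes to a single error state instead of occupying table entries. It uses a binary search over ranges as a stand-in for hand-written compares-and-jumps. RangeDFA, the identifier-like automaton, and the non-ASCII range U+00C0..U+024F are hypothetical, chosen only for illustration.

```python
from bisect import bisect_right

ERROR = -1  # every input not covered by a range lands here


class RangeDFA:
    def __init__(self, transitions, accepting):
        # transitions: {state: [(lo, hi, next_state), ...]} with inclusive,
        # non-overlapping code point ranges.
        self.ranges = {s: sorted(rs) for s, rs in transitions.items()}
        self.starts = {s: [lo for lo, _, _ in rs] for s, rs in self.ranges.items()}
        self.accepting = set(accepting)

    def step(self, state, ch):
        cp = ord(ch)
        # Find the last range whose lower bound is <= cp.
        i = bisect_right(self.starts.get(state, []), cp) - 1
        if i >= 0:
            _, hi, nxt = self.ranges[state][i]
            if cp <= hi:
                return nxt
        return ERROR  # the "unused" part of the table

    def matches(self, text, start=0):
        state = start
        for ch in text:
            state = self.step(state, ch)
            if state == ERROR:
                return False
        return state in self.accepting


# Hypothetical identifier-like automaton: a letter, then letters or digits.
letters = [(ord("A"), ord("Z")), (ord("_"), ord("_")),
           (ord("a"), ord("z")), (0x00C0, 0x024F)]
digits = [(ord("0"), ord("9"))]
dfa = RangeDFA(
    {0: [(lo, hi, 1) for lo, hi in letters],
     1: [(lo, hi, 1) for lo, hi in letters + digits]},
    accepting={1},
)

print(dfa.matches("foo_42"), dfa.matches("42foo"))
```

The two states here need nine ranges in total, whatever the size of the input alphabet, whereas a flat table would need one entry per state and code point.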
I think you're being silly. Stop it.
Cheers, Kuba