>> I have, however, another proposition: if you allow markup areas not to be
>> well nested, then you can simply have an environment recording, for each
>> style, whether it is currently in use or not.
>
> In the old version, the top-most parsing layer (generated via Menhir)
> would only see tokens such as BEGIN_BOLD/END_BOLD.  There was an
> intermediate layer between the lexer and the parser which had a simple
> state machine that translated raw BOLD tokens into the
> BEGIN_BOLD/END_BOLD tokens.
> I'm now trying to minimise the "magic" in the intermediate layer, which
> is why I wondered if there was an elegant pure Menhir solution.
>

I see. Even if you could find a way to tell Menhir what you want,
AFAICT the automaton would have to remember which tags are currently
open, so its size would grow exponentially with the number of kinds
of tags. This does not seem to be a good idea.

I think that you can either do it by hand just before parsing (as you
do) or just after parsing (as Sebastien proposed).
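
For what it's worth, here is a minimal sketch of the "by hand just before
parsing" option. It is not your actual code: the types style, raw and
token and the function translate are hypothetical names I made up for
illustration. The environment is simply the list of styles currently in
use, and each raw toggle token becomes a Begin or an End depending on
whether its style is already in that list.

```ocaml
(* Hypothetical token types; all names here are illustrative only. *)
type style = Bold | Emph

(* What the lexer would emit: a style marker does not say whether it
   opens or closes. *)
type raw = Raw_text of string | Raw_toggle of style

(* What the Menhir-generated parser would consume. *)
type token = Text of string | Begin of style | End of style

(* Walk the raw tokens while keeping an environment [active] that lists
   the styles currently in use. A toggle for an active style closes it,
   a toggle for an inactive style opens it. *)
let translate (raws : raw list) : token list =
  let step (active, acc) raw =
    match raw with
    | Raw_text s -> (active, Text s :: acc)
    | Raw_toggle st ->
      if List.mem st active
      then (List.filter (fun x -> x <> st) active, End st :: acc)
      else (st :: active, Begin st :: acc)
  in
  let (_, acc) = List.fold_left step ([], []) raws in
  List.rev acc
```

Note that this sketch does not enforce nesting: a Bold opened before an
Emph may be closed first, so any well-nestedness check would still have
to happen in the grammar or in a later pass.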

-- 
JH

> Best regards,
> Dario Teixeira