The upcoming version of GtkSourceView, the library used by many GNOME text editors for syntax highlighting, supports new and more powerful language parsing. I volunteer to update the language definition for OCaml, and would like some feedback from the community on which constructs are useful to highlight.

The following are currently matched:
====================================
* Comments (* *), and within them, email addresses, net addresses and TODO/FIXME/XXX
* Decimal, octal, hex, binary and floating-point literals
* Labeled and optional function arguments
* Polymorphic variants and normal variant constructors
* Module paths (as a prefix to anything)
* String and character literals, with all allowed escape codes sub-matched within them

The reasonably large list of keywords has been broken into four sections. (I encourage comments on this division.)

1) booleans
   true false

2) flow control & common keywords
   and assert begin do done downto else end for fun function if in let match rec then to try val when while with

3) types, objects & modules
   as class constraint exception external functor include inherit initializer method module mutable new object of open private struct sig type virtual

4) function-like keywords
   asr land lazy lor lsl lsr lxor mod or

Things not matched currently
============================
* Line number directives (probably never seen in actual code)
* Record constructors - { record with label = value; label = value }
* Object duplication - {< var = value; var = value >}
* List literals - [ elem1; elem2; elem3 ]
* Array literals - [| elem1; elem2; elem3 |]
* Tuples - elem1, elem2, elem3 (hard to parse - no parentheses needed, only commas)
* Array access and modification - arr.(i), arr.(i) <- 5
* String access and modification - str.[i], str.[i] <- 'w'
* Coercion - (expr :> type), (expr : type1 :> type2)
* Method calls - obj#method args
* There are a ton of character-sequence keywords. Should any of them be handled as keywords in their own right, or only in the cases above where they're used?
  != # & && ' ( ) * + , - -. -> . .. : :: := :> ; ;; < <- = > >] >} ? ?? [ [< [> [| ] _ ` { {< | |] } ~
* What about camlp4 keywords? New keywords in the new camlp4?
  parser << <: >> $ $$ $:

Thanks for your comments. Attached is a compressed version of my current language definition in the XML format required by GtkSourceView.

E.
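
P.S. In case it helps with testing, here is a rough sample file that exercises most of the constructs listed above, both the ones matched now and the ones that are not. It is only a sketch and is not taken from the attached definition; every name in it (scale, counter, origin and so on) is made up for illustration. I am assuming a recent compiler where strings are immutable, so the in-place modification is shown on a Bytes buffer rather than with str.[i] <- 'w'.

```ocaml
(* TODO: sample comment with a FIXME marker and user@example.com *)

(* Numeric literals: decimal, octal, hex, binary, floating point *)
let dec = 42
let oct = 0o52
let hex = 0x2A
let bin = 0b101010
let flt = 4.2e1

(* String and character literals with escape codes *)
let str = "tab\there\n"
let chr = '\065' (* decimal escape for 'A' *)

(* Labeled and optional function arguments *)
let scale ?(factor = 2) ~value () = factor * value

(* Normal variant constructors and a polymorphic variant *)
type shape = Circle of float | Square of float
let sq = Square 2.0
let tag = `Red

(* Module path used as a prefix *)
let len = List.length [ 1; 2; 3 ]

(* Record construction, including the "with" form *)
type point = { x : int; y : int }
let origin = { x = 0; y = 0 }
let moved = { origin with x = 5 }

(* List literal, array literal, tuple without parentheses *)
let lst = [ 1; 2; 3 ]
let arr = [| 1; 2; 3 |]
let tup = 1, "two", 3.0

(* Array modification, then array access *)
let () = arr.(0) <- 5
let first = arr.(0)

(* String access; modification goes through Bytes (assumption noted above) *)
let c0 = str.[0]
let buf = Bytes.of_string "hello"
let () = Bytes.set buf 0 'w'

(* Objects: object duplication, method calls, coercion *)
class counter = object
  val n = 0
  method get = n
  method bump = {< n = n + 1 >}
end

let cnt = (new counter)#bump
let count = cnt#get
let getter = (cnt :> < get : int >)

(* Function-like keywords *)
let bits = (1 lsl 4) lor (0xFF land 0x0F)
let rem = 17 mod 5
```

Opening this in an editor with the new definition loaded should make it easy to see which of the "not matched" items would actually benefit from their own style.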