So given a large file and a line number, you want to:

1) extract that line from the file,
2) produce an enum of all k-length slices of that line, and
3) match each slice against your regexp set to produce a list/enum of the substrings that match,

without reading the whole line into memory at once.

I'm with Dimino on the right solution: just use a matcher that works incrementally, feed it one byte at a time, and have it return a list of match offsets. Then work backwards from these endpoints to figure out which substrings you want. There shouldn't be a reason to use the overlapping substrings (0,k-1) and (1,k); with an incremental matching routine it suffices to use the disjoint chunks (0,k-1) and (k,2k-1).

E.

On Fri, Mar 16, 2012 at 10:48 AM, Philippe Veber wrote:
> Thank you Edgar for your answer (and also Christophe). It seems my
> question was a bit misleading: actually I target a subset of regexps whose
> matching is really trivial, so this is no worry for me. I was more
> interested in how to access a large line in a file by chunks of fixed
> length k. For instance, how to build a [Substring.t Enum.t] from some line
> in a file, without building the whole line in memory. This enum would yield
> the substrings (0,k-1), (1,k), (2,k+1), etc., without doing too many
> string copy/concat operations. I think I can do it myself, but I'm not too
> confident regarding good practices on buffered reads of files. Maybe there
> are some good examples in Batteries?
>
> Thanks again,
> ph.
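To make the suggestion concrete, here is a minimal sketch of the incremental approach (the names `find_matches` and `of_string` are mine, not Batteries'). The matcher is fed one byte at a time and collects the start offsets of every occurrence, so memory use is bounded by the pattern length, not the line length. For simplicity it matches a literal, non-empty pattern using a ring buffer of the last m bytes; a real regexp engine would drive a DFA over the same byte-at-a-time interface.

```ocaml
(* Feed bytes one at a time via [next]; return the list of stream
   offsets at which [pattern] starts.  Assumes [pattern] is non-empty. *)
let find_matches pattern (next : unit -> char option) =
  let m = String.length pattern in
  let ring = Bytes.create m in            (* last m bytes seen *)
  let matches = ref [] in
  let pos = ref 0 in                      (* bytes consumed so far *)
  let rec loop () =
    match next () with
    | None -> ()
    | Some c ->
      Bytes.set ring (!pos mod m) c;      (* byte at offset p lives at p mod m *)
      incr pos;
      if !pos >= m then begin
        (* the window covers stream offsets !pos-m .. !pos-1 *)
        let ok = ref true in
        for i = 0 to m - 1 do
          if Bytes.get ring ((!pos + i) mod m) <> pattern.[i] then ok := false
        done;
        if !ok then matches := (!pos - m) :: !matches
      end;
      loop ()
  in
  loop ();
  List.rev !matches

(* A byte source backed by a string, standing in for buffered reads
   from an [in_channel]. *)
let of_string s =
  let i = ref 0 in
  fun () ->
    if !i < String.length s then (let c = s.[!i] in incr i; Some c)
    else None

let () =
  (* "ab" occurs at offsets 1 and 4 in "xabyab" *)
  assert (find_matches "ab" (of_string "xabyab") = [1; 4])
```

Because the matcher only ever sees one byte at a time, the chunk boundaries of the underlying buffered reads are irrelevant, which is exactly why the overlapping (0,k-1)/(1,k) windows become unnecessary.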