Subject: Re: bin/8681: grep may bomb out with "memory exhausted"
To: David Brownlee <abs@mono.org>
From: Simon Burge <simonb@netbsd.org>
List: netbsd-bugs
Date: 10/28/1999 10:39:38

David Brownlee wrote:

> On Wed, 27 Oct 1999, Todd Whitesel wrote:
> 
> 	As I understand it this is only used for grep as memory to store
> 	a single newline when searching through files.

This is correct.

> 	Arguably any file with several megabytes of data without a newline
> 	is pathological (in a text sense), though a grep failing to find
> 	a string in a binary file because it was split across an arbitrary
> 	block boundary is a bad thing...

Unfortunately, some debugging kernels seem to have this behaviour
(several megabytes of data without a newline).  From my reading of the
code, you'll only get a problem if the pattern itself spans the buffer.
With my suggested patch, how many regexes span 2MB plus at least that
again?  If it does find a match, then you'll "only" get 2MB plus up to
the current buffer (of 10MB) of the line, not the complete line :-)
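
To make the boundary case concrete, here is a toy sketch in Python.  It
is not the grep code and not the patch; chunked_find, bufsize and keep
are made-up names, it matches a literal string rather than a regex, and
the sizes are bytes rather than megabytes.  It only shows why capping
the data carried over between reads is harmless unless the match itself
is longer than what is kept.

```python
def chunked_find(data: bytes, needle: bytes, bufsize: int, keep: int) -> bool:
    """Scan data in bufsize-byte reads, carrying over at most `keep` bytes.

    Hypothetical helper for illustration only; `keep` plays the role of
    the cap on retained data (2MB in the discussion above).
    """
    tail = b""
    for start in range(0, len(data), bufsize):
        window = tail + data[start:start + bufsize]
        if needle in window:
            return True
        # Retain only the last `keep` bytes of the window for the next read.
        tail = window[-keep:] if keep > 0 else b""
    return False


# 18 bytes with no newline; with bufsize=8 the first read ends mid-"needle".
data = b"x" * 6 + b"needle" + b"x" * 6

missed = chunked_find(data, b"needle", bufsize=8, keep=0)  # nothing carried over
found = chunked_find(data, b"needle", bufsize=8, keep=5)   # len(needle) - 1 bytes kept
print(missed, found)
```

With keep=0 the match split across the read boundary is lost; keeping
len(needle) - 1 bytes is enough to find it.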

This is what the maintainer of grep has to say:

	GNU grep needs to buffer a line of input in main memory.  If your
	input lines are too long to fit into your main memory, then the right
	thing to do is to get more memory.  See the section `Memory Usage' in
	the GNU coding standards for more.

Personally, I'd be happy to put the size limit in and put a note in the
man page stating "you may have problems if you are searching for a regex
that is greater than 4MB, or if you want to see a line greater than 2MB
long that has a match".  This doesn't seem unreasonable to me...

Simon.