Subject: Re: Replacement for grep(1) (part 2)
To: Matthew Dillon <dillon@apollo.backplane.com>
From: Nate Williams <nate@mt.sri.com>
List: tech-userlevel
Date: 07/14/1999 17:09:52
> :Most of the work we've done wouldn't allow this, especially if we were
> :using an OS like FreeBSD with a fairly long bootup time.  Especially if
> :it can be avoided.
> :
> :Yes, we could (and did) do our own memory management, but it seems to me
> :that the kernel has a lot more information available to it and would do
> :it better than I could.  Then again, maybe I'm totally confused about
> :how the VM system 'does its thing', and in reality it's no better at it
> :than our code, just a lot more complex for no reason. :) :) :)
> :
> 
>     The kernel just isn't the best place to do this.   The level of 
>     sophistication required to satisfy the larger set of programming
>     is much greater than anything discussed here.  The last thing that the
>     kernel should be doing is returning NULL from malloc() to a program
>     which is operating within its specifications.

You and I disagree on this statement.  Returning NULL is completely
acceptable, and is the 'specification' for many OSes, including FreeBSD
if you set user/process limits.
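
A minimal sketch of what "returning NULL under a limit" can look like,
assuming a POSIX system that enforces RLIMIT_AS (the address-space
limit); older systems may only offer RLIMIT_DATA, and not every
platform enforces either one.  The helper name malloc_fails_under_cap()
is hypothetical and the sizes are only illustrative.

```c
#include <stdlib.h>
#include <sys/resource.h>

/* Hypothetical helper: try to malloc `request` bytes while the soft
 * address-space limit is temporarily capped at `cap` bytes.
 * Returns 1 if malloc() returned NULL, 0 if it succeeded, and -1 if
 * the limit could not be read or changed. */
int malloc_fails_under_cap(rlim_t cap, size_t request)
{
    struct rlimit saved, capped;

    if (getrlimit(RLIMIT_AS, &saved) != 0)
        return -1;

    capped = saved;
    capped.rlim_cur = cap;      /* lower only the soft limit */
    if (setrlimit(RLIMIT_AS, &capped) != 0)
        return -1;

    /* volatile keeps the compiler from optimizing the malloc/free
     * pair away and assuming the allocation succeeded. */
    void *volatile p = malloc(request);
    int failed = (p == NULL);
    free(p);

    /* The hard limit was untouched, so the soft limit can go back. */
    setrlimit(RLIMIT_AS, &saved);
    return failed;
}
```

With a cap of 256 MB and a request of 1 GB, one would expect the call
to report that malloc() returned NULL instead of the process being
killed later; the same kind of cap can be set from the shell with
limit/ulimit or per login class.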

>     That doesn't help anyone, least of all the programmer who now has
>     to deal not only with the complexity of the project he is working
>     on, but must also deal with the potential that the OS will step in
>     at any time and give him an error that he must deal with in a
>     sophisticated fashion even when his software is operating
>     properly.

Returning NULL isn't an error, it's an indication that there is no more
memory.  Don't think of it as an error, think of it as a hint.

General purpose computing tends to make one think that an out-of-memory
condition is an unacceptable situation, and the program must exit.
FreeBSD exacerbates this by rarely returning NULL and randomly killing
off processes which may or may not be involved in the memory fracas.

> 
>     The same goes for non-embedded work.  Why is it that programs generally
>     exit when they encounter a memory allocation failure?

Because programmers are lazy.  Really, and I do it all the time as
well.  But, that's because on 'general purpose' computing hardware,
re-starting the process or having to wait for a reboot is 'acceptable'.

But many of my colleagues who run simulations that last for DAYS and
WEEKS write code that 'saves the state' of the system when they can't
get critical memory, *THEN* abort the application.

Yes, it's possible that the process of saving state *may* not work if
the system is *really* low on memory, but then again it may work.  But,
more times than not it *does* (appear) to work, and the work is not
lost.
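
A rough sketch of the 'save the state, then abort' pattern, under
stated assumptions: struct sim_state, save_state() and alloc_or_save()
are hypothetical names, and a real simulation would have far more state
to write.  The checkpoint uses open()/write() rather than stdio so that
the save path itself does not ask for more heap, though it can still
fail for other reasons.

```c
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical simulation state; a real one would be much larger. */
struct sim_state {
    long   step;
    double value;
};

/* Write the state to `path` with raw system calls (no stdio buffers
 * to allocate).  Returns 0 on success, -1 on any failure. */
int save_state(const struct sim_state *s, const char *path)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    ssize_t n = write(fd, s, sizeof *s);
    int rc = close(fd);
    return (n == (ssize_t)sizeof *s && rc == 0) ? 0 : -1;
}

/* Try to allocate `size` bytes.  If malloc() returns NULL, checkpoint
 * the state first and hand NULL back so the caller can abort. */
void *alloc_or_save(size_t size, const struct sim_state *s,
                    const char *path)
{
    void *p = malloc(size);
    if (p == NULL && save_state(s, path) != 0)
        fprintf(stderr, "checkpoint to %s failed\n", path);
    return p;
}
```

A caller would check the result of alloc_or_save() and call abort() or
exit() on NULL, knowing the last state has (probably) reached disk.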

>     with memory allocation errors" provides no solution.  98% of the source
>     code in the BSD code base will exit if a memory allocation fails, and
>     I don't know anyone who wants to go through and somehow "fix" it all 
>     (and one could argue what the "fix" should be when something like grep or
>     a shell has a memory allocation failure).  To require that this code be
>     made twice as sophisticated as it is already in order to deal with a
>     non-overcommit model reliably is a fantasy.

Who said anything about using the BSD code base in an embedded system?
Also, you *can* use limits for that stuff.



Nate