Subject: Re: Swapping problems (was: Re: 1.2 features, again)
To: Jukka Marin <jmarin@pyy.jmp.fi>
From: John S. Dyson <toor@dyson.iquest.net>
List: current-users
Date: 06/28/1996 13:49:47
> 
> > Try FreeBSD, if you are running on X86 machines.  Its VM perf is better
> > than Linux in most respects, and in -current blows it away.  Also, FreeBSD
> > really works under load.
> 
> Well, just heard from an ISP that they're going to replace their freebsd
> news server with an l***x one because the current system is not stable under
> high loads.  With 64 MB of RAM, NetBSD seems to be doing reasonably well as
> a news server, except that the disk IO is slow because the system uses so
> little RAM for the disk cache.
> 
Here I am NOT speaking officially as a core team member, but as a contributor to
the FreeBSD effort:

During this discussion, remember -current IS NOT RELEASE CODE, or even
guaranteed to boot.  -stable has a better chance of working, but once
in a while might not.  Released code is the only code that we say
works.  It is good when people get to try out and test -current, but
the problem is that if people use it in production, then a bug that should
have been fixed correctly gets fixed quickly.  Then we have to go back
later and fix it correctly.  It is not good for anyone to use -current
in production (unless that user takes FULL responsibility for bugfixes,
etc.)

At work, we use a tried and true version of -stable, resisting the temptation
to use -current on our main machines.  They even have me there, and if we can
reproduce the problem, I can fix it, or get someone's attention to fix it.  But
we DON'T USE -current for production machines or machines that we cannot allow
to fail.

There is a person that I know of (in the .fi domain) who doesn't seem to be
able to run the -current system very well at all.  (This person is running
-current in a production environment.)  There is something that he appears
to do very differently that we have not been able to "figure out" yet.  OTOH,
we have lots of people who use the system with few, if any, problems.


BACKGROUND ON THE RECENT INSTABILITIES IN -current and for a few days in
-stable:

-current went through some instabilities (due to errors in the (my) pmap
rewrite, and some major bugs associated with improvements in the upper level
VM code), and there were/are some seriously bad assumptions made in the pmap
code from the 386BSD days that we have discovered recently.  We got bit AGAIN
in -stable, by a mistake that I made, with a paging problem similar to what
NetBSD has.  Essentially, the 386BSD pmap code has some very interesting
anomalies, and I suggest ignoring the code and rewriting it from scratch.
Fixing the problem correctly might cause NetBSD -current or whatever to become
more unstable while the kinks are worked out, but the benefits will outweigh
the cost (eventually).  The best way to solve the problem is to re-architect
(not just improve or polish) the pmap code, and things will get much better
(we have already done that in -current).  The upper level VM code will also
perform better if the pmap problems are fixed.  (That isn't to say that the
original VM upper level code doesn't have some problems also that have been
fixed in FreeBSD.)

We have spent a lot of effort on 2.1.X with some of our best people, but
2.2 is the future.  I think that there will soon be an effort to
stabilize 2.2, and make a solid copy of it ready for prime time.  We will
do this because a lot of people just don't have the time to track -current
to find a good one.  Certainly a clean/stable -current snapshot would be in
order, so that "aggressive" ISPs will be able to really try the code out...

Right now, -current on FreeBSD is stabilizing again.  It can handle
tremendous loads today: it passes fork bombs, the mallin test from the
Linux camp (which, of course, crashes Linux), and severe swapping
and paging loads.  Frankly, FreeBSD can handle loads while still giving
high performance, where other systems will thrash, giving almost no
performance.  It is likely that many people who have run into problems have
pushed the FreeBSD system much harder than other systems can take and still
function.  There have been problems under extreme load specifically with NFS
(with bugs (timing races) that I think exist in every version of *BSD), and
with the recent VM changes.  The VM changes are pretty much settled out (again),
and we won't be doing anything but minor clean-ups or minor features (like the
pageout daemon algorithm selection) until we get a really good -current
snapshot ready.  Our release of 2.1.5 is nearly ready, and will likely be the
last of its line.  The -current tree will be taking all of our attention.

All of this is IMO...
John