Subject: Re: which CVS TAGS..
To: bru217 <bru217@yahoo.com>
From: Daniel Carosone <dan@geek.com.au>
List: tech-smp
Date: 06/01/2002 13:05:22
On Fri, May 31, 2002 at 05:13:24PM -0700, bru217 wrote:
> --- "Perry E. Metzger" <perry@wasabisystems.com> wrote:
> > 
> > bru217 <bru217@yahoo.com> writes:
> > > Would you recommend using NetBSD SMP in production (i386)?
> > 
> > The NetBSD autobuild server uses it, and I know of other people
> > using it in production, but it is certainly not something for the
> > faint of heart. It is best used right now if you've got good technical
> > people involved.

I have to reinforce what Perry has said here, lest my later advice
be seen as encouragement.

I have been giving i386 smp a pretty solid workout (mostly,
rebuilding itself and other machines regularly, and a lot of pkgsrc
stuff) for quite some time, with generally good results. Although
it has seemed reasonably reliable, let me emphasise that I have
been in no way relying on it for anything at all.  On the occasions
when something has gone wrong, I'm quite happy to just set it aside
and do something else.

The above applies to running -current in production under the best
of circumstances, with or without SMP integrated.

> I'm pretty much by myself here, other guys don't use BSD, but I can
> handle it.

The only difference from an operational point of view is which
kernel you are running, once it is running. There are some caveats
to getting and keeping it running, though:

  - Updates to the branch occur somewhat asynchronously to changes
    on the mainline.  People making wholesale changes to other
    parts of the kernel infrastructure aren't obliged to make the
    corresponding changes on the branch, so there are times when
    you can't build from the SMP branch. You need to be careful
    about updates, and will want to keep both an SP and an MP
    version of your kernel available at all times, as well as
    backups of source trees before updates.

  - There are some things that don't play nice with SMP as yet,
    though they're being crossed off one by one, especially lately.
    You can't use RAIDFRAME or APM, for instance.

  - You can't share interrupts from devices at different IPLs,
    so if you have a lot of devices or an uncooperative BIOS that
    wants to make them share, you may have issues.  On the SMP
    machines I run, this basically means I can't use USB or audio;
    no loss for me but YMMV.

  - You need to keep flipping /sys/arch/i386 between the branch
    and -current, depending on whether you're building kernels or
    userland. Actually, I discovered by accident recently that
    building userland from SMP sys headers seems to work at the
    moment, but I'm not sure if it's "finished" (it may just compile
    but not work right).

  - Don't expect to get twice the performance of an SP kernel; many
    operations are still locked against all processors, and there's
    extra debugging and diagnostic code in the path when those
    locks happen, too. Don't consider disabling that, really; you
    want to know if something's wrong, even if it's with a panic,
    rather than corrupt data or otherwise lose mysteriously.

If you feel confident running NetBSD -current with the above caveats
and with a potentially unstable kernel, smp or otherwise, then go
for it.

> Is there a way to pull it now? What would be the best way? Once I get
> this going I will most likely even have a test box to play with so I
> can mess things up, but Monday is pretty critical.

The first thing you need is a working i386 NetBSD-current machine.
Get all the base stuff up and running using the most recent -current
snapshot, pull -current sources from anoncvs, get yourself a kernel
config that suits your needs and hardware, and rebuild the world.

Then you want to make a copy of the sys/arch/i386 tree from the current
branch somewhere "safe".  You will be flipping back and forth, and
will want to be cvs updating both at the same time (from the respective
branch and HEAD).
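
One way to make the flipping less error-prone is to keep both trees
side by side and point arch/i386 at whichever one you need with a
symlink.  This is only a sketch: the directory names i386.current
and i386.mp, the SYS variable and the flip_i386 function are all
made up for this example, and I am assuming the build is happy
with arch/i386 being a symlink.

```shell
#!/bin/sh
# Hypothetical layout (names invented for this example):
#   $SYS/arch/i386.current  - checkout tracking HEAD
#   $SYS/arch/i386.mp       - checkout on the SMP branch
#   $SYS/arch/i386          - symlink to whichever tree is active
SYS=${SYS:-/usr/src/sys}

flip_i386() {
    # $1 selects the tree: "current" or "mp"
    case "$1" in
        current|mp) ;;
        *) echo "usage: flip_i386 current|mp" >&2; return 1 ;;
    esac

    target="$SYS/arch/i386.$1"
    if [ ! -d "$target" ]; then
        echo "missing $target" >&2
        return 1
    fi

    # Refuse to clobber a real directory; move it aside by hand first.
    if [ -e "$SYS/arch/i386" ] && [ ! -L "$SYS/arch/i386" ]; then
        echo "$SYS/arch/i386 is not a symlink; move it aside first" >&2
        return 1
    fi

    rm -f "$SYS/arch/i386"
    ln -s "i386.$1" "$SYS/arch/i386"
}
```

With that loaded, "flip_i386 mp" before a kernel build and
"flip_i386 current" before a userland build does the swap, and each
tree can still be cvs updated in place under its own name.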

In the version of the arch/i386 that will be the MP tree, you want to:

 cvs update -r sommerfeld_i386mp_1 -dP .

Put this in place as /sys/arch/i386, modify your SP kernel config
to include the differences between the mainline GENERIC and branch
GENERIC configs, and build yourself an SMP kernel. Boot it.
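
To see what needs to go into your own config, a plain diff of the
two GENERIC files is enough.  The helper below is a rough sketch
(config_delta is a made-up name, and the paths you pass it depend
on where you keep the two trees); it just strips the diff headers
and context so only the added and removed lines are left.

```shell
#!/bin/sh
# config_delta MAINLINE_GENERIC BRANCH_GENERIC
# Print only the lines that differ between two kernel config files:
# "+" lines exist only in the branch config, "-" lines only in mainline.
config_delta() {
    # diff exits non-zero when the files differ, which is the expected
    # case here, so the exit status of the pipeline is ignored.
    diff -u "$1" "$2" | grep '^[-+][^-+]' || true
}
```

For example, with the hypothetical side-by-side layout of
i386.current and i386.mp directories, you would run it as
"config_delta i386.current/conf/GENERIC i386.mp/conf/GENERIC" from
sys/arch and merge the "+" lines into your own config by hand.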

Test the hell out of it. Build a lot of pkgsrc stuff, rebuild the
world, thrash your hardware, and run your applications. You're
looking for problems not just with the code but also with your
hardware while running SMP. At the first sign of trouble, try to
replicate the problem on an SP kernel.

For a production server, consider having the kernel that boots
automatically be the SP one, and manually boot MP kernels. If it
panics when you're not looking, you get a probably-more-stable SP
kernel after the reboot and maybe avoid repeated crashes.
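
As a sketch of that arrangement: install the SP kernel under the
default name and the MP kernel next to it under another name.  The
name netbsd.mp, the function install_kernels and its optional
root-prefix argument are inventions for this example, not anything
the tree provides.

```shell
#!/bin/sh
# install_kernels SP_KERNEL MP_KERNEL [ROOT]
# Put the SP kernel where the boot blocks look by default and keep the
# MP kernel beside it.  ROOT is an optional prefix, handy for trying
# this out somewhere other than the real root filesystem.
install_kernels() {
    root=${3:-}

    # Keep the previous default kernel around as a fallback.
    if [ -f "$root/netbsd" ]; then
        cp -p "$root/netbsd" "$root/netbsd.old" || return 1
    fi

    cp "$1" "$root/netbsd" || return 1
    cp "$2" "$root/netbsd.mp" || return 1
}
```

You would then interrupt the boot countdown and ask for the MP
kernel by name (something like "boot netbsd.mp" at the boot prompt,
though check the exact syntax against your boot blocks).  Left
alone, the machine comes back up on the SP kernel.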

Good luck.

--
Dan.