port-macppc: RC* concerns (was: overfilling mfs partitions larger than 600M

Subject: RC* concerns (was: overfilling mfs partitions larger than 600M
To: None <port-macppc@NetBSD.org>
From: Tim Kelly <hockey@dialectronics.com>
List: port-macppc
Date: 11/11/2004 06:38:44

On Wed, 10 Nov 2004 23:45:08 -0500
Chris Tribo <ctribo@college.dtcc.edu> wrote:

> Man, I think we need to have a bug squashing party before RC5 or 2.0
> is cut.

To be honest, I don't think RC4 is ready for prime time. I've been
finding things so fast I haven't had time to write up what's going on
with the last thing I found. I found this bug because I made progress on
the MP kernel panic thing.

Here are quick summaries of some show stoppers:

1) As far as I can tell, the installer's default partition scheme sets
/ to 32M and gives /usr the rest. That means /etc, /var, /home, /root
and a couple of others have a whole 32M to share. I couldn't get
userland to compile until I made /tmp and /var symbolic links to
directories in /usr. Michael pointed out NetBSD Problem Report #22508.
As of RC2, this doesn't appear to have been fixed. It has probably been
overlooked because most people don't use the default partitioning
scheme. However, the default scheme should be an intelligent offering.

Michael and I have discussed a couple of partition schemes that would
put /var, /home, /usr, /tmp, and /root on their own partitions. The goal
would be to make /var write-only with kern.securelevel at 2, keep /root
isolated, and for users' files to be separate from userland. While this
would make five partitions (including /), I think it ought to be offered
as an option. It really isn't good for /tmp and /var to be on the same
partition as /, as it prevents / from being read-only, and the other
directories should be isolated from all three.

Much of the installation process comes from 
/usr/src/distrib/miniroot/install.sh

If someone has good shell scripting skills, this would be the place to
start for figuring out how to add either a better default scheme or
alternative prepackaged partitioning schemes.

2) I figured out that the problem with the kernel misidentifying the
boot device appears to be in the manner in which the boot path is
specified in Open Firmware. For example,

boot scsi-int/@6:0,\ofwboot.xcf netbsd

is a perfectly valid path, even though I didn't specify "sd." This is
because OF will use the first child of scsi-int it finds, and sd will
always be found before st. However, once this has been through the
canonicalization routine in autoconf.c, the node has been dropped and
the canonicalized bootpath becomes (stuff)/sd@0. I hope to have a patch
for this soon.
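
To illustrate the kind of path involved, here is a minimal sketch (not
the code in autoconf.c) that detects an Open Firmware path component
with a unit address but no node name, and fills in a name supplied by
the caller. The function names are hypothetical, and the real firmware
picks the child node itself; here "sd" is simply passed in.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not taken from autoconf.c: return 1 if any
 * component of an Open Firmware path has a unit address but no node
 * name (the component starts with '@'), otherwise 0.
 */
int
ofpath_has_unnamed_node(const char *path)
{
	const char *p;

	for (p = path; *p != '\0'; p++) {
		if (*p == '@' && (p == path || p[-1] == '/'))
			return 1;
	}
	return 0;
}

/*
 * Hypothetical helper: copy path into out, inserting name in front of
 * the first unnamed component.  A path with no unnamed component is
 * copied unchanged.  Returns 0 on success, -1 if out is too small.
 */
int
ofpath_fill_node_name(const char *path, const char *name,
    char *out, size_t outlen)
{
	const char *p = path;
	int n;

	/* Find the first '@' that begins a component. */
	while (*p != '\0' && !(*p == '@' && (p == path || p[-1] == '/')))
		p++;

	if (*p == '\0')
		n = snprintf(out, outlen, "%s", path);
	else
		n = snprintf(out, outlen, "%.*s%s%s",
		    (int)(p - path), path, name, p);

	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

Given "scsi-int/@6:0,\ofwboot.xcf" and the name "sd", the second helper
produces "scsi-int/sd@6:0,\ofwboot.xcf", which keeps the unit address
that the current canonicalization appears to lose.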

3) The MP kernel panic appears to be related to how much memory is
physically present. After the above changes to /tmp and /var were made,
and after I installed an additional 128M (for 256M total), I was able to
compile userland in 21 1/2 hours. I removed the 128M and tried again,
and a kernel panic occurred within thirty minutes. I reinstalled the
stick and had been building userland for four hours or so before the
kernel panicked due to the mfs issue.

I had been attempting to force the active memory setting in top over
256M by consuming large amounts of memory. I noticed in reading posts
about the kernel panic that make and nbmake were almost always being
used at the time of the panic. I watched top very closely and it will
routinely show active memory of 140M or more during building userland
and likely other builds as well (but possibly not during building a
kernel). I believe that when the active memory exceeds the physically
installed memory, the kernel panics occur.

I believe this is because some memory associated with the IPI
(Interprocessor Interrupt) is being paged out of memory or back into
memory without being marked dirty, and one CPU thinks it has sent an
IPI to the other CPU when the IPI hasn't gotten out of its own cache, or
the other CPU thinks it has responded when the response hasn't gotten
out of its cache either.

During testing I did apply a patch to arch/powerpc/powerpc/pio_subr.S
that overrode the DBGSYNC #define for multiprocessor, so that any memory
access forces a sync instruction. This did not affect the problem, as it
turns out that DBGSYNC is already defined somewhere for almost all of
the MP kernel. I say almost all because the patch resulted in a kernel
that was 64 bytes (16 instructions) larger, and that (possibly
erroneously) leads me to believe that there are an additional 16 syncs.
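
The arithmetic behind that figure is simple: PowerPC instructions are a
fixed 32 bits, so a size difference in bytes divides by four. A small
sketch, with a hypothetical helper name and the assumption that the new
image is no smaller than the old one and that no alignment padding is
involved:

```c
#include <stddef.h>

#define PPC_INSN_BYTES	4	/* PowerPC instructions are a fixed 32 bits */

/*
 * Hypothetical helper: number of instructions added between two kernel
 * images, assuming new_bytes >= old_bytes and no alignment padding.
 */
size_t
ppc_insns_added(size_t old_bytes, size_t new_bytes)
{
	return (new_bytes - old_bytes) / PPC_INSN_BYTES;
}
```

A 64-byte difference gives 16 instructions. Whether each of those is a
sync is the part that remains a guess.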

4) There's the mfs issue. I have tested mfs partitions up to 300M and
they don't seem to have the problem; overfilling them just gives "file
system full." With partitions of 600M and up, I get kernel panics.

5) Grackle still needs the patch I posted applied. As far as I can
tell, without it PCI cards will not get their IRQs properly identified
on grackle-equipped PowerMacs, which as I recall include beige G3s.

6) X11 termination. Yes, it's been around for a long time, and it shows
up in some other OSes, but it needs to be fixed. Michael and I are
trying to get some Open Firmware based patching in, but we've been
tracking down the above stuff so much we haven't had time.

(end)

I think there needs to be a survey of how many people are using the
release candidates. John Klos started one a few months ago, and
it's time to bring it back up. Overall, with the serious, serious
thrashing I have been doing on my 7300 with RC4, I am damned impressed
with the quality of the kernel. It'd be really bad if this was
completely overlooked because of things that keep people from getting to
the kernel.

tim