Subject: death of a kernel...
To: None <port-mac68K@NetBSD.ORG>
From: Erik Vogan <vogan@auriga.rose.brandeis.edu>
List: port-mac68k
Date: 09/01/1995 10:41:28
Hello all,

	Sorry for the non-descriptive title, but this is to be one of my 
more rambling posts.  Apologies in advance.  System hardware = Mac IIx, 8 
MB mem, internal Quantum 730 MB HD, external Quantum 340 MB HD - both 
SCSI II, and an NEC CDR 73.  This is all I think is relevant at the 
moment - I'm sure someone will let me know if I'm mistaken.
	Three sordid problems really.  All the stories start ...
I gathered tar files for -current on the 29th, compiled on the 30th, 
booted into the -current that night, and the new kernel died with an 
illegal instruction.  So far just the 
run-of-the-mill-first-attempt-at-'-current'-in-a-month situation.

   ***	Story 1) I rebooted the machine to BSD last night (using the original 
1.0B kernel), and was rewarded with a login prompt.  After logging in 
and starting X, the server dies (leaving Xmacbsd.core and xmodmap.core), 
so I moved around .rc files, and then the system panicked on an 
unimplemented trap.  I'd provide more details, but this would get to be a 
truly burdensome message, and they are really unimportant for the final 
result.
	Two reboots later, after having seen an illegal instruction panic 
after starting file system checks, and a hang after printing the symbol 
table size, I decided to shut the thing down completely and eat some 
dinner.  After a power on the thing spewed some sort of panic after 
probing the cd.  I was numb to them at this point so I don't 
remember what it was.  Resorting to a truly random solution in the face of 
truly random panics and hangs, I rebooted to MacOS and did a mkfs on the 
swap partition (please don't ask me why I did this !).  It worked - sort 
of.  Anyone know why ???  I already know some of the why nots.  Read on.

   ***	Story 2) After rebooting I logged in and attempted to startx 
again.  It died (a whole bunch of times in fact) and at some point left 
the system in a state where users with passwords were unable to login.  
I've seen this one before, so I rebooted to a single user mode and 
rebuilt the devices.  I've only seen it once before, so I decided it 
wasn't worth figuring out which device had its owner/permissions munged.  
[ I figured I'd mention this problem so that when someone else sees it 
they will know the solution.  It took me a while to figure out the first 
time !!! ]  On exit to multi-user everything was hunky-dory, except...

   ***	Story 3) The system seems to not have enough memory to execute 
code.  At various points during my tweakings I'd have tcsh die, startx 
die, and init die (the system didn't like that AT ALL).
[ all programs I mentioned acted the same - they just quit, leaving a 
core file, without giving any reason for the exit condition ! ]
	Things like init dying required reboots, tcsh dying got old - so I 
rebooted, and then the system seemed to stabilize.  Now I can 
consistently startx (with no .rc files around), but when I try to bring 
up fvwm, any key press causes the X server to shut down (with the 
message 'pipe broken or explicit server kill'), leaving a core file.  
I've checked, and adb and all the ttys (and ptys) exist.  ttye0 might or 
might not exist, I can't really remember from last night, but I'm pretty 
sure the system would not get as far as it did if ttye0 didn't exist.  I've 
rebuilt devices (with MAKEDEV std) several times, and none of this has 
helped.
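
[ The device check described above could be made repeatable with something 
like the following.  This is only a minimal sketch, assuming a Python 
interpreter is available on the machine; it is not part of the setup 
described in this post.  The function name check_device_nodes is 
hypothetical, and the node names passed in (adb, ttye0) are simply the ones 
mentioned above.  It reports whether each node exists and, if so, its type, 
permission bits and owner, so a munged node can be spotted without 
rebuilding everything. ]

```python
import os
import pwd
import stat


def check_device_nodes(names, dev_dir="/dev"):
    """Map each device name to 'missing' or a 'kind mode owner' summary."""
    report = {}
    for name in names:
        path = os.path.join(dev_dir, name)
        try:
            st = os.lstat(path)
        except OSError:
            # The node does not exist (or cannot be examined at all).
            report[name] = "missing"
            continue
        if stat.S_ISCHR(st.st_mode):
            kind = "char"
        elif stat.S_ISBLK(st.st_mode):
            kind = "block"
        else:
            kind = "not a device"
        try:
            owner = pwd.getpwuid(st.st_uid).pw_name
        except KeyError:
            # No passwd entry for this uid; fall back to the number.
            owner = str(st.st_uid)
        mode = format(stat.S_IMODE(st.st_mode), "04o")
        report[name] = f"{kind} {mode} {owner}"
    return report


if __name__ == "__main__":
    # Node names taken from the post; extend the list as needed.
    for name, summary in check_device_nodes(["adb", "ttye0"]).items():
        print(f"{name}: {summary}")
```

Comparing the output before and after a MAKEDEV run would show which node, 
if any, had its owner or permissions changed.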


	ANY IDEAS ??  I've got some vmstat info (6000-7000 active pages 
with 200-300 free pages in multi-user mode with one user logged in and 
just a tcsh running), but I'm not sure it's out of the ordinary.  I 
don't know what's broken, and I don't know how to fix it (IOWs, I'm a 
loser).  I'd like to get the system back to functional so I can track 
down my kernel bug.  Please help me.
	Sorry for the long, rambling post.  Please ask for more info if I 
was unclear or minimized something you think might be helpful.


							erik vogan

~ Of all the races in all of the Galaxy who could have come and said a big
hello to the planet Earth, he thought, didn't it just have to be the Vogons.
							- HHGttG