Subject: 2 new problems with 1.3.3 on SPARC
To: None <port-sparc@netbsd.org>
From: Greg Earle <earle@isolar.Tujunga.CA.US>
List: port-sparc
Date: 01/06/1999 05:57:51
Well, I finally made the big leap to 1.3.3 (from 1.2.1) on my work SS20/71.

I hit a snag during the upgrade because my /usr didn't have enough free
space in it.  (Had an AFS cache in /usr/vice taking up lotsa room.)  Suggest
that the install scripts check for sufficient disk space before doing the
un-tar'ing of the .tgz files, if that's possible.
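
Something along these lines is what I have in mind.  It's only a rough
sketch, not taken from the actual install scripts: the function name
"have_space" and the idea of passing in a required size in KB are made up
for illustration, and it assumes a "df" that understands the POSIX "-k"
and "-P" flags.

```shell
# Hypothetical helper, not part of the real install scripts.
# Usage: have_space <directory> <needed_kb>
# Succeeds (exit 0) only if the filesystem holding <directory>
# reports at least <needed_kb> kilobytes available.
have_space() {
    # Assumes POSIX "df -kP": one header line, then one data line
    # whose 4th column is the available space in 1K blocks.
    df -kP "$1" 2>/dev/null | awk -v need="$2" '
        NR == 2 { ok = ($4 + 0 >= need + 0) }
        END     { exit (ok ? 0 : 1) }
    '
}
```

The installer could then do something like
"have_space /usr 60000 || echo 'not enough room in /usr'" before
extracting each set (the 60000 figure is just a placeholder, not the real
size of any set).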

So far I've hit a couple of problems in 1.3.3 that weren't in 1.2.1.

(1) If I exit out of the TriTeal TED CDE I'm running, the machine crashes
    with a Watchdog Reset.  Oi!  This is with the GENERIC_SCSI3 kernel, btw.
    I haven't recompiled a new custom 1.3.3 kernel yet.  The only major
    difference between my 1.2.1 setup and now is that the 1.3.3 setup uses
    the X11R6.3 server et al. that's provided in our x*.tgz files for 1.3.3.

    Since it Watchdog Resets, there is no core dump for "savecore".
    I'll try to build a DDB-enabled kernel when I get a chance, but in the
    meantime, I'm amazed that quitting a user-level process can reliably
    trigger a Watchdog Reset.  (It's happened 3 times now, once for each
    time I tried to log out of TED/CDE.)

(2) I'm trying to do something like the following:

	foreach i ( machine1 machine2 machine3 )
	? echo '<root password>' | rsh $i "sudo sh /etc/init.d/syslog start"
	? end

    to restart some failed "syslogd" daemons on some Solaris 2.6 hosts.

    This has always worked fine for me in the past.  But now, it completes
    the first iteration, and then it stalls and I find the remote commands
    have turned into unreaped Zombies:

netbsd4me# ps -auxwtp2
USER       PID %CPU %MEM   VSZ  RSS TT  STAT STARTED       TIME COMMAND
earle     2816  0.0  0.0     0    0 p2  Z+    5:03AM    0:00.00 (rsh)
earle     2817  0.0  0.0     0    0 p2  Z+    5:03AM    0:00.00 (rcmd)
earle      389  0.0  0.3   604  208 p2  Is    7:59PM    0:00.84 -usr/local/bin/tcsh 
earle     2813  0.0  0.4   132  220 p2  I+    4:50AM    0:00.06 rsh machine1 sudo sh -x /etc/init.d/syslog start 
earle     2815  0.0  0.3   356  160 p2  I+    4:50AM    0:00.03 rcmd machine1 sudo sh -x /etc/init.d/syslog start 

    The child processes have not gotten reaped for some reason.  Most commands
    run this way exit normally, so it seems to be something specific to
    running "sh /etc/init.d/syslog start" on the remote side.  But this didn't
    use to happen with 1.2.1 as far as I know, and if I do a "sh -x" instead,
    it shows that the remote shell command runs "syslogd" and then does an
    "exit 0", as expected.  Yet the commands don't return, and I have to hit
    ^C (twice!) to get the shell to skip to the next entry in the loop.
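
    One untested guess at a workaround, not a diagnosis: rsh generally
    doesn't return until the remote side closes the connection, so a daemon
    that inherits the remote shell's stdout/stderr could be holding it open.
    The sketch below redirects those on the remote end.  The function names
    are made up, it assumes the remote login shell is Bourne-compatible
    (the "2>&1" won't work under csh), and it leaves out the password
    piping from my loop above.

```shell
# Hypothetical helper: append redirections so the remote command
# (and any daemon it starts) does not keep stdout/stderr attached
# to the rsh connection.
quiet_remote_cmd() {
    printf '%s >/dev/null 2>&1' "$1"
}

# Hypothetical loop over the hosts given as arguments.
# Assumption: the remote login shell accepts Bourne-style redirection.
restart_syslog() {
    for host in "$@"; do
        rsh "$host" "$(quiet_remote_cmd 'sudo sh /etc/init.d/syslog start')"
    done
}
```

    It would be run as "restart_syslog machine1 machine2 machine3".  If the
    hang goes away with this, that would at least point at open descriptors
    rather than at the process reaping itself.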

Any ideas on either problem?

(On a completely different note, does anyone have a cheat sheet on how to set
 up a dial-in/dial-out modem arrangement with "pppd" under NetBSD/SPARC 1.3.3?)

Thanks in advance,

	- Greg