Subject: Re: 1.3 broken
To: Jukka Marin <jmarin@pyy.jmp.fi>
From: Chris G. Demetriou <cgd@pa.dec.com>
List: port-i386
Date: 01/31/1998 13:23:24

> I'm trying to upgrade my main i386 machine to NetBSD 1.3 and am having
> several kinds of problems.  First I had problems with the installer program,
> but I got past them already.

Well, I can't help you much, but I do have a few suggestions/questions
which might help me or others on the list help you with your problems:


> Problems:
> 
> 1) reboot command doesn't reboot the system.  I see this on two different
>    pc's.  The machines usually hang after syslogd is killed, but num lock
>    key still works (num lock led blinks if I press num lock), so the machine
>    isn't completely dead.

Put DDB into a kernel, break into DDB when the system gets into this
state, and see what the system is doing.

I've got N machines (3 x86 at home, 1 x86 at work, and a bunch of
non-x86 systems) which don't show symptoms like this, so my first
guess would be that it's related to your local configuration.

Are you using any of the 'weird' file systems (any of the code under
miscfs other than that in deadfs, genfs, or specfs)?  Are you mounting
file systems over NFS in such a way that after the 'shutdown' has
killed many processes you'd no longer be able to talk to your servers?
(I dunno what's going on, or even if this is in the right direction,
but that's certainly one way the system can get into a very losing
state...)


> 2) The IDE driver is definitely broken.  On system A, it doesn't detect
>    the primary hard disk on the first IDE port if I have an ATAPI CD-ROM
>    attached to the same port.  On system B, I have an IDE disk on the primary
>    IDE interface and an ATAPI drive on the second one (as a slave).  Most
>    of the time, the kernel sees some imaginary wdc1 drive at the secondary
>    controller.  Then the kernel notices that the drive isn't working
>    properly and keeps retrying and the machine never comes up.

I don't really know what's going on here. (This sounds completely
different from the interrupt-related problems which have been
reported.)

If you can get one (I don't know that any are built and waiting to be
downloaded, and I can't easily create one), you might try a -current
boot floppy to see if it does any better.  The wdc probe code works a
bit better in -current.

I assume that all of your drives are correctly jumpered for master or
slave, as appropriate?

FWIW, I've installed virgin 1.3 on systems with IDE drives (and ATAPI
CD-ROM drives) and multiple IDE controllers, without a problem.
However, if you're having a problem, well, it's obviously not working
for you.  8-)


> 3) The com driver is unreliable.  My PPP connection to the world died
>    suddenly and pppd reported "serial link appears to be disconnected",
>    but the pppd process never exited like it should.  Instead, the serial
>    port locked up just as if interrupts were no longer generated.
>    I verified this with kermit.

I've not had these problems on my one system which uses PPP, but that
could just be an anomaly.  (works bloody well for me, though...  the
machine in question's been up for a while, and has only gone down when
I screwed up the packet filter rules so badly that I had to reboot it.
~once a day the modem hangs up and correctly reconnects, but I put
that down to the phone lines and/or the isp. 8-)

This worked under 1.2?  What 'new features' does 1.3 trigger on your
machine (look at dmesg output)?  Anything like isapnp?

"interrupts no longer generated" sounds like an interrupt conflict.


> 4) Why has the keymap changed so that I can no longer make the del/bs
>    keys work properly in mg?  I'm using pcvt.

I dunno; I use pccons.  Sorry.  8-)



> 5) I can't seem to make the tcom serial driver work - it doesn't receive
>    any interrupts from the hardware.  I would appreciate it if someone
>    more knowledgeable could help me out.  My only Internet connection
>    depends on this driver.

"What is tcom?"  (Custom driver?)

Obviously, this is a driver recompile, but assuming all else works and
the driver seems to work otherwise, this could be an interrupt problem
as well...

What was the last version of the kernel that it worked with?  What
modifications had to be made to make it compile with 1.3?




cgd