Subject: What acts like an interupt problem...
To: None <netbsd-help@netbsd.org>
From: Matt Knopp <mhat@fundsxpress.com>
List: netbsd-help
Date: 11/24/1998 14:11:44
Okay, to set up the problem:
I have two machines, each with an Asus P2B motherboard, a P2/350, 64M RAM,
a UDMA disk, a 4M Matrox Mystique, a 10/100 Kingston NIC (DEC chip), and an
ISA sound card.
I have to use -current because I need UVM. The BIOS configs for both machines
are exactly the same. One machine had 1.3.2 installed on it; the other has
1.3H (the 'a' snap I got yesterday). Before that, it was running -current from
a month or so ago. They are both using a 1.3I kernel at the moment.
The problem is that the 1.3.2 box is the only one that can ever see the net.
Well, that's not exactly correct: the 1.3.2 box is the only one that can do
anything more than ARP. If I try to ping something from the -current box,
it waits ~60 seconds and then is able to ping that site (like it's waiting
for an interrupt to clear). Oh, and did I mention that the -current machine
was previously working just fine, but "stopped" when we moved? And no, the
hardware isn't damaged. (I'll get to that.)
My logic on this probably seems a little bit broken at this point. However,
just pretend that I'm not on crack.
So, starting about a month ago, this problem came into being. I didn't
have time to deal with it, and I had installed 1.3.2 at home on a machine I
was borrowing from the office. So I brought that machine back and used
it as an xterm so I could work on some code.
Time passes, and we hire more people, so now getting these machines working
has become an issue. The way the -current box behaves really seems like an IRQ
problem, so I made its IRQs mirror those of the working machine. This didn't
help. So I decided to see if the hardware was frazzled in the old machine by
swapping HDDs between the machine that "works" and the one that "doesn't".
Now the machine that "didn't" works, and the one that "did" doesn't. I decided
at this point that it's not the hardware.
Along the way I also noticed that the 1.3H kernel from the "broken"
machine doesn't work on the "working" one. That is to say, if I boot the 1.3.2
machine with the 1.3H kernel, I get the same problem as I had on the other
machine. I thought to myself, "well, this isn't too bad.. maybe this -current
is broken somehow," and built 1.3I (from yesterday's sources). The first thing
I tried (because the machine quasi-works) was the 1.3I kernel on the 1.3.2 box.
Life is good: I can use the net card. So I got the kernel over to the -current
box, and it doesn't work. (At which point I became violently unhappy and went
to find something else to work on for a few hours.)
So, I'm looking for suggestions as to why two identical boxes, using the same
ports/IRQs for everything (determined by comparing their /kern/msgbuf output)
and running the same kernel, would not both work. The only difference that I
see at this point is that one has 1.3.2 binaries and the other has -current
binaries. Argh.
-Matt Knopp
mhat@fundsxpress.com