Subject: What acts like an interrupt problem...
To: None <netbsd-help@netbsd.org>
From: Matt Knopp <mhat@fundsxpress.com>
List: netbsd-help
Date: 11/24/1998 14:11:44
Okay. To set up the problem:

I have two machines. Each has an Asus P2B motherboard, a P2/350, 64M RAM, a
UDMA disk, a 4M Matrox Myst., a 10/100 Kingston card (DEC chip), and an ISA
sound card.

I have to use -current, because I need UVM. The BIOS configs for both machines
are exactly the same. One machine had 1.3.2 installed on it, the other has 
1.3H (the 'a' snap I got yesterday). Previous to that it was -current from 
a month or so ago. They are both using a 1.3I kernel at the moment. 

The problem is that the 1.3.2 box is the only one that can ever see the net.
Well, that's not exactly correct: the 1.3.2 box is the only one that can do
anything more than ARP. If I try to ping something from the -current box,
it waits ~60 seconds and then is able to ping that site (like it's waiting
for an interrupt to clear). Oh, and did I mention that the -current machine
was previously working just fine, but "stopped" when we moved? And no, the
hardware isn't damaged. (I'll get to that.)

My logic on this probably seems a little bit broken at this point. However,
just pretend that I'm not on crack.

So, about a month ago this problem came into being. I didn't have time to
deal with it, and I had installed 1.3.2 on a machine I was borrowing from
the office at home. So I brought that machine back and used it as an xterm
so I could work on some code.

Time passes, and we hire more people, so now getting these machines working
became an issue. The way the -current box behaves really seems like an IRQ
problem, so I made its IRQs mirror those of the working machine. This didn't
help. So I decided to see whether the hardware in the old machine was
frazzled by swapping HDDs between the machine that "works" and the one that
"doesn't". Now the machine that "didn't" works, and the one that "did"
doesn't. At that point I decided it's not the hardware.

Along the way I also noticed that the 1.3H kernel from the "broken" machine
doesn't work on the "working" one. That is to say, if I boot the 1.3.2
machine with the 1.3H kernel, I get the same problem as I had on the other
machine. I think to myself, "Well, this isn't too bad. Maybe this -current
is broken somehow," and build 1.3I (from yesterday's sources). The first
thing I try (because the machine quasi-works) is the 1.3I kernel on the
1.3.2 box. Life is good: I can use the netcard. So I get the kernel over to
the 1.3-current box, and it doesn't work. (At which point I became violently
unhappy and went to find something else to work on for a few hours.)

So, I'm looking for suggestions as to why two identical boxes, using the
same ports/IRQs for everything (determined by comparing each machine's
/kern/msgbuf) and running the same kernel, would not both work. The only
difference that I see at this point is that one has 1.3.2 binaries and the
other has -current binaries. Argh.
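
A minimal sketch of that port/IRQ comparison, assuming each machine's
/kern/msgbuf has been saved to a text file first. The function names, the
regular expressions, and the sample lines are all hypothetical; real
autoconfiguration messages vary by driver, so the patterns may need
adjusting, and this is not necessarily how the comparison above was done.

```python
import re

# Assumed line shapes: a device name such as "de0" at the start of the line,
# with "irq N" and/or "port 0xNNN" somewhere after it.
DEVICE_RE = re.compile(r"^([a-z]+\d+)\b")
IRQ_RE = re.compile(r"\birq (\d+)")
PORT_RE = re.compile(r"\bport (0x[0-9a-fA-F]+)")


def resources(msgbuf_text):
    """Map each device name to the set of ("irq"/"port", value) pairs seen."""
    found = {}
    for line in msgbuf_text.splitlines():
        dev = DEVICE_RE.match(line)
        if not dev:
            continue  # not a device line (banner, memory sizes, and so on)
        entry = found.setdefault(dev.group(1), set())
        for irq in IRQ_RE.findall(line):
            entry.add(("irq", irq))
        for port in PORT_RE.findall(line):
            entry.add(("port", port.lower()))
    # Drop devices that never reported an IRQ or a port.
    return {name: res for name, res in found.items() if res}


def differences(a_text, b_text):
    """Return {device: (resources_in_a, resources_in_b)} where they differ."""
    a, b = resources(a_text), resources(b_text)
    diffs = {}
    for name in sorted(set(a) | set(b)):
        ra, rb = a.get(name, set()), b.get(name, set())
        if ra != rb:
            diffs[name] = (sorted(ra), sorted(rb))
    return diffs


# Made-up sample lines for illustration only, not taken from either machine.
SAMPLE_A = "de0 at pci0 dev 10 function 0\nde0: interrupting at irq 10\n"
SAMPLE_B = "de0 at pci0 dev 10 function 0\nde0: interrupting at irq 11\n"

for name, (left, right) in differences(SAMPLE_A, SAMPLE_B).items():
    print(name, left, right)
```

An empty result from differences() only says the two boots reported the same
IRQ and port assignments; it says nothing about whether interrupts are
actually being delivered.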

-Matt Knopp
 mhat@fundsxpress.com