Subject: Re: Strange problem with bridge(4) on -CURRENT/sparc64
To: None <port-sparc64@netbsd.org, tech-net@netbsd.org>
From: Jed Davis <jldavis+netbsdlist@cs.oberlin.edu>
List: tech-net
Date: 02/01/2003 09:06:44
On Sat, Feb 01, 2003 at 11:02:53AM +0100, Martin Husemann wrote:
> On Sat, Feb 01, 2003 at 04:51:13AM -0500, Jed Davis wrote:
> > Problem 2:  So why not bridge the qe's together and see what happens
> > there?
>
> Are all the bridged interfaces SIMPLEX?  I had lossage with hme's that
> turned out to be really !SIMPLEX if running half-duplex. if_bridge.c
> only works with SIMPLEX interfaces, I think I have a PR open on
> that...

Here it is: kern/18035.  Anyway, all of the interfaces involved claim to
be SIMPLEX, but they're also all 10Base-T half-duplex, since that's all
le and qe can do.

Oh, and things have just started behaving oddly again; the only
interfaces on the bridge are two qe's, both of whose links were active
at the time. 

And the odd behavior of terminals seems to affect only zsh -- it's
almost as if they're acting line-buffered when they shouldn't be --
while sh -o emacs is fine.

Meanwhile, uptime's load averages and top's view of the process table
seem frozen, and "iostat -w 1" prints out one line (the same line, no
matter when I run it):

      tty            sd0             cpu
 tin tout  KB/t t/s MB/s  us ni sy in id
   0    1  4.09   1 0.00   0  0  0  1 99

And then stops, instead of printing a line every second.  lsof gives me this:

lsof: can't read process table: proc size mismatch (73600 total, 1128 chunks)

fstat works, but gives me output like this:

jdev     fstat      19618   wd -         -        none    -
jdev     fstat      19618    0 -         -        none    -
jdev     fstat      19618    1* pipe 0x1ec0d10 -> 0xffffffffffffffff w
jdev     fstat      19618    2 -         -        none    -
jdev     fstat      19618    3 -         -        none    -
jdev     fstat      19618    4 -         -        none    -

ps's view of the world seems correct, at least.  Meanwhile, my attempt
to pkgsrc update my zsh looks to be hanging on "checking if named FIFOs work..."
Oh, here we go, from the dmesg:

pid 18651 (conftest), uid 8472: exited on signal 11 (core dumped)

Except I don't see a core file for it, and I have the core size
unlimited.  There is another instance of that conftest hanging around,
though.  

And yet despite all this, the bridge is working fine; I've been typing
this message through it, in fact.  I think I'll leave the box like this
for a while and see what happens, although I've just nuked my shell of
choice and can't reinstall it now.

--Jed

-- 
<?xml version="1.0"?>  <?xml-stylesheet href="http://panix.com/~jdev/xs/txt.xsl"
type="text/xsl"?>   <sig name="Jed Davis">  <id dom="oberlin.edu" lp="sjld8197">
Student, 4th-Year</id><id dom="cs.oberlin.edu" lp="jldavis">CS Major and Student
SysAdmin</id><id dom="panix.com" lp="jdev">Panixer</id> <q href="bin.q"/> </sig>