Subject: Re: rarp failing
To: Chris G Demetriou <Chris_G_Demetriou@ux2.sp.cs.cmu.edu>
From: VaX#n8 <vax@linkdead.paranoia.com>
List: tech-net
Date: 12/04/1996 07:15:51

In message <20584.846329228@ux2.sp.cs.cmu.edu>, Chris G Demetriou writes:
>> So is ANYBODY able to netboot?  My Sun 3/60s aren't getting replies back,
>> but tcpdump shows the replies are sent.

>I'm using current versions of rarpd and bootparamd on NetBSD/Alpha to
>netboot other NetBSD/Alpha machines.
>...
>You say that tcpdump shows the replies on the net but your Sun 3/60's
>aren't seeing them... that sounds more like a bug in either your
>configuration or in the sun3 network boot blocks.

Actually, I just got some more data points.
After a struggle I got a third machine up (floppy-based) to run tcpdump.
I also instrumented my rarpd code to show the write sizes in rarp_reply.
rarp_reply is writing 42-byte packets, as it should, to the bpf fd.
However, when they appear on the wire, the RARP replies are
four bytes longer than the requests.  That is, the replies are
64 bytes (not including the 4-byte Ethernet CRC trailer), and the
requests are 60 bytes (the Ethernet minimum, ignoring the trailer).
Of course, if I run tcpdump on the rarpd host itself it reports
packet sizes of 42, which doesn't tell me much.
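
For reference, the arithmetic behind those sizes can be sketched in C.
This is only an illustration; the constant and function names are made
up for this note and are not taken from rarpd or the kernel.  It assumes
a 14-byte Ethernet header and a 28-byte RARP body (Ethernet hardware
addresses, IPv4 protocol addresses).

```c
enum {
    ETH_HDR_BYTES    = 14,  /* dst MAC + src MAC + ethertype */
    RARP_BODY_BYTES  = 28,  /* RARP body for Ethernet + IPv4 addresses */
    ETH_MIN_NO_CRC   = 60   /* minimum frame length, CRC excluded */
};

/* Bytes rarpd hands to the bpf fd for one reply. */
static int rarp_write_len(void)
{
    return ETH_HDR_BYTES + RARP_BODY_BYTES;
}

/* Padding needed to bring a frame of len bytes up to the minimum. */
static int expected_pad(int len)
{
    return len < ETH_MIN_NO_CRC ? ETH_MIN_NO_CRC - len : 0;
}

/* Padding actually present, given the on-wire length (CRC excluded). */
static int observed_pad(int wire_len, int written_len)
{
    return wire_len - written_len;
}
```

With the numbers above, rarp_write_len() is 42, expected_pad(42) is 18,
and observed_pad(64, 42) is 22, so the replies carry 4 bytes more
padding than the minimum requires.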

It appears that the packets are padded with junk.  The dump shows
the reply ending in:
7c4c a532 6da1 0700 0809 0a0b 0c0d 0e0f 1011 1213 1415
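
A rough way to look at those 22 bytes more closely, sketched in C.  The
bytes are copied by hand from the dump; the helper names are
hypothetical, and reading the first eight bytes as two little-endian
32-bit words is an assumption, not something the dump itself says.

```c
#include <stddef.h>
#include <stdint.h>

/* The 22 trailing bytes of the reply, copied from the dump. */
static const uint8_t reply_tail[22] = {
    0x7c, 0x4c, 0xa5, 0x32, 0x6d, 0xa1, 0x07, 0x00,
    0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f,
    0x10, 0x11, 0x12, 0x13, 0x14, 0x15
};

/* Read a 32-bit word, assuming little-endian byte order. */
static uint32_t le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Length of the run at the end of buf in which buf[i] == i. */
static size_t index_run_len(const uint8_t *buf, size_t len)
{
    size_t n = 0;

    while (n < len && buf[len - 1 - n] == (uint8_t)(len - 1 - n))
        n++;
    return n;
}
```

On this data index_run_len(reply_tail, 22) is 14: every byte from
offset 8 onward equals its own offset.  le32(reply_tail) is 849693820
and le32(reply_tail + 4) is 500077.  Read as seconds and microseconds
since the Unix epoch, that would be a timestamp on 4 December 1996
(UTC), which might mean the padding is stale contents of some other
buffer, but that reading is only a guess.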

My guess is that the packet is getting hosed somewhere between the
write to the bpf fd and the Ethernet output routines.  It is probably
bpf-related, perhaps in bpf_movein(), since it seems unlikely the
problem could occur lower (say, in ether_output()) without being
substantially more pervasive.

Perhaps 0x7c4cd6a10700 is the address of something that is getting
put in the wrong place.  Or the mbuf len/pktlen stuff is simply
getting corrupted.
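
To make the len/pktlen idea concrete, here is a toy consistency check in
C.  struct toy_mbuf is a hypothetical, stripped-down stand-in for the
kernel's struct mbuf (it is not the real layout), and the function names
are invented for this sketch.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct mbuf; NOT the kernel layout. */
struct toy_mbuf {
    struct toy_mbuf *next;   /* next buffer in this packet's chain */
    int              len;    /* valid bytes in this buffer */
    int              pktlen; /* total packet length; first buffer only */
};

/* Sum the per-buffer lengths over the whole chain. */
static int chain_len(const struct toy_mbuf *m)
{
    int total = 0;

    for (; m != NULL; m = m->next)
        total += m->len;
    return total;
}

/* Nonzero when the recorded packet length matches the chain. */
static int pktlen_consistent(const struct toy_mbuf *m)
{
    return m != NULL && m->pktlen == chain_len(m);
}
```

A chain whose buffers add up to 64 bytes while the header says 42 would
fail pktlen_consistent(), which is the kind of mismatch that could put
extra bytes on the wire.  Where such a check would belong in the real
code path is something I have not verified.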

It's repeatable, and annoying... grr.... spent all night....