Subject: Re: /dev/tap and tcpdump don't go together very well? [conclusion & diff]
To: Quentin Garnier <cube@cubidou.net>
From: Rhialto <rhialto@falu.nl>
List: current-users
Date: 04/08/2007 23:28:03
On Sun 08 Apr 2007 at 22:39:27 +0200, Quentin Garnier wrote:
> I'm not sure it actually makes sense to do that.  The philosophy of
> tap(4) is to have a virtual Ethernet device.  If it is opened by a
> process, then said process is the backend of the interface;  it's up to
> it to tell whether it wants frames or not, I don't see why the kernel
> would have to outsmart it.  Granted, explicitly opening the device only
> for writing can be considered enough of a hint, but it still feels a bit
> hackish.

I agree that there should be a better way for unclaimed packets to be
dropped eventually, but I'm not sure what it would be. This was my first
idea. At least it avoids complexity like using timers etc.

> If you're going to use bpf(4) to read packets coming "out" the
> interface, then why not using bpf(4) to write "incoming" frames?

The problem with sending frames through bpf(4)[1] is that although they
get sent out on the wire, the local host does not see them.
Asymmetrically, packets that are sent by the local host *are* copied
into bpf(4) (this is optional and on by default, but undocumented).
In a way this makes sense: being a packet filter, bpf(4) is meant to
see packets that are on the wire, and sending packets out does not
really fit its philosophy.

But this gives the weird effect that the emulator can successfully
communicate with the whole world, *except* the host it is running on,
and that in one direction only.

So to fill this 1/4 gap, I added some optional code to KLH to use tap(4)
when available. I might bite the bullet and add proper tap(4) +
bridge(4) code, but it would be nice ("orthogonal") if corner cases like
this worked too.

On the other hand, bpf(4) could get fixed too. I looked at it briefly,
but ether_input() consumes the packet, so it was not just a matter of
adding one call. It probably needs two calls, since the packet has to
be copied too.

Whatever the fix will turn out to be, SIMH (used for VAX emulation) will
also benefit from it, and hence the VAX port.

[1] In this emulator, as in others I have seen, bpf(4) is attached to
the real interface, of course, not to the tap(4) interface.

> In that case, the API that you're missing is something that does "create
> a tap interface and tell me its unit, but don't open it".  Or, simpler,
> a way of closing /dev/tap (lack of unit number is important) without
> destroying the device.

For sockets there is shutdown(2); maybe the idea could be extended to
other types of file descriptors.
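
To make the shutdown(2) comparison concrete, here is a minimal sketch in
C of a half-close on a socket pair. It uses only socketpair(2),
shutdown(2), read(2) and write(2). The function name half_close_demo()
is made up for illustration, and nothing equivalent exists for tap(4)
as far as I know.

```c
#include <assert.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Half-close one end of a socket pair with shutdown(2) and check that
 * the descriptor stays usable in the other direction.
 * Returns 0 if everything behaved as described, -1 otherwise.
 * half_close_demo() is a hypothetical name for this sketch;
 * a caller could check it with e.g. assert(half_close_demo() == 0).
 */
int
half_close_demo(void)
{
	int sv[2];
	char buf[8];
	int ok = -1;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return -1;

	/* Give up the sending direction of sv[0]; the fd itself stays open. */
	if (shutdown(sv[0], SHUT_WR) == -1)
		goto out;

	/* The peer now sees end-of-file instead of waiting for data. */
	if (read(sv[1], buf, sizeof(buf)) != 0)
		goto out;

	/* The opposite direction still works: sv[1] -> sv[0]. */
	if (write(sv[1], "ping", 4) != 4)
		goto out;
	if (read(sv[0], buf, sizeof(buf)) != 4)
		goto out;

	ok = 0;
out:
	close(sv[0]);
	close(sv[1]);
	return ok;
}
```

A tap(4) analogue would presumably let a process give up one direction
of the descriptor without destroying the interface, but that is
speculation on my part, not an existing API.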

> > - Maybe tap should silently never set the IFF_PROMISC flag since it
> >   already is, and setting it has unexpected side effects.
> >   See diff, but untested.
> 
> The fact that it doesn't affect its operation doesn't mean it will
> always be the case;  you could imagine having a filter in the write
> handler...  It might actually provide a way to test multicast stuff or
> whatnot.  The initial idea is that for all intents and purposes, it is
> an Ethernet device.

In such a case, the manual should also mention things like size checks,
which are currently explicitly not done.

> Quentin Garnier - cube@cubidou.net - cube@NetBSD.org
-Olaf.
-- 
___ Olaf 'Rhialto' Seibert      -- You author it, and I'll reader it.
\X/ rhialto/at/xs4all.nl        -- Cetero censeo "authored" delendum esse.