Subject: Re: bpf/pcap performance
To: Darren Reed <darrenr@reed.wattle.id.au>
From: Guy Harris <guy@alum.mit.edu>
List: tech-net
Date: 04/09/2004 15:09:32
On Apr 9, 2004, at 2:30 PM, Darren Reed wrote:

> In some email I received from Guy Harris, sie wrote:
>>> * the application is threaded, one thread uses select over all the
>>>   NICs so it knows when to read data from BPF, the other writes to
>>>   disk.
>>
>> The original BPF implementation didn't correctly support "select()" on
>> BPF devices if you had a timeout on the device - "select()" wouldn't
>> consider the BPF device readable until the hold buffer was non-empty,
>> but the store buffer wasn't rotated into the hold buffer until it
>> filled up, so "select()" would wait until the store buffer filled.
>>
>> FreeBSD fixed that somewhere in the 4.x timeframe, and I *think*
>> OpenBSD also has it fixed; NetBSD still doesn't have it fixed, as far
>> as I know.
>
> Ok, I went looking.  I think the bug you are talking about here relates
> to bpfread() ?

No, it relates to "bpfpoll()".
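
To make the select issue concrete, here is a toy model in C of the readability test described in the quoted text above.  It is only a sketch: "struct bpf_model" and both function names are made up for illustration, and neither function is the actual "bpfpoll()" from any of the BSDs.  The second function assumes that the fix amounts to also reporting the device readable once the read timeout has expired and the store buffer holds data.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model of a BPF descriptor; not a kernel structure. */
struct bpf_model {
    size_t hold_len;   /* bytes in the hold buffer (what read() hands back) */
    size_t store_len;  /* bytes in the store buffer (still being filled) */
    bool   timed_out;  /* has the read timeout expired since the last read? */
};

/*
 * Behaviour described for the original BPF: the device is readable only
 * when the hold buffer is non-empty, so data sitting in the store buffer
 * is invisible to select() until that buffer fills and is rotated.
 */
static bool readable_original(const struct bpf_model *d)
{
    assert(d != NULL);
    return d->hold_len != 0;
}

/*
 * Assumed shape of a fix: also report readable when the timeout has
 * expired and the store buffer has something in it.
 */
static bool readable_with_timeout(const struct bpf_model *d)
{
    assert(d != NULL);
    return d->hold_len != 0 || (d->timed_out && d->store_len != 0);
}
```

With an empty hold buffer, 100 bytes in the store buffer, and an expired timeout, "readable_original()" returns false while "readable_with_timeout()" returns true, which is the difference a select()-driven capture thread would see.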

>   NetBSD has:
>
>         while (d->bd_hbuf == 0) {
>                 if (d->bd_immediate) {
>                         if (d->bd_slen == 0) {
>                                 splx(s);
>                                 return (EWOULDBLOCK);
>                         }
>
> FreeBSD:
>         while (d->bd_hbuf == 0) {
>                 if ((d->bd_immediate || timed_out) && d->bd_slen != 0) {
> OpenBSD:
>         while (d->bd_hbuf == 0) {
>                 if (d->bd_immediate && d->bd_slen != 0) {

Those differences are unrelated to the select issue.  It *looks* as if 
the NetBSD one would cause a read in immediate mode *never* to block, 
regardless of whether the file descriptor is in non-blocking mode or 
not.  That was introduced in change 1.36; that change says "Sync with 
bpf-1.2a1", but the bpf-1.2a1 from LBL NRG 
(ftp://ftp.ee.lbl.gov/bpf.tar.Z) doesn't do that - it does

         while (d->bd_hbuf == 0) {
#ifndef hp300
#if BSD < 199103
                 if (uio->uio_fmode & (FNONBIO|FNDELAY))
#else
                 if (ioflag & IO_NDELAY)
#endif
                 {
                         if (d->bd_slen == 0) {
                                 splx(s);
                                 return (EWOULDBLOCK);
                         }
                         ROTATE_BUFFERS(d);
                         break;
                 }
#endif
                 if (d->bd_immediate && d->bd_slen != 0) {
                         /*
                          * A packet(s) either arrived since the previous
                          * read or arrived while we were asleep.
                          * Rotate the buffers and return what's here.
                          */
                         ROTATE_BUFFERS(d);
                         break;
                 }
                 error = BPF_SLEEP((caddr_t)d, PRINET|PCATCH, "bpf",
                                   d->bd_rtout);

See kern/8674 "bpf ioctl BIOCIMMEDIATE doesn't behave as it should" - 
the description says

         The BIOCIMMEDIATE bpf ioctl() basically behaves just like a
         non-blocking open.  This doesn't match what the man page says,
         and certainly doesn't match what you would expect.  According
         to the man page, in immediate mode read()'s will still block in
         this mode, they will just return immediately on any received
         packet.

so the NetBSD code should probably be changed to match the FreeBSD code 
- which is what your patch does, so it would, I suspect, fix kern/8674 
as well.
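
The three quoted read-loop variants can be compared with a small C model of what happens on one pass through the loop when the hold buffer is empty.  This is a sketch under stated assumptions: the enum, the struct and the function names are invented here, the quoted fragments are truncated, and the model assumes that each truncated branch goes on to rotate the buffers and return, and that every other case falls through to the sleep.  Non-blocking mode is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* What one pass through the read loop does (hold buffer assumed empty). */
enum read_action { ROTATE_AND_RETURN, RETURN_EWOULDBLOCK, SLEEP };

/* Hypothetical snapshot of the fields the quoted conditions test. */
struct read_state {
    bool   immediate;  /* stands in for d->bd_immediate */
    bool   timed_out;  /* stands in for FreeBSD's timed_out */
    size_t store_len;  /* stands in for d->bd_slen */
};

/* NetBSD fragment: immediate mode with an empty store buffer fails at once. */
static enum read_action netbsd_step(const struct read_state *s)
{
    assert(s != NULL);
    if (s->immediate) {
        if (s->store_len == 0)
            return RETURN_EWOULDBLOCK;
        return ROTATE_AND_RETURN;   /* assumed: rest of branch not quoted */
    }
    return SLEEP;                   /* assumed: rest of loop not quoted */
}

/* FreeBSD fragment: rotate if (immediate or timed out) and data is stored. */
static enum read_action freebsd_step(const struct read_state *s)
{
    assert(s != NULL);
    if ((s->immediate || s->timed_out) && s->store_len != 0)
        return ROTATE_AND_RETURN;
    return SLEEP;                   /* assumed: rest of loop not quoted */
}

/* OpenBSD fragment: rotate only in immediate mode with data stored. */
static enum read_action openbsd_step(const struct read_state *s)
{
    assert(s != NULL);
    if (s->immediate && s->store_len != 0)
        return ROTATE_AND_RETURN;
    return SLEEP;                   /* assumed: rest of loop not quoted */
}
```

For a state with immediate mode on and nothing in the store buffer, "netbsd_step()" yields RETURN_EWOULDBLOCK while the other two yield SLEEP, which is the behaviour kern/8674 complains about.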

> FreeBSD also has a bunch of other changes with the use of callouts,
> that according to the commit comment, relate to threads:
> http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/net/bpf.c
> - search for rev 1.86.

Those are the changes to make "select()" work correctly; the checkin 
comment is

	Make bpf's read timeout feature work more correctly with
	select/poll, and therefore with pthreads.

"...therefore with pthreads" means that with the old userland 
"select()"/"poll()"-based pthreads, if BPF doesn't work correctly with 
"select()" or "poll()", a threaded program won't work correctly with 
BPF.  (Presumably, with a pthreads library using KSEs in FreeBSD or 
scheduler activations in NetBSD, that problem won't occur.)