Re: Improving the data supplied by BPF

To: darrenr%netbsd.org@localhost
Subject: Re: Improving the data supplied by BPF
From: "Arnaud Lacombe" <lacombar%gmail.com@localhost>
Date: Thu, 25 Dec 2008 15:58:54 -0500

Hi,

On Thu, Dec 25, 2008 at 11:35 AM, Darren Reed <darrenr%netbsd.org@localhost> 
wrote:
> Recently I've talked with a few different folks about packet capture
> and have become aware of some of the problems that people face when
> trying to use BPF vs other propritary solutions that exist. While it
> may be possible to capture data at a good rate with BPF, there is
> important meta data that isn't provided.
>
Could you detail what BPF is missing compared to those proprietary
solutions? What can a heavy tcpdump user expect from this compared to
the current BPF?

> This set of diffs attempts to address that by introducing a new BPF
>
Maybe your changes would be clearer if you provided only the diff to
BPF itself (about 10% of the whole diff), along with a sample use case.
Everything else is just API changes.

> The purpose of the sequence number is to provide the rolling counter
> of the packets captured for the one in question. Thus if in successive
> reads the count went from 2 to 5, you know 3 packets have been missed.
>
What if the count goes from 3 to... 3, i.e. the sequence number wrapped
around (for whatever reason)?
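
To make the concern concrete, here is a minimal sketch of how a reader
of the capture might compute the advance between two successive
sequence numbers. The helper name ebpf_seq_delta() is hypothetical and
not part of the proposed diff; it only assumes a 32-bit counter that
wraps modulo 2^32, as ebr_seqno would.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not from the proposed patch): how far a 32-bit
 * sequence counter advanced between two successive reads.
 *
 * The subtraction is done modulo 2^32, so a single wrap between the
 * two reads still yields the right answer.  An advance of exactly
 * 2^32 (or any multiple of it) yields 0 and cannot be told apart from
 * "nothing happened".
 */
static uint32_t
ebpf_seq_delta(uint32_t prev, uint32_t cur)
{
	return (uint32_t)(cur - prev);
}
```

With this, ebpf_seq_delta(2, 5) gives 3, matching the example quoted
above, and a read pair that straddles the wrap (0xfffffffe then 1)
also gives 3. The "3 to 3" case gives 0, which is exactly the
ambiguity: the counter alone cannot distinguish no advance from a full
wrap.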

> /*
>  * Enhanced BPF packet record structure
>  */
> typedef struct ebpf_rec_s {
>       uint64_t        ebr_secs;       /* No more Y2k38 problem */
Why unsigned? Currently `tv_sec' is signed. Why not use time_t?
There will be an obvious ABI breakage when we switch to a 64-bit time_t,
but it is still a better type than a raw integer. The breakage is a
separate problem and should be dealt with separately.

>       uint32_t        ebr_nsecs;
>
Why do you want nanosecond precision if you are getting your information
from a microsecond-precision variable? There is no information gain
there, and your code reflects this (i.e. you just "* 1000" to get the
nanosecond value from the microsecond value).

This field would be meaningful if you changed the call to microtime()
into nanotime() in bpf_tap()/bpf_deliver() and built a homegrown
`struct timeval' for the non-extended capture format. You do not lose
any precision in that case.
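
As a rough userland sketch of the conversion described above: take the
timestamp once with nanosecond resolution, keep it as-is for the
extended record, and derive the legacy `struct timeval' from it by
truncating to microseconds. The function name ts_to_tv() is made up
for illustration; in the kernel the timestamp would come from
nanotime() rather than being passed in.

```c
#include <sys/time.h>
#include <time.h>

/*
 * Hypothetical helper: derive a microsecond-resolution timeval from a
 * nanosecond-resolution timespec.  The sub-microsecond part is
 * truncated, so the legacy format sees the same resolution it would
 * have had from a microsecond clock, while the timespec keeps the
 * full value for the extended format.
 */
static struct timeval
ts_to_tv(const struct timespec *ts)
{
	struct timeval tv;

	tv.tv_sec = ts->tv_sec;
	tv.tv_usec = ts->tv_nsec / 1000;	/* ns -> us, truncating */
	return tv;
}
```

Going in this direction loses nothing for the old format, whereas
multiplying a microsecond value by 1000 only pads the nanosecond field
with zeroes.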

By the way, why not just use a `struct timespec'?

>       uint32_t        ebr_seqno;      /* sequence number in capture */
How do you detect a wrap of the sequence number?

As we have timestamps, they could be used to order sequence numbers, as
is done with TCP's PAWS, I guess.
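
One possible shape for this, loosely in the spirit of PAWS and only a
sketch: order two records by timestamp first, and fall back to a
wrap-aware comparison of the sequence numbers when the timestamps are
equal. The struct rec_key and rec_key_cmp() below are hypothetical
names; the fields merely mirror ebr_secs, ebr_nsecs and ebr_seqno.

```c
#include <stdint.h>

/* Hypothetical ordering key, mirroring ebr_secs/ebr_nsecs/ebr_seqno. */
struct rec_key {
	uint64_t secs;
	uint32_t nsecs;
	uint32_t seqno;
};

/*
 * Returns <0 if a is older than b, 0 if they are equal, >0 if newer.
 * The timestamp decides first; the sequence number is only a
 * tie-breaker and is compared with serial-number arithmetic, which is
 * valid while the two values are less than 2^31 apart.  The cast to
 * int32_t assumes the usual two's complement conversion.
 */
static int
rec_key_cmp(const struct rec_key *a, const struct rec_key *b)
{
	int32_t d;

	if (a->secs != b->secs)
		return a->secs < b->secs ? -1 : 1;
	if (a->nsecs != b->nsecs)
		return a->nsecs < b->nsecs ? -1 : 1;
	d = (int32_t)(a->seqno - b->seqno);
	return (d > 0) - (d < 0);
}
```

This does not help if the clock steps backwards, and it assumes the
timestamp resolution is fine enough that the counter cannot advance by
2^31 within one tick.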

>       uint32_t        ebr_flags;
>       uint32_t        ebr_rlen;       /* 16 bits is not enough for
> IPv6   */
>       uint32_t        ebr_wlen;       /* Jumbograms, so we have to
> use    */
>       uint32_t        ebr_clen;       /* 32 bits to represent all
> lengths */
>       uint32_t        ebr_pktoff;
>       uint16_t        ebr_type;       /* DLT_* type */
>       uint16_t        ebr_subtype;
> } ebpf_rec_t;
>
> /*
>  * rlen = total record length (header + packet)
>  * wlen = wire length of packet
>  * clen = captured length of packet
>  * pktoff = offset from ebr_secs to the start of the packet data (may not be
>  *          the same as sizeof(ebr_rec_t))
>  *
>  * flags are asa below:
s/asa/as/ :)

>  */
> #define        EBPF_OUT                0x00000001      /* Transmitted
> packet */
>
I guess there will also be an EBPF_IN; do you foresee any other possible
flags?

Many thanks,

 - Arnaud
