tech-kern archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: Lua in-kernel (lbuf library)



On Tue, Oct 15, 2013 at 06:01:29PM -0300, Lourival Vieira Neto wrote:
> > Also, having to switch mentally between zero-based arrays in the kernel C
> > code and 1-based arrays in the Lua code make my head ache.
> It's something that doesn't bug me so much.. But, if necessary it
> could be changed to 0-based in this userdata.

When you create your own data structures, I guess it is a wash.  You have
to adjust +/-1 in infrequent circumstances in either scenario.

But in this case you are creating a special purpose language that operates
in universe of zero-based array.  And that's not only the kernel code.
Every Internet protocol specification that I remember is using zero-based
indexing.  For someone dealing with both sides (the world and your lua
library), it makes the difference between constantly having to be alert
to remember to do the offset adjustment.  That is a lot more mental work
for anyone working with this library.

If you use 1-based indices talking to protocol people will be funny too:

``Anyone know why the flags in byte 6 of this packet are funny?''
``Sure, that's most likely because the flags are in byte 5.''

I think it is worth thinking hard about this.

From a cursory reading of the Wireshark Lua API, it seems to me they are
using 0-based indices too.

> > I what is "buffer" and how does it relate to mbufs?
> Note, non-contiguous buffer still an open problem in lbuf. I don't
> know if should use a ptrdiff_t  to pass the distance to 'next' field,
> a 'next()' to return 'next' field or something else.

You seem to be talking about implementation.  I was talking about the
interface of the library.

> However, you could create a lbuf from a mbuf header as follows:
>   lbuf_new(L, mbuf->m_data, mbuf->m_len, NULL, true);

I don't think that is a good way.  You say you want to inspect packet
data in the kernel.  Well, the packet's data can be spread over a
chain of mbufs.  Also, mbufs may have internal or externel storage.
You don't want to deal with that as the user of your library.

As a user, I want an interface like this:
  lbuf_from_mbuf(L, mbuf, NULL, true);

That would make the contents of the mbuf chain starting at "mbuf"
available in an array-of-bytes like fashion.  Length isn't needed as
it is computed from the mbuf chain.

> Yes, mea culpa =(. I wasn't clear about that. 'net' flag was the way I
> found to 'record' the buffer endianness. What means, true if the
> buffer uses BE and false if it uses HE. It has the same semantics of
> hton* and ntoh* functions. Don't know if it is better to pass the
> endianness itself as a flag (e.g., enum { BIG_ENDIAN, LITTLE_ENDIAN,
> HOST_ENDIAN }). What do you think?

For me the the most convenient interface would be if I didn't have to
mention the host byteorder.  Just record what byteorder the buffer is in,
and convert when appropriate.  Alan made a good point.  It maybe be
convenient and/or necessary to specify a different byteorder in a mask.

>   buf:rawget(0, 9) ~> if net flag is *true*: takes 16 bits from

You know, I think rawget is badly named.  "Raw" implies unmodified.
And byteswapping is a form of modification.  Maybe you can find a better
name.

> >> buf:mask(3)
> >> buf[3] ~> accesses 3 bits from bit-6 position
> >
> > What does that mean? Does it return the top-most 2 bits from the first
> > byte plus the least significant bit fom the second byte of the buffer?
> 
> It means the least-most 2 bits from the first byte and the LSB from the 
> second.

I don't know what "least-most" means.  But since you don't seem to
agree with my questions I assume you intend that to be the opposite of
"top-most" then your statement doesn't make sense to me.  Translated
into normal terms it would return bits 0 and 1 of the first byte and bit
0 of the second one.  They are not even contigous.

> > What is 'length' for?
> 
> Offset and length could be used to impose boundaries to the mask.

Don't tell it to me.  Write it into the documentation. :-)

> >> buf:mask{ 2, 2, 32, 9 }
> >> buf[2] ~> accesses 2 bits from bit-2 position
> >
> > What exactly would "buf[3]" return.  Please be explicit in whether you are
> > counting byte offsets or bit offsets.  I can't figure that out from your
> > description.
> 
> It would return 32 bits (converted or not, depending on 'net' flag)
> from bit-4 (MSB-ordered).

From "bit-4"? From your own use of 1-based indices don't you mean the
5th bit?

> > Personally, the idea of making array access to the buffer depend on
> > state stored in the buffer does not look appealing to me.  It prevents
> > buffers to be passed around because consumers don't know what they will
> > get back on array access.
> 
> I think it could be useful to access nonaligned and aligned data
> easily without caring about naming fields.

I guess I did not make myself understandable.  Storing that state globally
means that every function you hand an lbuf too has to set the global
mask before it accesses the lbuf, because it can't know if the current
mask is compatible with its own use.  And after call to a function, the
caller has to reset the mask, because it could have been changed.

so you end up with code like:

  function filter1(packet)
    packet:mask(whatever)
    ...
    filter2(packet)
    packet:mask(whatever)  -- restore our mask
    ...
    filter3(packet)
    packet:mask(whatever)  -- restore our mask
    ...
  end

  function filter2(packet)
    packet:mask(something_else)
    ...
    filter4(packet)
    packet:mask(something_else)
    ...
  end

  function filter3(packet)
    packet:mask(entirely_different)
    ...
  end

That, of course, is completely idiotic.

The newbuf = buf:mask() functionalty completely avoids that problem and is
sufficient.  I would provide only that interface.

--chris


Home | Main Index | Thread Index | Old Index