tech-kern archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: RFC: ALTQ caller refactor



   Date: Mon, 7 Mar 2016 17:39:15 +0900
   From: Kengo NAKAHARA <k-nakahara%iij.ad.jp@localhost>

   On 2016/03/04 12:23, Taylor R Campbell wrote:
   > One of the questions from the thread you cited is whether you can put
   > it in the mbuf packet header instead of using an m_tag.  I have no
   > opinion on that, but it would be nice to address that question.

   Hmm, I feel it is excessive to put mbuf packet. My main intention is
   not increasing ALTQ performance but just removing pattr argument from
   IFQ_ENQUEUE. Hmm..., maybe I don't have a sense...

Fair enough!

   > One comment on the patch after a quick glance:
   > 
   >    +       altq_pattr_cache = pool_cache_init(sizeof(struct altq_pktattr)
   >    +           + sizeof(struct m_tag),
   >    +           0, 0, 0, "altqmtag", NULL, IPL_NET, NULL, NULL, NULL);
   > 
   > The addition of sizes here may not necessarily work.  In particular,
   > the pointer returned from pool_cache_get will be aligned, but adding
   > sizeof(struct altq_pktattr) to that pointer does not necessarily give
   > an aligned result:
   > 
   > 	p = pool_cache_get(altq_pattr_cache, ...);
   > 	pktattr = p;
   > 	tag = (char *)p + sizeof(struct altq_pktattr);

   Hmm..., sorry, I am confused. Just to confirm, m_tag needs metadata
   (struct m_tag) in the front of m_tag data(e.g. in this case struct
   altq_pktqttr), so both struct m_tag and sturct altq_pktattr are
   necessary, I think. It is similar to malloc(len + sizeof(struct m_tag))
   in m_tag_get().
   Could you tell me about the reason which the addition of sizes may not
   necessarily work in detail?

Suppose we had this definition of struct m_tag:

struct m_tag {
	char	foo;
};

with sizeof(struct m_tag) == 1.  Suppose we ask for
m_tag_get(sizeof(uint64_t)), which is essentially

	struct m_tag *p = malloc(sizeof(struct m_tag) + sizeof(uint64_t)).

Now p will be aligned for any kind of object.  Suppose it has 8-byte
alignment.  If we compute

	uint64_t *q = (uint64_t *)((char *)p + sizeof(struct m_tag)),

then, because p is 8-byte-aligned and sizeof(struct m_tag) == 1, we
will get (char *)p + 1, which is an unaligned pointer to uint64_t --
slow on x86 and forbidden altogether on sparc64.

If, on the other hand, we did this:

	struct mtag64 {
		struct m_tag	t;
		uint64_t	u64;
	};
	p = m_tag_get(sizeof(struct mtag64) - sizeof(struct m_tag));
	/* i.e., p = malloc(sizeof(struct mtag64)); */
	q = &((struct mtag64 *)p)->u64;
	/* or, better: q = container_of(p, mtag64, t)->u64; */

then the compiler would insert padding between t and u64 to make q an
aligned pointer to uint64_t.

It happens to be the case that our definition of struct m_tag includes
a pointer, and usually that's sufficient to cause the compiler to
insert enough padding to make the idiom

	uint64_t *q = (uint64_t *)((char *)p + sizeof(struct m_tag))

work.  The same goes for struct altq_pktattr.

Presumably that is why this has not been a problem in practice.  But
it is not hard to imagine an architecture where uint64_t requires
8-byte alignment even though pointers require only 4-byte alignment.
E.g., it wouldn't surprise me if MIPS N32 or similar required this --
pointers are 32-bit and need only 4-byte alignment but the native
64-bit words need 8-byte alignment.

In any case, whether or not you change anything about the existing
m_tag_get code or users, I suggest that you use the struct and
container_of idiom in any new code.


Home | Main Index | Thread Index | Old Index