tech-net: Re: Bugs in PF_KEY marshalling, socket-buffer overflow

Subject: Re: Bugs in PF_KEY marshalling, socket-buffer overflow
To: Michael Richardson <mcr@sandelman.ottawa.on.ca>
From: Jonathan Stone <jonathan@dsg.stanford.edu>
List: tech-net
Date: 05/20/2004 13:13:37

In message <13067.1085060353@marajade.sandelman.ottawa.on.ca>,
Michael Richardson  writes:

[[ Linux /proc misfeatures ]]

>that the file is formatted, at the 4k boundary, there can be corruption.
>  (Linux 2.6 fixes much of the proc file system's race conditions, and
>the correct answer is to have a directory of files, one per SPD entry)

Huh? That's not a ``correct answer''. That's kludging the answer to
avoid the bugs and design flaws in the implementation. Or (to steal a
phrase from Roger Zelazny) it's like researching a disfiguring disease,
catching the disease, and then claiming that, on oneself, it looks
quite fetching.

OTOH, it's not really on-topic here, so let's drop it.

>  As you say, RFC-2367 is Informational.
>  There is a lot of interest in revising it, but no time.

Please do let me know if anyone starts, if nothing else to
make sure that the amateurish design flaws and outright bugs in
the KAME implementation are explicitly forbidden in any revised version.


[... broadcast]

>  Yes, true. But, because it is sometimes a broadcast system, it means
>that it is hard to have clear back-pressure (not impossible, just hard).

Heck, I'd settle for reliable notification that a DUMP response or an
ACQUIRE got dropped, so that the app could recover (instead of
deadlocking, as the KAME code does when a DUMP response stream is
truncated).  Yet I am led to understand that the KAME team takes
exception to using the word `bug' for such behaviour.
To paraphrase:

 ``It's widely known that KAME PF_KEY is impossible to use on systems
   with large numbers of SAs or with large numbers of policies.
   But that's not a bug; RFC-2367 says PF_KEY doesn't have to
   be reliable.''
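
For illustration only, here is a minimal sketch of how an application
might at least notice a hole in a DUMP response stream instead of
waiting forever. It assumes the KAME-style convention that
sadb_msg_seq in successive DUMP replies counts down by one and reaches
0 on the final reply; RFC-2367 itself does not promise this. The names
dump_tracker and dump_track are hypothetical, not part of libipsec or
racoon.

```c
#include <stdbool.h>
#include <stdint.h>

/* Result of feeding one DUMP reply's sequence number to the tracker. */
enum dump_status { DUMP_MORE, DUMP_DONE, DUMP_GAP };

/* Hypothetical per-dump state; zero-initialise before each SADB_DUMP. */
struct dump_tracker {
    bool     started;   /* true once the first reply has been seen */
    uint32_t next_seq;  /* sequence number expected on the next reply */
};

/*
 * Assumption: replies carry a sequence number that counts down by one
 * and is 0 on the last reply.  Any other value after the first reply
 * means at least one message was lost in between.
 */
enum dump_status dump_track(struct dump_tracker *t, uint32_t seq)
{
    if (t->started && seq != t->next_seq)
        return DUMP_GAP;        /* a reply in the middle was dropped */

    t->started = true;
    if (seq == 0)
        return DUMP_DONE;       /* final reply of the stream */

    t->next_seq = seq - 1;
    return DUMP_MORE;           /* keep reading */
}
```

A countdown check like this only catches holes in the middle of the
stream. If the tail is what gets dropped, the reply with sequence 0
never arrives, so the reader would still need a receive timeout (for
example poll() or SO_RCVTIMEO) in order to give up and retry rather
than block indefinitely.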

[...]


>  Okay. I was surprised to hear that it did that as well, but prepared
>to believe you.

Here's exactly what I get from the mutant racoon in my own tree, after
processing about 400 ACQUIREs out of a back-to-back burst of 600:

2004-05-19 14:23:14: INFO: pfkey.c:200: get pfkey ACQUIRE message
libipsec pfkey_check: bad family 0
2004-05-19 14:23:14: ERROR: pfkey.c:231: libipsec failed pfkey check
(Invalid address family)
2004-05-19 14:23:15: INFO: pfkey.c:200: get pfkey ACQUIRE message
libipsec pfkey_check: bad family 0
2004-05-19 14:23:15: ERROR: pfkey.c:231: libipsec failed pfkey check
(Invalid address family)

after which racoon sees no more of the ACQUIREs.  I have never, ever
seen these messages under lower ACQUIRE rates.  For example, the same
total number of ACQUIREs sent slowly over several minutes does not
trigger the same behaviour.  My diagnosis is that the so_rcv queue
overflowed, the last message was partially truncated, and the subsequent
ACQUIRE messages were dropped.
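
One way to test a diagnosis like this is to compare the number of
bytes actually returned by recv() with the length the message header
claims, before handing the buffer to any parser. The sketch below is
an illustration under stated assumptions, not code from racoon or
libipsec. It relies on one fact from RFC-2367: sadb_msg_len counts
64-bit words. The struct pfkey_hdr is a local mirror of struct
sadb_msg so the example builds without <net/pfkeyv2.h>, and the name
pfkey_frame_check is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of the RFC-2367 base header (struct sadb_msg). */
struct pfkey_hdr {
    uint8_t  version;
    uint8_t  type;
    uint8_t  err;
    uint8_t  satype;
    uint16_t len;       /* whole message length, in 64-bit words */
    uint16_t reserved;
    uint32_t seq;
    uint32_t pid;
};

enum pfkey_frame {
    PFKEY_FRAME_OK,            /* bytes read match the claimed length */
    PFKEY_FRAME_SHORT_HEADER,  /* fewer bytes than a base header */
    PFKEY_FRAME_BAD_LENGTH,    /* header claims less than its own size */
    PFKEY_FRAME_TRUNCATED,     /* fewer bytes read than the header claims */
    PFKEY_FRAME_TRAILING       /* more bytes read than the header claims */
};

/* Classify one datagram of nread bytes as returned by recv(). */
enum pfkey_frame pfkey_frame_check(const void *buf, size_t nread)
{
    struct pfkey_hdr h;
    size_t claimed;

    if (nread < sizeof h)
        return PFKEY_FRAME_SHORT_HEADER;

    memcpy(&h, buf, sizeof h);         /* avoid unaligned access */
    claimed = (size_t)h.len * 8;       /* 64-bit words to bytes */

    if (claimed < sizeof h)
        return PFKEY_FRAME_BAD_LENGTH;
    if (nread < claimed)
        return PFKEY_FRAME_TRUNCATED;
    if (nread > claimed)
        return PFKEY_FRAME_TRAILING;
    return PFKEY_FRAME_OK;
}
```

Logging the result for each message received during a burst would show
whether the "bad family 0" failures coincide with short reads, which
is what a truncated final message would look like from user space.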

Though (as I said) this may well turn out to be my bug. I'll know for
sure just as soon as I pinpoint it and fix it.