Subject: kern/35247: pf drops packets on connections with high window scaling
To: None <kern-bug-people@netbsd.org, gnats-admin@netbsd.org,>
From: None <perry@piermont.com>
List: netbsd-bugs
Date: 12/13/2006 16:30:00
>Number:         35247
>Category:       kern
>Synopsis:       pf drops packets on connections with high window scaling
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed Dec 13 16:30:00 +0000 2006
>Originator:     Perry E. Metzger
>Release:        NetBSD 4.99.3
>Organization:
Perry E. Metzger		perry@piermont.com
--
"Ask not what your country can force other people to do for you..."
>Environment:
	
	
System: NetBSD hackworth 4.99.3 NetBSD 4.99.3 (HACKWORTH) #0: Fri Oct 27 14:05:48 EDT 2006 perry@hackworth:/usr/obj/sys/arch/i386/compile/HACKWORTH i386
Architecture: i386
Machine: i386
>Description:

This bug is a bit difficult to explain, but quite frustrating to
experience.

The bug, as the user experiences it, is that when pf is used as a
stateful firewall/nat box, connections to particular sites will
mysteriously fail to work. (At the moment, en.wikipedia.org is a good
example, but that would change if they altered the redirector box they
are using.) The connections are either amazingly slow or fail
entirely, although they start okay.

Examination of the TCP session via tcpdump reveals that immediately
after the initial handshake, data packets from the other side are
being blocked as "state mismatches" -- they appear to pf to be outside
of the allowed window.
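
As a rough illustration of what "outside of the allowed window" means,
the sketch below shows the kind of bounds check a stateful filter makes
on each data segment. It is a simplified, hypothetical check, not the
code in pf.c; the function name segment_in_window and its parameters are
assumptions made for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical, simplified acceptance test; NOT the code in pf.c.
 * A segment is accepted only if its last byte falls at or below the
 * highest sequence number the receiver has advertised room for:
 *     seq + len <= peer_ack + peer_win
 * The comparison uses a signed 32-bit difference so that it keeps
 * working when sequence numbers wrap around.
 */
static bool
segment_in_window(uint32_t seq, uint32_t len, uint32_t peer_ack,
    uint32_t peer_win)
{
	uint32_t seg_end = seq + len;          /* first byte past the segment */
	uint32_t limit = peer_ack + peer_win;  /* right edge of the window */

	return (int32_t)(seg_end - limit) <= 0;
}
```

If the filter's idea of peer_win is much smaller than the window the
receiver really advertised, a full-sized segment that the receiver would
accept fails this test and is dropped, which matches the symptom seen in
the tcpdump output.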

Careful checking of the sessions reveals that the key is that this
only happens to sites that have large window scaling factors set
during the TCP handshake.

The apparent cause here is that the math inside pf.c that deals with
window scaling is flawed.

Matt Thomas worked out a partial/possible fix that was committed
last night, but it does not appear to fully alleviate the problem. In
particular, after his fix is applied, downloads from Wikipedia (whose
servers advertise a window scale of 9) work, but edits/posts of large
articles fail just as the downloads used to.

Matt noted (correctly) that window scaling must not be applied to the
window size advertised in the initial SYN packet (RFC 1323 specifies
that the window field in a SYN segment is never scaled), and patched pf
to stop that from happening. However, my suspicion is that the
calculation is more broken than that, and that something further is
going wrong with window size calculations when window scales are large.
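
The scaling rule itself can be sketched as follows. This is a minimal
example under stated assumptions, not pf's implementation: the function
name effective_window is hypothetical, and TCP_MAX_WINSHIFT is defined
locally with the value 14, the largest shift count RFC 1323 permits.

```c
#include <stdbool.h>
#include <stdint.h>

#define TCP_MAX_WINSHIFT 14	/* largest shift count permitted by RFC 1323 */

/*
 * Hypothetical helper: turn the 16-bit window field from a TCP header
 * into the window it actually represents.  wscale is the shift count
 * the sender of this segment announced in its own SYN.
 */
static uint32_t
effective_window(uint16_t raw_win, uint8_t wscale, bool is_syn)
{
	if (is_syn)
		return raw_win;		/* SYN and SYN+ACK windows are never scaled */
	if (wscale > TCP_MAX_WINSHIFT)
		wscale = TCP_MAX_WINSHIFT;	/* larger values are treated as 14 */
	return (uint32_t)raw_win << wscale;
}
```

With a window scale of 9, a window field of 128 on an ordinary segment
stands for 128 << 9 = 65536 bytes, while the same field on a SYN stands
for only 128 bytes. Any code path that shifts in the wrong case, or
stores the result in a type too narrow for it, will misjudge the window
by a wide margin once the scale is large.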

>How-To-Repeat:

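Run pf as a stateful firewall/NAT box and connect through it to a site
whose servers negotiate a large window scale (en.wikipedia.org at the
moment). Watch the session with tcpdump: data packets arriving after
the initial handshake are dropped as state mismatches.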

>Fix:
	

See above for ideas.

>Unformatted: