Re: kern/53962: npf: weird 'stateful' behavior

To: kern-bug-people%netbsd.org@localhost,gnats-admin%netbsd.org@localhost,netbsd-bugs%netbsd.org@localhost,fstd.lkml%gmail.com@localhost
Subject: Re: kern/53962: npf: weird 'stateful' behavior
From: fstd.lkml%gmail.com@localhost
Date: Sun, 10 Feb 2019 02:05:01 +0000 (UTC)

The following reply was made to PR kern/53962; it has been noted by GNATS.

From: fstd.lkml%gmail.com@localhost
To: gnats-bugs%netbsd.org@localhost
Cc: tech-net%netbsd.org@localhost, rmind%netbsd.org@localhost
Subject: Re: kern/53962: npf: weird 'stateful' behavior
Date: Sun, 10 Feb 2019 03:03:14 +0100

 > npf.conf:
 > | procedure "log" {
 > |         log: npflog0
 > | }
 > | 
 > | group "net1" on wm1 {
 > |         pass stateful in final proto tcp from 192.168.1.0/24 to 192.168.2.0/24 port 22 apply "log"
 > |         block all apply "log"
 > | }
 > | 
 > | group "net2" on wm2 {
 > |         pass stateful out final proto tcp from 192.168.1.0/24 to 192.168.2.13 port 22 apply "log"
 > |         block all apply "log"
 > | }
 > | 
 > | group default {
 > |         pass final on lo0 all
 > |         block all apply "log"
 > | }
 > 
 > $ nc 192.168.2.13 22  #run on 192.168.1.14
 > 
 > npflog0:
 > | 00:36:52.006564 rule 5.rules.0/0(match): pass in on wm1: 192.168.1.14.42718 > 192.168.2.13.22: Flags [S], seq 2274723816, win 29200, options [mss 1460,sackOK,TS val 561131037 ecr 0,nop,wscale 7], length 0
 > | 00:36:52.006639 rule 8.rules.0/0(match): pass out on wm2: 192.168.1.14.42718 > 192.168.2.13.22: Flags [S], seq 2274723816, win 29200, options [mss 1460,sackOK,TS val 561131037 ecr 0,nop,wscale 7], length 0
 > | 00:36:52.007299 rule 9.rules.0/0(match): block in on wm2: 192.168.2.13.22 > 192.168.1.14.42718: Flags [S.], seq 283128975, ack 2274723817, win 28960, options [mss 1460,sackOK,TS val 4155176782 ecr 561131037,nop,wscale 6], length 0
 > 
 > Now the router is blocking the SYN/ACK -- why?  The rule was 'stateful' and therefore state should've been kept, no?

 Ok what seems to be going on is twofold:

 1)

 Whenever a packet arrives, npf tries to retrieve, from the 'connection db', a connection that relates to that packet.  The key that this lookup is done with, i.e. what identifies this connection, is derived (in the case of TCP) from (src/dst port, src/dst addr, protocol). (Note that 'interface' is absent from that list.).  The result of this lookup is a 'connection' object which keeps the connection's state.  (cf. npf_conn_conkey() and connkey_setkey() in sys/net/npf/npf_conn.c)

 It follows that it's impossible to keep state on two connections that only differ by interface.  The connection objects can represent it, but the keys they'd be stored with in the connection db would collide.  (The semantics of the connection db are to refuse to insert a key that already exists, rather than overwriting it.).

 That means (looking at the above npf.conf) that an ingressing packet on wm1 will create a connection to keep state on; then upon egress of wm2 there's technically another connection to keep state on (same parameters, different interface), but the latter connection fails to be inserted into the connection db because of a key collision with the former.

 2)

 npf considers a connection to have a "direction" (i.e. the direction of the initial SYN), and essentially assumes that a "forwards" packet will only ever INgress on an interface, and a "backwards" packet will only ever Egress from an interface (or the other way around, depending on whether the SYN in- or egressed).  This assumption is obviously not true on, say, a router, where one and the same packet may ingress on one interface, and egress out on another.  The piece of code that does this is in npf_conn_ok() in sys/net/npf/npf_conn.c:

 | /*
 |  * npf_conn_ok: check if the connection is active and has the right direction.
 |  */
 | static bool
 | npf_conn_ok(const npf_conn_t *con, const int di, bool forw) //di=2 forw=1
 | {
 | 	const uint32_t flags = con->c_flags;
 | 
 | 	/* Check if connection is active and not expired. */
 | 	bool ok = (flags & (CONN_ACTIVE | CONN_EXPIRE)) == CONN_ACTIVE;
 | 	if (__predict_false(!ok)) {
 | 		return false;
 | 	}
 | 
 all good until here, but now...: ('di' is direction, flags is either 1 (ingress) or 2 (egress), PFIL_ALL is 3)
 | 	/* Check if the direction is consistent */
 | 	bool pforw = (flags & PFIL_ALL) == (unsigned)di;
 | 	if (__predict_false(forw != pforw)) {
 | 		return false;
 | 	}
 | 	return true;
 | }

 When commenting out that last check, the connection is later still discarded for having the wrong interface (as it should).  But since the connection is tied to a particular interface, that interface should've been part of the key to the connection db in the first place.  However I realize that if that were the cae, it'd be difficult to implement interface-agnostic state ("stateful-ends").

 Speaking of stateful-ends, when the direction-check is commented out, a single 'stateful-ends' ingress rule gives me exactly the good old ipf "keep state" behavior (if the packet is accepted into the filter, it's implicitly permitted out of the filter, on whatever interface).  So that's a workaround I can live with for now, although of course I'm not entirely sure of the purpose of this direction check, or the consequences of removing it.

 Any insights?

Prev by Date: Re: port-xen/53965: XEN DomU fails to poweroff in new HEAD kernels
Next by Date: port-arm/53967: CURRENT crashes on Raspberry Pi 3
Previous by Thread: Re: kern/53962: npf: weird 'stateful' behavior
Next by Thread: Re: kern/53962: npf: weird 'stateful' behavior
Indexes:

Home | Main Index | Thread Index | Old Index