NetBSD-Bugs archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: kern/52643: npfctl reload on rump kernels fails randomly



On Mon, Oct 23, 2017 at 4:45 PM,  <ozaki-r%netbsd.org@localhost> wrote:
>>Number:         52643
>>Category:       kern
>>Synopsis:       npfctl reload on rump kernels fails randomly
>>Confidential:   no
>>Severity:       serious
>>Priority:       medium
>>Responsible:    kern-bug-people
>>State:          open
>>Class:          sw-bug
>>Submitter-Id:   net
>>Arrival-Date:   Mon Oct 23 07:45:00 +0000 2017
>>Originator:     Ryota Ozaki
>>Release:        -current (and maybe earlier releases)
>>Organization:
>>Environment:
> NetBSD rangeley 8.99.3 NetBSD 8.99.3 (RANGELEY) #77: Wed Oct  4 10:23:38 JST 2017  ozaki-r@rangeley:(hidden) amd64
>>Description:
> npfctl reload fails sometimes on rump kernels with EIO like this:
>
>   tc-so:Executing command [ env LD_PRELOAD=/usr/lib/librumphijack.so RUMPHIJACK=path=/rump,socket=all:nolocal,sysctl=yes,blanket=/dev/npf npfctl reload ./npf.conf ]
>   tc-se:Fail: incorrect exit status: 1, expected: 0
>   tc-se:stdout:
>   tc-se:
>   tc-se:stderr:
>   tc-se:npfctl: source  line 333501697
>   tc-se:npfctl: object: 535
>   tc-se:npfctl: npfctl_config_send: Input/output error
>   tc-se:
>
> The error message is generated by libnpf that uses libprop.
>
> npfctl uses npf_config_submit of libnpf to call ioctl to the kernel.
> libnpf uses prop_dictionary_sendrecv_ioctl to call ioctl and
> get an errno of the ioctl by prop_dictionary_get_int32.
> On npfctl reload on rump kernels, prop_dictionary_sendrecv_ioctl succeeds
> but prop_dictionary_get_int32 fails because errno isn't set correctly.
>
> It seems that npfctl_load that is a kernel function of npf and called
> via the prop_dictionary_sendrecv_ioctl doesn't set the errno to the return
> value of libprop: https://nxr.netbsd.org/xref/src/sys/net/npf/npf_ctl.c#635
>
>           /* Error report. */
>   #if !defined(_NPF_TESTING) && !defined(_NPF_STANDALONE)
>           prop_dictionary_set_int32(errdict, "errno", error);
>           prop_dictionary_copyout_ioctl(pref, cmd, errdict);
>           prop_object_release(errdict);
>           error = 0;
>   #endif
>           return error;
>
> The code setting an errno is trimmed on rump kernels since by default
> rump kernels set _NPF_TESTING. If we enable the code, npfctl reload
> works expectedly.
>
> Note that npfctl reload sometimes succeeds, I guess, because npf_config_submit
> tries to read an errno from a uninitialized memory and it sometimes works luckily.
>>How-To-Repeat:
> # From an ATF test script that is under development
>
> rump_server -lrumpnet -lrumpnet_net -lrumpnet_netinet -lrumpnet_shmif -lrumpdev -lrumpvfs -lrumpdev_bpf -lrumpnet_bpfjit -lrumpnet_npf -lrumpnet_netinet6 unix://npf_nat_local
>
> export RUMP_SERVER=unix://npf_nat_local
> rump.ifconfig shmif0 create
> rump.ifconfig shmif0 linkstr ./bus_npf_local
> rump.ifconfig shmif1 create
> rump.ifconfig shmif1 linkstr ./bus_npf_nat
>
> rump.sysctl -q -w net.inet.ip.dad_count=0
> rump.ifconfig shmif0 10.0.1.1/24
> rump.ifconfig shmif1 20.0.0.1/24
> rump.sysctl -q -w net.inet.ip.forwarding=1
> rump.route -n add -net 10.0.2.0 20.0.0.2
>
> cat > ./npf.conf <<-EOF
> set bpf.jit off
> \$int_if = inet4(shmif0)
> \$ext_if = inet4(shmif1)
> \$localnet = { 10.0.1.0/24 }
> map \$ext_if dynamic \$localnet -> \$ext_if
> group "external" on \$ext_if {
>         pass stateful out final all
> }
> group "internal" on \$int_if {
>         block in all
>         pass in final from \$localnet
>         pass out final all
> }
> group default {
>         pass final on lo0 all
>         block all
> }
> EOF
>
> env LD_PRELOAD=/usr/lib/librumphijack.so RUMPHIJACK=path=/rump,socket=all:nolocal,sysctl=yes,blanket=/dev/npf npfctl reload ./npf.conf
>>Fix:
> As described above, enabling the code fixes the issue, but I'm not sure
> if the fix is appropriate.
>
> diff --git a/sys/net/npf/npf_ctl.c b/sys/net/npf/npf_ctl.c
> index b1af2d347ac..43105db9c85 100644
> --- a/sys/net/npf/npf_ctl.c
> +++ b/sys/net/npf/npf_ctl.c
> @@ -633,12 +633,12 @@ fail:
>         prop_object_release(npf_dict);
>
>         /* Error report. */
> -#if !defined(_NPF_TESTING) && !defined(_NPF_STANDALONE)
> +//#if !defined(_NPF_TESTING) && !defined(_NPF_STANDALONE)
>         prop_dictionary_set_int32(errdict, "errno", error);
>         prop_dictionary_copyout_ioctl(pref, cmd, errdict);
>         prop_object_release(errdict);
>         error = 0;
> -#endif
> +//#endif
>         return error;
>  }
>

Further investigations found the following requirements:
- prop_object_release(errdict) must be always called
  because errdict = prop_dictionary_create() is always called
- prop_object_release(npf_dict) needs to be called only if
  npf_dict is allocated by prop_dictionary_copyin_ioctl_size
- _NPF_STANDALONE doesn't require to set prop
- For _NPF_TESTING, if npfctl_testing, setting prop isn't
  needed, otherwise it's needed

So I think a fix looks like the following patch. Comments?

  ozaki-r


--

diff --git a/sys/net/npf/npf_ctl.c b/sys/net/npf/npf_ctl.c
index b1af2d347ac..3e82db6bf73 100644
--- a/sys/net/npf/npf_ctl.c
+++ b/sys/net/npf/npf_ctl.c
@@ -630,15 +630,25 @@ fail:
        if (tblset) {
                npf_tableset_destroy(tblset);
        }
-       prop_object_release(npf_dict);
+#if defined(_NPF_TESTING) || defined(_NPF_STANDALONE)
+       if (!npfctl_testing)
+#endif
+               prop_object_release(npf_dict);

-       /* Error report. */
-#if !defined(_NPF_TESTING) && !defined(_NPF_STANDALONE)
-       prop_dictionary_set_int32(errdict, "errno", error);
-       prop_dictionary_copyout_ioctl(pref, cmd, errdict);
+#ifndef _NPF_STANDALONE
+#ifdef _NPF_TESTING
+       if (!npfctl_testing) {
+#endif
+               /* Error report. */
+               prop_dictionary_set_int32(errdict, "errno", error);
+               prop_dictionary_copyout_ioctl(pref, cmd, errdict);
+               error = 0;
+#ifdef _NPF_TESTING
+       }
+#endif
+#endif /* _NPF_STANDALONE */
        prop_object_release(errdict);
-       error = 0;
-#endif
+
        return error;
 }


Home | Main Index | Thread Index | Old Index