NetBSD-Bugs archive


Re: kern/49264: vlan(4): concurrent executions of ifconfig cause a fatal page fault



With the same configuration, I got two other kinds of fatal page
faults (see the backtraces below).

In both cases, it seems that the vlan's ifnet data is used after
being freed. I can work around the issue with this patch:

diff --git a/sys/net/if_vlan.c b/sys/net/if_vlan.c
index 70a5940..d6aac2c 100644
--- a/sys/net/if_vlan.c
+++ b/sys/net/if_vlan.c
@@ -251,10 +251,10 @@ vlan_clone_destroy(struct ifnet *ifp)
        s = splnet();
        LIST_REMOVE(ifv, ifv_list);
        vlan_unconfig(ifp);
-       splx(s);

        if_detach(ifp);
        free(ifv, M_DEVBUF);
+       splx(s);

        return (0);
 }

I'm not sure if this fix is correct.

  ozaki-r

==== backtrace #1 ====
uvm_fault(0xfffffe8002fedcf8, 0x0, 1) -> e
fatal page fault in supervisor mode
trap type 6 code 0 rip ffffffff80449d73 cs 8 rflags 10246 cr2 18
ilevel 6 rsp fffffe8001c8bac8
curlwp 0xfffffe80024aa280 pid 279.1 lowest kstack 0xfffffe8001c882c0
kernel: page fault trap, code=0
Stopped in pid 279.1 (dhcpcd) at        netbsd:in6_setscope+0x10:
 movq    18(%rax),%r12
db{3}> bt
in6_setscope() at netbsd:in6_setscope+0x10
in6_control1() at netbsd:in6_control1+0x702
in6_control() at netbsd:in6_control+0x6b
udp6_ioctl_wrapper() at netbsd:udp6_ioctl_wrapper+0x32
compat_ifioctl() at netbsd:compat_ifioctl+0x116
doifioctl() at netbsd:doifioctl+0x43a
soo_ioctl() at netbsd:soo_ioctl+0x2af
sys_ioctl() at netbsd:sys_ioctl+0x17e
syscall() at netbsd:syscall+0x9a
--- syscall (number 54) ---
7f7ff74cea0a:
==== End of backtrace #1 ====

==== backtrace #2 ====
uvm_fault(0xfffffe80038adcf8, 0x0, 1) -> e
fatal page fault in supervisor mode
trap type 6 code 0 rip ffffffff804381e5 cs 8 rflags 10286 cr2 0
ilevel 4 rsp fffffe8001cb6cc8
curlwp 0xfffffe8003add6c0 pid 279.1 lowest kstack 0xfffffe8001cb32c0
kernel: page fault trap, code=0
Stopped in pid 279.1 (dhcpcd) at        netbsd:sysctl_rtable+0x1c4:
 movq    0(%rax),%rax
db{0}> bt
sysctl_rtable() at netbsd:sysctl_rtable+0x1c4
sysctl_dispatch() at netbsd:sysctl_dispatch+0xc4
sys___sysctl() at netbsd:sys___sysctl+0xd0
syscall() at netbsd:syscall+0x9a
--- syscall (number 202) ---
7f7ff74fe8ca:
==== End of backtrace #2 ====

On Fri, Oct 10, 2014 at 12:00 AM,  <ozaki-r%netbsd.org@localhost> wrote:
>>Number:         49264
>>Category:       kern
>>Synopsis:       vlan(4): concurrent executions of ifconfig cause a fatal page fault
>>Confidential:   no
>>Severity:       critical
>>Priority:       medium
>>Responsible:    kern-bug-people
>>State:          open
>>Class:          sw-bug
>>Submitter-Id:   net
>>Arrival-Date:   Thu Oct 09 15:00:00 +0000 2014
>>Originator:     Ryota Ozaki
>>Release:        current
>>Organization:
>>Environment:
> NetBSD kvm 7.99.1 NetBSD 7.99.1 (KVM) #89: Thu Oct  9 20:43:55 JST 2014  ozaki-r@(hidden):(hidden) amd64
>>Description:
> Running "ifconfig vlan0 -vlanif vioif0" and "ifconfig vlan0 destroy" in parallel under some load sometimes causes a fatal page fault:
>
>   uvm_fault(0xfffffe8002e14188, 0x0, 1) -> e
>   fatal page fault in supervisor mode
>   trap type 6 code 0 rip ffffffff8025cc44 cs 8 rflags 10246 cr2 50 ilevel 6 rsp fffffe8000bacc08
>   curlwp 0xfffffe8000d48440 pid 2376.1 lowest kstack 0xfffffe8000ba92c0
>   kernel: page fault trap, code=0
>   Stopped in pid 2376.1 (ifconfig) at     netbsd:vlan_unconfig+0x32:      cmpb    $0x6,50(%rax)
>   db{0}> bt
>   vlan_unconfig() at netbsd:vlan_unconfig+0x32
>   vlan_ioctl() at netbsd:vlan_ioctl+0x235
>   doifioctl() at netbsd:doifioctl+0x2d8
>   soo_ioctl() at netbsd:soo_ioctl+0x2af
>   sys_ioctl() at netbsd:sys_ioctl+0x17e
>   syscall() at netbsd:syscall+0x9a
>   --- syscall (number 54) ---
>   7f7ff6ccea0a:
>
> vlan_unconfig+0x32 corresponds to this line in the source code:
>
>   switch (ifv->ifv_p->if_type) {
>
> ifv->ifv_p is unexpectedly NULL at that point. The non-NULL check of ifv->ifv_p is done at the beginning of the function, so another LWP must have run between the check and the above point.
>
> vlan_unconfig is protected by splnet and KERNEL_LOCK in soo_ioctl, but (*ifv->ifv_msw->vmsw_purgemulti)(ifv) in vlan_unconfig may sleep, and thus another LWP can enter the function while the original LWP is sleeping there.
>
> We have to serialize executions of vlan_unconfig somehow.
>>How-To-Repeat:
> Run the following script with some load:
>   while true; do
>     ifconfig vlan0 create
>     ifconfig vlan0 vlan 10 vlanif vioif0
>     ifconfig vlan0 -vlanif vioif0 &
>     ifconfig vlan0 destroy
>   done
>
>>Fix:
> Introduce a mutex to protect vlan_unconfig.
>
> diff --git a/sys/net/if_vlan.c b/sys/net/if_vlan.c
> index 5c75e34..70a5940 100644
> --- a/sys/net/if_vlan.c
> +++ b/sys/net/if_vlan.c
> @@ -180,6 +180,8 @@ void                vlanattach(int);
>  /* XXX This should be a hash table with the tag as the basis of the key. */
>  static LIST_HEAD(, ifvlan) ifv_list;
>
> +static kmutex_t ifv_mtx __cacheline_aligned;
> +
>  struct if_clone vlan_cloner =
>      IF_CLONE_INITIALIZER("vlan", vlan_clone_create, vlan_clone_destroy);
>
> @@ -191,6 +193,7 @@ vlanattach(int n)
>  {
>
>         LIST_INIT(&ifv_list);
> +       mutex_init(&ifv_mtx, MUTEX_DEFAULT, IPL_NONE);
>         if_clone_attach(&vlan_cloner);
>  }
>
> @@ -358,9 +361,15 @@ static void
>  vlan_unconfig(struct ifnet *ifp)
>  {
>         struct ifvlan *ifv = ifp->if_softc;
> +       struct ifnet *p;
>
> -       if (ifv->ifv_p == NULL)
> +       mutex_enter(&ifv_mtx);
> +       p = ifv->ifv_p;
> +
> +       if (p == NULL) {
> +               mutex_exit(&ifv_mtx);
>                 return;
> +       }
>
>         /*
>          * Since the interface is being unconfigured, we need to empty the
> @@ -370,20 +379,18 @@ vlan_unconfig(struct ifnet *ifp)
>         (*ifv->ifv_msw->vmsw_purgemulti)(ifv);
>
>         /* Disconnect from parent. */
> -       switch (ifv->ifv_p->if_type) {
> +       switch (p->if_type) {
>         case IFT_ETHER:
>             {
> -               struct ethercom *ec = (void *) ifv->ifv_p;
> +               struct ethercom *ec = (void *) p;
>
>                 if (ec->ec_nvlans-- == 1) {
>                         /*
>                          * Disable Tx/Rx of VLAN-sized frames.
>                          */
>                         ec->ec_capenable &= ~ETHERCAP_VLAN_MTU;
> -                       if (ifv->ifv_p->if_flags & IFF_UP) {
> -                               (void)if_flags_set(ifv->ifv_p,
> -                                   ifv->ifv_p->if_flags);
> -                       }
> +                       if (p->if_flags & IFF_UP)
> +                               (void)if_flags_set(p, p->if_flags);
>                 }
>
>                 ether_ifdetach(ifp);
> @@ -412,6 +419,8 @@ vlan_unconfig(struct ifnet *ifp)
>         if_down(ifp);
>         ifp->if_flags &= ~(IFF_UP|IFF_RUNNING);
>         ifp->if_capabilities = 0;
> +
> +       mutex_exit(&ifv_mtx);
>  }
>
>  /*
>
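
For illustration only, the serialization idea in the quoted Fix section can be sketched in userspace C with pthreads. This is not kernel code: struct parent, struct vlan, vlan_unconfig_sketch and run_two_unconfigs are hypothetical stand-ins, and a pthread mutex plays the role of ifv_mtx. The point it shows is that the NULL check and the later use of the parent pointer happen under one lock, so a second caller sees NULL and returns instead of dereferencing a pointer that was cleared in between.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct ifnet (parent) and struct ifvlan. */
struct parent {
	int nvlans;		/* plays the role of ec_nvlans */
};

struct vlan {
	struct parent *p;	/* plays the role of ifv->ifv_p */
	int teardowns;		/* how many times the teardown body ran */
};

/* Userspace analogue of ifv_mtx from the quoted patch. */
static pthread_mutex_t vlan_mtx = PTHREAD_MUTEX_INITIALIZER;

static void
vlan_unconfig_sketch(struct vlan *v)
{
	struct parent *p;

	pthread_mutex_lock(&vlan_mtx);
	p = v->p;		/* check and snapshot under the lock */
	if (p == NULL) {
		pthread_mutex_unlock(&vlan_mtx);
		return;
	}

	/* The call that may sleep (vmsw_purgemulti) would sit here. */

	p->nvlans--;		/* disconnect from the parent */
	v->p = NULL;
	v->teardowns++;
	pthread_mutex_unlock(&vlan_mtx);
}

static void *
worker(void *arg)
{
	vlan_unconfig_sketch(arg);
	return NULL;
}

/*
 * Run two concurrent unconfig calls on one vlan. Returns the number of
 * times the teardown body ran, or -1 on a setup or consistency failure.
 */
static int
run_two_unconfigs(void)
{
	struct parent par = { .nvlans = 1 };
	struct vlan v = { .p = &par, .teardowns = 0 };
	pthread_t t1, t2;

	if (pthread_create(&t1, NULL, worker, &v) != 0)
		return -1;
	if (pthread_create(&t2, NULL, worker, &v) != 0) {
		pthread_join(t1, NULL);
		return -1;
	}
	pthread_join(t1, NULL);
	pthread_join(t2, NULL);

	return (par.nvlans == 0) ? v.teardowns : -1;
}
```

Calling run_two_unconfigs() returns 1: whichever thread takes the lock first performs the teardown, and the other finds the parent pointer already NULL. The sketch says nothing about the use-after-free seen in the two backtraces above, which involves if_detach and free in vlan_clone_destroy.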

