NetBSD-Bugs archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: port-xen/47193: xenwatch handlers are added twice to the list of pending watches



On 16/11/12 19:02, Cherry G.Mathew wrote:
>>> Fix:
>> The fix is easy, we need to check if the watch handler has already been 
>> added to the queue of pending events before adding it. Here is a proposed 
>> patch.
>>
>> ---
>> diff --git a/sys/arch/xen/xenbus/xenbus_xs.c 
>> b/sys/arch/xen/xenbus/xenbus_xs.c
>> index c09e7c4..8cef74a 100644
>> --- a/sys/arch/xen/xenbus/xenbus_xs.c
>> +++ b/sys/arch/xen/xenbus/xenbus_xs.c
>> @@ -751,7 +751,7 @@ xenwatch_thread(void *unused)
>> static int
>> process_msg(void)
>> {
>> -    struct xs_stored_msg *msg;
>> +    struct xs_stored_msg *msg, *s_msg;
>>      char *body;
>>      int err;
>>
>> @@ -782,7 +782,7 @@ process_msg(void)
>>      body[msg->hdr.len] = '\0';
>>
>>      if (msg->hdr.type == XS_WATCH_EVENT) {
>> -            bool found;
>> +            bool found, repeated;
>>
>>              DPRINTK("process_msg: XS_WATCH_EVENT");
>>              msg->u.watch.vec = split(body, msg->hdr.len,
>> @@ -797,13 +797,20 @@ process_msg(void)
>>                  msg->u.watch.vec[XS_WATCH_TOKEN]);
>>              found = (msg->u.watch.handle != NULL);
>>              if (found) {
>> +                    repeated = false;
>> +                    SIMPLEQ_FOREACH(s_msg, &watch_events, msg_next) {
>> +                            if (s_msg->u.watch.handle == 
>> msg->u.watch.handle)
>> +                                    repeated = true;
>> +                    }
>> +            }
> 
> This bit is missing locking (mutex_enter/exit(&watch_events_lock))
> 
> You could avoid a second set of locking by re-using 'found' a bit like
> this:
> 
>         if (found) {
>                 mutex_enter(&watch_events_lock);
>                 SIMPLEQ_FOREACH(s_msg, &watch_events, msg_next) {
>                         if (s_msg->u.watch.handle == msg->u.watch.handle) {
>                                 found = 0;
>                                 goto isduplicate;
>                 }
> 
>                 SIMPLEQ_INSERT_TAIL(&watch_events, msg, msg_next);
> 
>                 cv_broadcast(&watch_cv);
>         isduplicate:
>                 mutex_exit(&watch_events_lock);
>         }
> 

Thanks for catching that, I prefer to avoid using this kind of jumps
inside functions (if it's not something like a fail or out path), see
the attached patch below.

>    
>> +            if (found && !repeated) {
>>                      mutex_enter(&watch_events_lock);
>>                      SIMPLEQ_INSERT_TAIL(&watch_events, msg, msg_next);
>>                      cv_broadcast(&watch_cv);
>>                      mutex_exit(&watch_events_lock);
>>              }
>>              mutex_exit(&watches_lock);
>> -            if (!found) {
>> +            if (!found || repeated) {
>>                      free(msg->u.watch.vec, M_DEVBUF);
>>                      free(msg, M_DEVBUF);
>>              }
>>
> 
> Not related to your code but is there a potential leak with malloc() for
> body = malloc(...) if (!found) {} ?

As far as I see, body is freed unconditionally when passed to the "split"
function. Here is a new patch that fixes the locking issue.

---
commit e1a4795f6116e68d6a1cd4388c1cf142a577de43
Author: Roger Pau Monne <roger.pau%citrix.com@localhost>
Date:   Wed Nov 14 17:25:17 2012 +0100

    xen: prevent adding duplicate xenwatches
    
    When a xenstore watch triggers, the event is processed on process_msg
    and if a valid handle it's found the handler is queued for execution
    on the pending xen watches queue (watch_events).
    
    This may present a problem if we trigger a xenwatch several times and
    then disconnect the device. If several xenwatch events are added to
    the watch_events queue and the device is disconnected afterwards, the
    first processed xenwatch event will disconnect the device, remove the
    watch and free all resources. This triggers a panic if there are
    pending xenwatch events for that device already queued in the local
    queue of the function xenwatch_thread, since when the next watch that
    has the same handler tries to execute we get a panic due to the fact
    that the device is already disconnected and all resources had been
    freed:
    
    xenbus_watch: 0xffffa0000b7cd1d0
    xbw_callback: 0xffffffff80755dd4
    otherend_changed: backend/vif/1/0
    backend/vif/1/0/state 6
    backend/vif/1/0 -> Closed
    backend/vif/1/0 -> backend_device, b_detach: 0xffffffff8075a2bf
    xenbus_watch: 0xffffa0000b7cd1d0
    xbw_callback: 0xfc5ec02183e547a8
    fatal protection fault in supervisor mode
    trap type 4 code 0 rip ffffffff80756596 cs e030 rflags 10246 cr2
    7f7ff7b4c020 ilevel 0 rsp ffffa000e6d82c50
    curlwp 0xffffa0000a72d580 pid 0 lid 36 lowest kstack
    0xffffa000e6d7f000
    kernel: protection fault trap, code=0
    Stopped in pid 0.36 (system) at netbsd:xenwatch_thread+0xc7:    call
    *10(%rax
    )
    xenwatch_thread() at netbsd:xenwatch_thread+0xc7
    ds          f
    es          5987
    fs          2c40
    gs          1460
    rdi         ffffa0000b7cd1d0
    rsi         ffffa0000a5477f0
    rbp         ffffa000e6d82c70
    rbx         ffffa0000b7c14c0
    rdx         2
    rcx         f
    rax         ffffa0000b7cd1d0
    r8          78
    r9          ffffffef
    r10         deadbeef
    r11         1
    r12         ffffa000e6d82c50
    r13         ffffa0000a72d580
    r14         ffffa0000a72d580
    r15         0
    rip         ffffffff80756596    xenwatch_thread+0xc7
    cs          e030
    rflags      10246
    rsp         ffffa000e6d82c50
    ss          e02b
    netbsd:xenwatch_thread+0xc7:    call    *10(%rax)

diff --git a/sys/arch/xen/xenbus/xenbus_xs.c b/sys/arch/xen/xenbus/xenbus_xs.c
index c09e7c4..1405807 100644
--- a/sys/arch/xen/xenbus/xenbus_xs.c
+++ b/sys/arch/xen/xenbus/xenbus_xs.c
@@ -751,7 +751,7 @@ xenwatch_thread(void *unused)
 static int
 process_msg(void)
 {
-       struct xs_stored_msg *msg;
+       struct xs_stored_msg *msg, *s_msg;
        char *body;
        int err;
 
@@ -782,7 +782,7 @@ process_msg(void)
        body[msg->hdr.len] = '\0';
 
        if (msg->hdr.type == XS_WATCH_EVENT) {
-               bool found;
+               bool found, repeated;
 
                DPRINTK("process_msg: XS_WATCH_EVENT");
                msg->u.watch.vec = split(body, msg->hdr.len,
@@ -798,12 +798,22 @@ process_msg(void)
                found = (msg->u.watch.handle != NULL);
                if (found) {
                        mutex_enter(&watch_events_lock);
-                       SIMPLEQ_INSERT_TAIL(&watch_events, msg, msg_next);
-                       cv_broadcast(&watch_cv);
+                       repeated = false;
+                       /* Don't add duplicate events to the queue of pending 
watches */
+                       SIMPLEQ_FOREACH(s_msg, &watch_events, msg_next) {
+                               if (s_msg->u.watch.handle == 
msg->u.watch.handle) {
+                                       repeated = true;
+                                       break;
+                               }
+                       }
+                       if (!repeated) {
+                               SIMPLEQ_INSERT_TAIL(&watch_events, msg, 
msg_next);
+                               cv_broadcast(&watch_cv);
+                       }
                        mutex_exit(&watch_events_lock);
                }
                mutex_exit(&watches_lock);
-               if (!found) {
+               if (!found || repeated) {
                        free(msg->u.watch.vec, M_DEVBUF);
                        free(msg, M_DEVBUF);
                }




Home | Main Index | Thread Index | Old Index