tech-kern archive

Re: Corosync on NetBSD



Hi all,

I tried to run corosync again on NetBSD 6 BETA. The build errors are
the same, and it also does nothing other than use 100% of the CPU.
Luckily, gdb now works in live mode with threads, and shows this:

(gdb) info th
  Id   Target Id         Frame
  5    LWP 1             0x00007f7ff68071e0 in ?? () from /usr/lib/libpthread.so.1
  4    LWP 2             0x00007f7ff6476e5a in ___lwp_park50 () from /usr/lib/libc.so.12
  3    LWP 3             0x00007f7ff643907a in poll () from /usr/lib/libc.so.12
  2    LWP 4             0x00007f7ff6476e5a in ___lwp_park50 () from /usr/lib/libc.so.12
* 1    LWP 0             0x00007f7ff6476e5a in ___lwp_park50 () from /usr/lib/libc.so.12
(gdb) thr 5
[Switching to thread 5 (LWP 1)]
#0  0x00007f7ff68071e0 in ?? () from /usr/lib/libpthread.so.1
(gdb) bt
#0  0x00007f7ff68071e0 in ?? () from /usr/lib/libpthread.so.1
#1  0x00007f7ff68075e8 in ?? () from /usr/lib/libpthread.so.1
#2  0x0000000000409947 in corosync_timer_add_duration (nanosec_duration=1500000000, data=0x0, timer_fn=0x4049b0 <corosync_totem_stats_updater>, handle=0x615518) at timer.c:221
#3  0x000000000040575c in corosync_totem_stats_init () at main.c:820
#4  main_service_ready () at main.c:1410
#5  0x00007f7ff781788b in main_iface_change_fn (context=0x7f7ff7b3c000, iface_addr=<optimized out>, iface_no=0) at totemsrp.c:4454
#6  0x00007f7ff7809473 in timer_function_netif_check_timeout (data=0x7f7ff7384000) at totemudp.c:1388
#7  0x00007f7ff7807780 in timerlist_expire (timerlist=0x7f7ff7b1b0d8) at tlist.h:309
#8  poll_run (handle=150346236434579456) at coropoll.c:526
#9  0x000000000040775a in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at main.c:1846

As you can see, thread 5 ends up somewhere inside libpthread, called
from corosync_timer_add_duration() at timer.c:221, and apparently
loops there forever. As a hack, I just commented out the whole
function body:

int corosync_timer_add_duration (
        unsigned long long nanosec_duration,
        void *data,
        void (*timer_fn) (void *data),
        timer_handle *handle)
{
/*
        int res;
        int unlock;

        if (pthread_equal (pthread_self(), expiry_thread) != 0) {
                unlock = 0;
        } else {
                unlock = 1;
                pthread_mutex_lock (&timer_mutex);
        }

        res = timerlist_add_duration (
                &timers_timerlist,
                timer_fn,
                data,
                nanosec_duration,
                handle);

        if (unlock) {
                pthread_mutex_unlock (&timer_mutex);
        }

        pthread_kill (expiry_thread, SIGUSR1);

        return (res);
*/

        return 0;
}

Doing this, the corosync service successfully starts and interacts
with its control tools.

Does anybody have an idea what could be wrong with the code above?

Stephan

