NetBSD-Bugs archive


kern/45479: Lock error panic during RabbitMQ running



>Number:         45479
>Category:       kern
>Synopsis:       Lock error panic during RabbitMQ running
>Confidential:   yes
>Severity:       critical
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Mon Oct 17 08:30:00 +0000 2011
>Originator:     KOGULE Ryo
>Release:        NetBSD 5.99.56, Oct 13 15:13:53 JST 2011
>Organization:
>Environment:
System: NetBSD mq01.gnavi.co.jp 5.99.56 NetBSD 5.99.56 (GNAVI) #0: Thu Oct 13 16:18:45 JST 2011 nbsddev%work02.gnavi.co.jp@localhost:/home/nbsddev/distrib/obj/sys/arch/amd64/compile/GNAVI amd64
Architecture: x86_64
Machine: amd64
Kernel configuration differences from GENERIC:
$ diff -ud sys/arch/amd64/conf/GENERIC sys/arch/amd64/conf/GNAVI
--- sys/arch/amd64/conf/GENERIC 2011-10-11 10:28:03.000000000 +0900
+++ sys/arch/amd64/conf/GNAVI   2011-10-16 15:50:51.000000000 +0900
@@ -190,18 +190,18 @@
 #options       IPFILTER_DEFAULT_BLOCK  # block all packets by default
 #options       TCP_DEBUG       # Record last TCP_NDEBUG packets with SO_DEBUG

-#options       ALTQ            # Manipulate network interfaces' output queues
-#options       ALTQ_BLUE       # Stochastic Fair Blue
-#options       ALTQ_CBQ        # Class-Based Queueing
-#options       ALTQ_CDNR       # Diffserv Traffic Conditioner
-#options       ALTQ_FIFOQ      # First-In First-Out Queue
-#options       ALTQ_FLOWVALVE  # RED/flow-valve (red-penalty-box)
-#options       ALTQ_HFSC       # Hierarchical Fair Service Curve
-#options       ALTQ_LOCALQ     # Local queueing discipline
-#options       ALTQ_PRIQ       # Priority Queueing
-#options       ALTQ_RED        # Random Early Detection
-#options       ALTQ_RIO        # RED with IN/OUT
-#options       ALTQ_WFQ        # Weighted Fair Queueing
+options        ALTQ            # Manipulate network interfaces' output queues
+options        ALTQ_BLUE       # Stochastic Fair Blue
+options        ALTQ_CBQ        # Class-Based Queueing
+options        ALTQ_CDNR       # Diffserv Traffic Conditioner
+options        ALTQ_FIFOQ      # First-In First-Out Queue
+options        ALTQ_FLOWVALVE  # RED/flow-valve (red-penalty-box)
+options        ALTQ_HFSC       # Hierarchical Fair Service Curve
+options        ALTQ_LOCALQ     # Local queueing discipline
+options        ALTQ_PRIQ       # Priority Queueing
+options        ALTQ_RED        # Random Early Detection
+options        ALTQ_RIO        # RED with IN/OUT
+options        ALTQ_WFQ        # Weighted Fair Queueing

 # These options enable verbose messages for several subsystems.
 # Warning, these may compile large string tables into the kernel!
@@ -1179,8 +1179,8 @@
 #options       RND_COM                 # use "com" randomness as well (BROKEN)
 pseudo-device  clockctl                # user control of clock subsystem
 pseudo-device  ksyms                   # /dev/ksyms
-#pseudo-device pf                      # PF packet filter
-#pseudo-device pflog                   # PF log if
+pseudo-device  pf                      # PF packet filter
+pseudo-device  pflog                   # PF log if
 pseudo-device  lockstat                # lock profiling
 pseudo-device  bcsp                    # BlueCore Serial Protocol
 pseudo-device  btuart                  # Bluetooth HCI UART (H4)
RabbitMQ status:
# rabbitmqctl status
Status of node rabbit@mq01 ...
[{pid,2130},
 {running_applications,
     [{rabbitmq_management,"RabbitMQ Management Console","2.6.1"},
      {webmachine,"webmachine","1.7.0-rmq2.6.1-hg0c4b60a"},
      {rabbitmq_management_agent,"RabbitMQ Management Agent","2.6.1"},
      {amqp_client,"RabbitMQ AMQP Client","2.6.1"},
      {rabbit,"RabbitMQ","2.6.1"},
      {os_mon,"CPO  CXC 138 46","2.2.6"},
      {sasl,"SASL  CXC 138 11","2.1.9.4"},
      {rabbitmq_mochiweb,"RabbitMQ Mochiweb Embedding","2.6.1"},
      {mochiweb,"MochiMedia Web Server","1.3-rmq2.6.1-git9a53dbd"},
      {inets,"INETS  CXC 138 49","5.6"},
      {mnesia,"MNESIA  CXC 138 12","4.4.19"},
      {stdlib,"ERTS  CXC 138 10","1.17.4"},
      {kernel,"ERTS  CXC 138 10","2.14.4"}]},
 {os,{unix,netbsd}},
 {erlang_version,
     "Erlang R14B03 (erts-5.8.4) [source] [64-bit] [smp:8:8] [rq:8] [async-threads:30] [hipe] [kernel-poll:true]\n"},
 {memory,
     [{total,85407792},
      {processes,53008480},
      {processes_used,52998160},
      {system,32399312},
      {atom,1336577},
      {atom_used,1312058},
      {binary,5295544},
      {code,14412198},
      {ets,6762784}]}]
...done.
Erlang links:
# ldd /usr/pkg/lib/erlang/erts-5.8.4/bin/beam.smp
/usr/pkg/lib/erlang/erts-5.8.4/bin/beam.smp:
        -lutil.7 => /usr/lib/libutil.so.7
        -lgcc_s.1 => /lib/libgcc_s.so.1
        -lc.12 => /usr/lib/libc.so.12
        -lm.0 => /usr/lib/libm.so.0
        -lcurses.7 => /usr/lib/libcurses.so.7
        -lterminfo.1 => /usr/lib/libterminfo.so.1
        -lpthread.1 => /usr/lib/libpthread.so.1
# ldd /usr/pkg/lib/erlang/erts-5.8.4/bin/inet_gethost
/usr/pkg/lib/erlang/erts-5.8.4/bin/inet_gethost:
        -lutil.7 => /usr/lib/libutil.so.7
        -lgcc_s.1 => /lib/libgcc_s.so.1
        -lc.12 => /usr/lib/libc.so.12
        -lm.0 => /usr/lib/libm.so.0
>Description:
A NetBSD/amd64 host running RabbitMQ <http://www.rabbitmq.com/> reboots
periodically, from once to several times a day.  It appears to have
sufficient resources (memory, CPU time, and so on).  The operating system
usually reboots silently, but on one occasion we were lucky enough to
capture panic messages in /var/log/messages.  They are:

Oct 15 14:34:08 mq01 /netbsd: panic: lock error
Oct 15 14:34:08 mq01 /netbsd: cpu4: Begin traceback...
Oct 15 14:34:08 mq01 /netbsd: printf_nolog() at netbsd:printf_nolog
Oct 15 14:34:08 mq01 /netbsd: lockdebug_abort() at netbsd:lockdebug_abort+0x3a
Oct 15 14:34:08 mq01 /netbsd: mutex_vector_enter() at netbsd:mutex_vector_enter+0x438
Oct 15 14:34:08 mq01 /netbsd: fd_close() at netbsd:fd_close+0x8f
Oct 15 14:34:08 mq01 /netbsd: fd_getfile() at netbsd:fd_getfile+0xb4
Oct 15 14:34:08 mq01 /netbsd: kqueue_register() at netbsd:kqueue_register+0x247
Oct 15 14:34:08 mq01 /netbsd: kevent1() at netbsd:kevent1+0x157
Oct 15 14:34:08 mq01 /netbsd: sys___kevent50() at netbsd:sys___kevent50+0x33
Oct 15 14:34:08 mq01 /netbsd: syscall() at netbsd:syscall+0xac
Oct 15 14:34:08 mq01 /netbsd: cpu4: End traceback...

We can provide the core dump (over 100MB) if necessary.
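
For anyone triaging the traceback above, the following is a minimal sketch
of how the logged frames could be extracted into (symbol, offset) pairs,
for example to look them up later in a debugger against the matching kernel
image.  It assumes frames have the form "symbol() at netbsd:symbol+0xNN" as
in this report, and that a frame may be wrapped across two lines.  The names
FRAME_RE, SAMPLE and parse_frames are hypothetical helpers for illustration,
not part of any NetBSD tool.

```python
import re

# One traceback frame, e.g. "fd_close() at netbsd:fd_close+0x8f".
# \s+ also matches a newline, so a frame wrapped after "at" still parses.
# The "+0x..." offset is optional (the first frame in the report has none).
FRAME_RE = re.compile(
    r"(\w+)\(\)\s+at\s+netbsd:(\w+)(?:\+0x([0-9a-fA-F]+))?"
)

# A few lines copied from the report, including one wrapped frame.
SAMPLE = """\
Oct 15 14:34:08 mq01 /netbsd: cpu4: Begin traceback...
Oct 15 14:34:08 mq01 /netbsd: printf_nolog() at netbsd:printf_nolog
Oct 15 14:34:08 mq01 /netbsd: mutex_vector_enter() at
netbsd:mutex_vector_enter+0x438
Oct 15 14:34:08 mq01 /netbsd: fd_close() at netbsd:fd_close+0x8f
Oct 15 14:34:08 mq01 /netbsd: syscall() at netbsd:syscall+0xac
Oct 15 14:34:08 mq01 /netbsd: cpu4: End traceback...
"""


def parse_frames(log_text):
    """Return a list of (symbol, offset) pairs in logged order."""
    frames = []
    for match in FRAME_RE.finditer(log_text):
        # A missing offset is treated as 0 (an assumption of this sketch).
        offset = int(match.group(3), 16) if match.group(3) else 0
        frames.append((match.group(2), offset))
    return frames


if __name__ == "__main__":
    for symbol, offset in parse_frames(SAMPLE):
        print(f"{symbol}+{offset:#x}")
```

The "Begin traceback" and "End traceback" lines do not match the frame
pattern and are skipped, so only the call frames are returned.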
>How-To-Repeat:
>Fix:


