Subject: Re: BIND for secondary zone dumps core.
To: NetBSD-current Discussion List <current-users@netbsd.org>
From: Greywolf <greywolf@starwolf.com>
List: current-users
Date: 07/08/2001 21:43:04
Sorry, this is getting long, but I'm baffled.

Here's what I got:

starwolf 650# gdb /usr/sbin/named named.core
GNU gdb 4.17
Copyright 1998 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "sparc--netbsd"...
Core was generated by `named'.
Program terminated with signal 11, Segmentation fault.
Reading symbols from /usr/libexec/ld.elf_so...done.
Reading symbols from /usr/lib/libc.so.12...done.
#0  0x39a14 in ns_resp (msg=0xefffed68 "\027\037\204\200", msglen=187, from={
      sin_len = 16 '\020', sin_family = 2 '\002', sin_port = 53, sin_addr = {
        s_addr = 3499314287}, sin_zero = "\000\000\000\000\000\000\000"},
    qsp=0x0)
    at /export/src/usr.sbin/bind/named/../../../dist/bind/bin/named/ns_resp.c:459
459                     if (ina_equal(fwd->fwddata->fwdaddr.sin_addr, from.sin_addr))
(gdb) print *fwd
$1 = {next = 0x5001085, fwddata = 0x39a14}
(gdb) print fwd
$2 = (struct fwdinfo *) 0x0
(gdb) 

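What is up with that?!?  gdb should not be able to dereference fwd and
show me fwd->next and fwd->fwddata if fwd is (struct fwdinfo *) NULL!
Note also that the fwddata value it printed, 0x39a14, is identical to
the faulting address in frame #0.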
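
For reference, here is a minimal sketch of the kind of guarded walk over
the forwarder list that would not fault at a line like 459.  This is NOT
the actual ns_resp.c code: the struct layouts and the name
from_is_forwarder() are hypothetical stand-ins I made up for illustration
(the real fwdaddr is a sockaddr_in, and the real comparison goes through
ina_equal()).  It only assumes a singly linked list whose entries may have
a null fwddata.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; the real BIND 8 definitions differ. */
struct in_addr_sketch {
	uint32_t s_addr;
};

struct fwddata_sketch {
	struct in_addr_sketch fwdaddr;	/* simplified: real code uses sockaddr_in */
};

struct fwdinfo_sketch {
	struct fwdinfo_sketch *next;
	struct fwddata_sketch *fwddata;
};

/*
 * Return 1 if `from' matches the address of any forwarder on the list,
 * 0 otherwise.  Both the list pointer and each entry's fwddata are
 * checked for NULL before they are dereferenced.
 */
static int
from_is_forwarder(const struct fwdinfo_sketch *list, struct in_addr_sketch from)
{
	const struct fwdinfo_sketch *fwd;

	for (fwd = list; fwd != NULL; fwd = fwd->next) {
		if (fwd->fwddata == NULL)
			continue;	/* skip entries with no data attached */
		if (fwd->fwddata->fwdaddr.s_addr == from.s_addr)
			return 1;
	}
	return 0;
}
```

With a guard like this, a NULL list (or an entry with a null fwddata)
just means "no match" instead of a signal 11.  Whether that is the right
fix for named depends on why fwd ended up NULL in the first place, which
I still don't know.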

On Sun, 8 Jul 2001, Greg A. Woods wrote:

# > Here's the trace:
# > 
# > #0 0x39a14 in ns_resp (msg=0xefffed68 "> s_addr = 3499314287}, sin_zero =
# >     "\000\000\000\000\000\000\000"}, qsp=0x0)
# >     at /export/src/usr.sbin/bind/named/../../../dist/bind/bin/named/ns_resp.c:459
# > #1 0x2c6e8 in dispatch_message (msg=0xefffed68 "> sin_port = 53, sin_addr
# >       = {s_addr = 3499314287}, sin_zero = "\000\000\000\000\000\000\000"},
# >       dfd=4, ifp=0xefffed8b)
# >     at /export/src/usr.sbin/bind/named/../../../dist/bind/bin/named/ns_main.c:1160
# > #2 0x2c424 in datagram_read (lev={opaque = 0x106000}, uap=0x0, fd=4,
# >       evmask=1)
# >     at /export/src/usr.sbin/bind/named/../../../dist/bind/bin/named/ns_main.c:1102
# > #3 0x6430c in __evDispatch (opaqueCtx={opaque = 0xc55b8}, opaqueEv={
# >       opaque = 0x1})
# >     at /export/src/usr.sbin/bind/lib/../../../dist/bind/lib/isc/eventlib.c:487
# > #4 0x2ad90 in main (argc=770048, argv=0xbc000, envp=0x84400)
# >     at /export/src/usr.sbin/bind/named/../../../dist/bind/bin/named/ns_main.c:552
# > #5  0x12138 in ___start ()
# > 
# > I rather suspect that qsp=0x0 is causing a problem :-)
# 
# No, not necessarily.  'qsp' is never dereferenced in ns_resp().
# 
# What does "print *fwd" show?
#  
# (I'm betting fwd->fwddata is either null or pointing off into hyperspace)
# 
# It seems as though it's choking on some packet it received....
# 
# It's very good that you've got a repeatable scenario here!
# 
# You might want to capture all the port-53 related traffic surrounding a
# crash too....
# 
# I think there's still time to get a fix into 8.2.5!  ;-)
# (8.2.5-T1A was announced on June 23)
# 
# -- 
# 							Greg A. Woods
# 
# +1 416 218-0098      VE3TCP      <gwoods@acm.org>     <woods@robohack.ca>
# Planix, Inc. <woods@planix.com>;   Secrets of the Weird <woods@weird.com>
# 


				--*greywolf;
--
If your server could choose, it would choose NetBSD.