Subject: bind8 "forward first" buggy was Re: How can this be done?
To: None <netbsd-help@netbsd.org>
From: Danny Thomas <D.Thomas@its.uq.edu.au>
List: netbsd-help
Date: 06/14/2001 08:48:14
>>       Alternatively you could run a dns server on amiga, and point
>>       both machines at amiga in resolv.conf.
>
>This is what I do on my gateway machine.  As part of the PPP ip-up script,
>I generate a /etc/named.conf.$LOC with the correct name servers, for example
>(I'm currently connected via demon.net, so LOC=demon) the forwarding part is :
>
>        options {
>          forward first;
>          forwarders {
>            158.152.1.58;
>            158.152.1.43;
>          };
>        };

it may not be a problem in your environment, but our campus nameservers
recently ended up with badly corrupted caches when configured with
"forward first" [a setup that would otherwise give us useful redundancy,
because our main nameservers were set to forward first through one with a
separate external link]

so we'd like people to be aware of this potential problem when using
"forward first" with the current versions of bind8 (9.2 is the first bind9
release we could consider as a suitable upgrade)

we were getting tens of thousands of log msgs per day like
May 16 00:05:32 krefti named[13030]: sysquery: no addrs found for root NS (NS1)
May 16 00:05:32 krefti named[13030]: sysquery: no addrs found for root NS (NS2)
May 16 00:05:32 krefti named[13030]: sysquery: no addrs found for root NS (NS3)
May 16 00:05:32 krefti named[13030]: sysquery: no addrs found for root NS ()

these bursts would last anywhere from a few seconds to a few minutes. They
did not come from a bad hints file, nor from connectivity problems.
NB: while "NS1", "NS2", "NS3" & "" were the most common, a variety of domain
names appeared in these log msgs
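
if you want to check whether your own server is affected, something like
the following rough sketch could tally these messages by the name shown in
parentheses. It assumes the log lines look exactly like the ones quoted
above; the function name and the log path in the usage comment are my own
assumptions, not anything from bind.

```python
import re
import sys
from collections import Counter

# Matches the message quoted above; the name in parentheses may be empty.
PATTERN = re.compile(r"sysquery: no addrs found for root NS \((.*?)\)")


def tally_root_ns_errors(lines):
    """Count 'no addrs found for root NS' messages per reported name."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Usage (the path is an assumption): python tally.py /var/log/messages
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], errors="replace") as log:
        for name, count in tally_root_ns_errors(log).most_common():
            print(f"{count:8d}  ({name})")
```

a high count for names other than real root servers would match what we
saw, though on its own it doesn't prove the cache is corrupted.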

another symptom: thousands of records in the cache, including ones for
zones like in-addr.arpa, had wrong NS records

an Australian banking site is particularly prone to these problems because
they run with foolishly short TTLs. We had numerous reports from people
unable to connect to their site, corresponding to cache entries like

  $ORIGIN anz.com.
  ;www    5023    IN      SOA     ns1.hi2000.net. hostmaster.hi2000.net. (
  ;               2830536819 10800 3600 604800 86400 );com.;NXDOMAIN      ;-$
  ;Cr=auth [211.90.223.103]

yep, ns1.hi2000.net not only claimed at that time to be authoritative for
.com, but records from it made it into the cache.
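
to see which servers such records came from, a cache dump can be scanned
for the source tags. This is only a sketch: it assumes every tag in the
dump has the same shape as the ";Cr=auth [211.90.223.103]" line above, and
the function name is made up for illustration.

```python
import re
from collections import Counter

# Assumed shape, copied from the excerpt above: "Cr=<credibility> [<address>]"
CR_TAG = re.compile(r"Cr=(\w+)\s+\[([^\]]+)\]")


def tally_sources(dump_lines):
    """Count cached records per (credibility, source address) pair."""
    counts = Counter()
    for line in dump_lines:
        for credibility, address in CR_TAG.findall(line):
            counts[(credibility, address)] += 1
    return counts
```

an address you don't recognise showing up with "auth" credibility for a
zone it has no business serving is the kind of entry worth a closer look.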

this problem happened with both 8.2.3 & 8.2.4 - an official bug report was
made late in the 8.2.4 release cycle, so we weren't surprised a fix didn't
make it in


cheers,
Danny Thomas