NetBSD-Users archive

Re: Bind ending up in Parked state.



In article <CABYHU95LFHrVNqszfXzdh3p1R=B01xfyi-Y_OCMHcLuXBa+8jw%mail.gmail.com@localhost>,
Søren P. Skou  <sps%t-rex.dk@localhost> wrote:
>Hi there,
>
>I'm currently investigating the behaviour of one of my nameservers. Recently
>I switched from a purely virtual setup to 3 physical machines, each
>connected to its own router. Each machine is installed with NetBSD 6.1.5
>(GENERIC) amd64, bind-9.10.1pl1 and exabgp-3.3.2nb1.
>
>The setup is such that bind is listening on aliases bound to lo0 and on
>127.0.0.1. ExaBGP announces the 3 IP addresses with a different local_pref
>for each of the machines, to ensure that all servers are accessible at any
>given time should one of the physical servers fail in any way, or if bind
>gives up on resolving things. ExaBGP takes good care of this and that
>particular part is running quite well.
>
>Now, on 1 of the 3 physical servers, bind ends up in the "parked" state
>after a while. There seems to be little to no explanation why. At first I
>thought this was due to a hardware error, so I replaced the hardware. The
>new hardware did exactly the same.
>
>From what I can read about the "parked" state, the process is waiting for
>some resource and will not move on; all signals apart from SIGKILL are
>queued. This is also the behaviour I see: /etc/rc.d/named9 restart takes
>forever, as it is waiting for bind's pid to end. After a kill, it starts
>up nicely again, runs for a while, then dies.
>
>Currently I've put in a rather ugly hack combining sudo and kill from
>exabgp with a restart of named from crontab every 15 minutes, and that
>"works"(ish).
>
>But I would rather not have this problem with a parked bind. This was never
>a problem on the virtual setup; there were other issues there, but not a
>complete halt of service. This is a rather busy nameserver. I cannot get it
>to fail when there is no load; then it just keeps on running.
>
>My question is, has anyone experienced this? Or alternatively, does anyone
>have an idea as to where to look to find out what resource it is waiting for?
>

How many CPUs does the machine have? If you run:

http://www.netbsd.org/~christos/race.c

does it get stuck after a while?

christos


