NetBSD-Users archive


Re: FreeRADIUS instability



    Date:        Wed, 29 Sep 2021 09:09:06 -0700
    From:        "Pawel S. Veselov" <pawel.veselov%gmail.com@localhost>
    Message-ID:  <467fc0d3-37d1-a88a-2584-420f2a06b8a1%gmail.com@localhost>

  | So we caught where the queue is closed, and traced it back to
  | getaddrinfo(). That call both closes fd#3, creates a new kqueue
  | and leaves it open. This is the back trace from close:
  |
  | #0  0x0000732d69c07892 in close () from /usr/lib/libpthread.so.1

From this I'm guessing that FreeRADIUS is multi-threaded?

  | The full stack traces and ktraces can be found here:
  |
  | https://github.com/FreeRADIUS/freeradius-server/issues/4244

I saw some helpful data there, but hardly a full ktrace.

  | Our next step is to recompile libc with debugging symbols and start
  | poking around there to see why is it closing an fd that doesn't
  | belong to it, but if somebody knows why that might happen -
  | that'd be great.

Is it possible that something at startup is closing fds, and that this
happens after the DNS resolver has been initialised?

As you saw, the libc address lookup routines leave the kqueue fd open.
If some "make sure all fd's > 2 are closed at startup" type of
functionality went and closed it after that, that would cause a problem.
The kqueue fd is used to monitor /etc/resolv.conf for any changes that
would require (or might require) it to be re-read (which is a useful
thing to do for long running daemons) - so it needs to remain open
for the life of the process.
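
To make the ordering hazard concrete, here is a minimal sketch. It is not
FreeRADIUS or libc code: a pipe stands in for the resolver's long-lived
kqueue fd, and the sweep range (3 to 63) and the name run_sweep_demo are
purely illustrative assumptions. The sweep runs in a child process so it
cannot disturb the caller's own descriptors.

```python
import errno
import os
import subprocess
import sys

# Child program (hypothetical stand-in, not real resolver code): a
# "library" keeps a descriptor open, then a blanket startup sweep closes
# everything above stderr, and the library later touches its descriptor.
CHILD = r"""
import os
r, w = os.pipe()          # stand-in for the resolver's long-lived kqueue fd
os.closerange(3, 64)      # "make sure all fd's > 2 are closed" sweep
try:
    os.fstat(r)           # the library uses its descriptor afterwards
    print("still open")
except OSError as e:
    print("error", e.errno)
"""


def run_sweep_demo():
    """Run the child and return the single line it prints."""
    out = subprocess.run(
        [sys.executable, "-c", CHILD],
        capture_output=True,
        text=True,
        check=True,
    )
    return out.stdout.strip()


if __name__ == "__main__":
    # The child reports EBADF because the sweep ran after the fd was opened.
    print(run_sweep_demo(), "(EBADF is", errno.EBADF, ")")
```

If the sweep ran before the stand-in library opened its descriptor, the
fstat would succeed, which is why the order of initialisation matters.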

There can also be issues with the resolver state if a multi-threaded
program isn't correctly linked with -lpthread and gets the
single-threaded resolver state instead of the malloc'd version.

The fd that is closed immediately before the kqueue() call isn't the
interesting one - that's just left over from reading resolv.conf
immediately beforehand.  The file is fopened (in your traces, that's
fd 3) and read using stdio, the file descriptor is dup'd (that's 5 in
your traces), then 3 is closed by fclose() - that part is all very
boring.  Then kqueue() is used to monitor fd 5 for any changes (the
kqueue is fd 3, the lowest available), and if any occur, the resolver
will be re-init'd (that most likely is not what is happening).
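
The number reuse described above follows from the rule that a newly
allocated descriptor gets the lowest free number. A small sketch of that
sequence, with assumptions labelled: /dev/null stands in for resolv.conf,
a second open() stands in for kqueue(), and trace_fd_sequence is a
hypothetical name. The actual numbers depend on what the process already
has open, so they need not be 3 and 5 as in the traces.

```python
import os


def trace_fd_sequence(path):
    """Replay the open/dup/close/allocate order and return the fd numbers."""
    conf_fd = os.open(path, os.O_RDONLY)   # stands in for fopen() of resolv.conf
    os.read(conf_fd, 4096)                 # stands in for the stdio read
    watch_fd = os.dup(conf_fd)             # dup'd copy kept for monitoring
    os.close(conf_fd)                      # stands in for fclose()
    # Stand-in for kqueue(): any call that allocates a new descriptor
    # receives the lowest free number, which is the one just released.
    queue_fd = os.open(os.devnull, os.O_RDONLY)
    result = (conf_fd, watch_fd, queue_fd)
    os.close(watch_fd)
    os.close(queue_fd)
    return result


if __name__ == "__main__":
    conf_fd, watch_fd, queue_fd = trace_fd_sequence(os.devnull)
    print("file fd:", conf_fd, "dup'd fd:", watch_fd, "new fd:", queue_fd)
```

In a single-threaded run the new descriptor gets the same number as the
one just closed, so a close() followed by a kqueue() on the same number
in a trace is expected and not, by itself, a sign of anything wrong.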

kre


