Current-Users archive


Debugging kernel memory allocation

I'm trying to track down a problem that's been getting worse as my Dell
2850 main home server has been getting more loaded down with work.  From
time to time, the machine will lock up temporarily: it doesn't respond
to ICMP ECHO, and it doesn't echo characters typed on the console.  It
will sit like that for three or four minutes, and then continue
running.  Nothing is logged, other than messages that are consequences
of the hang.

Over the last couple of days, I've been at the console twice when it's
happened, and have hit the machine's interrupt button to get it to drop
into ddb, so I could get backtraces.  In both cases, I interrupted it in
x86_pause(), where it was waiting on a spinlock during a call to
uvm_pagealloc_strat().  I thought the 'cpu' command to ddb should switch
between processors, so I could get backtraces from each, but ddb didn't
recognize that command.

Here are the two backtraces, hand copied:



The box has 8GB of RAM, and is a VLAN and VPN router, database server,
NFS server, mail server, web server, and a number of other things.  It
tends to have a load well under 1, though, and most of its RAM used as
file cache, so it's really not very heavily loaded.  It's running
NetBSD/amd64-current as of Oct 31.

I'm looking at the output of things like vmstat, systat, and others, but
I could really do with some ideas for where to look and what to look for.
My assumption is that I'm after some reason why the system should
suddenly be taking several minutes to free up some memory, when it's
obviously got much more than it needs to begin with.  :)
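
One way to put exact times on these hangs, so they can be lined up against
vmstat or systat output, is a small watchdog that wakes every second and
reports when it was kept asleep much longer than that. What follows is only a
minimal sketch, assuming Python is available on the box and that the
monotonic clock keeps advancing across a hang; the names find_stalls and
watch and the 5-second threshold are hypothetical, not part of any NetBSD
tool.

```python
import time


def find_stalls(timestamps, threshold=5.0):
    """Return (start, gap) pairs for consecutive samples more than
    `threshold` seconds apart."""
    stalls = []
    for prev, cur in zip(timestamps, timestamps[1:]):
        gap = cur - prev
        if gap > threshold:
            stalls.append((prev, gap))
    return stalls


def watch(interval=1.0, threshold=5.0):
    """Hypothetical logger: sleep `interval` seconds at a time and print a
    line whenever the wakeup arrives more than `threshold` seconds late."""
    prev = time.monotonic()
    while True:
        time.sleep(interval)
        now = time.monotonic()
        if now - prev > threshold:
            # Wall-clock time of the wakeup, plus the approximate stall length.
            print(f"{time.ctime()}: stalled for about {now - prev:.0f}s",
                  flush=True)
        prev = now
```

Calling watch() from a console session, with its output redirected to a file,
would leave one line per hang; find_stalls() does the same gap check on a
list of timestamps collected some other way.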

It doesn't matter how beautiful your theory is, it doesn't matter how smart
you are. If it doesn't agree with experiment, it's wrong.  -Richard Feynman
