Current-Users archive


deadlock copying files



Hello,

I have had problems on my -current NetBSD server for almost a year, the
usual symptom being that the machine is completely unresponsive after
the weekly run. Sometimes it has a month or two of uptime, but then it
crashes again. Sometimes it panics and reboots, but mostly it is so
locked up that even the caps lock light on the PS/2 keyboard does not
work and it has to be switched off.

Just now I could finally reproduce the problem so that I have something
reportable.

This is NetBSD/amd64 5.99.55 from Sep 14th (I glanced at gnats but didn't
see any fixed or open PRs obviously related to this):

 - I ran weekly.local from the console, with top running in another window

 - weekly.local does backups using pax, and it proceeded to copy tons
   of photos from one disk to another (both are local, both are ffs+wapbl,
   on different wd devices)

 - top showed free memory decreasing slowly. Before starting weekly.local
   the machine had 1.5GB of free memory. The last display in top showed
   40k free, and then the machine locked up.

At this point the machine was still somewhat responsive and I could break
to ddb. The backtrace indicated that both CPUs were completely idle.
Possibly interesting things from ps:

 - the pax process doing the backup was waiting on uvn_fp1

 - various fs related services (samba for example) were waiting for
   flt_noram5 or something

 - ioflush was waiting for tstile

 - pgdaemon was waiting for emergva

I tried the procedure twice in a row, with identical results. The machine
has to be up now, but I can try to lock it up again during the weekend and
send better details if someone tells me what to look for in ddb.

  Arto (please CC me)

