Current-Users archive


wapbl lockups



I have seen two machines lock up, and some experimentation implicates
WAPBL and USB-connected big disks.  I am curious if anyone else has seen
this.

Note that wapbl does a cache flush on committing the journal
(vfs.wapbl.flush_disk_cache = 1).

system 1:

  observed with netbsd-5 around a year ago
  soekris net5501, 512M ram, 2T wd elements external usb drive (ufs2)
  this disk is known to take a long time (0.5s) to flush the cache

  when doing rdiff-backup to the disk, or other write-heavy workloads,
  the machine froze and it appeared that all processes were in tstile.
  I believe ping still worked.

  I stopped mounting the external disk with wapbl (an internal 40G disk
  still uses it, but it's not a backup target), and the machine has been
  100% solid.

system 2:

  observed with netbsd-6 from this spring
  evbppc (p2020) with 2G ram, 3T usb disk (ufs2)

  lockups observed when doing a git clone of a huge repo.  The clone
  went ok, but the subsequent checkout precipitated the problem.
  The watchdog reset worked fine, so I'm not sure what state things
  were in.

  I remembered the net5501 issue, and after turning off wapbl the system
  is stable, completing a self-hosting build and builds of packages.


The on-disk journal should be only 64M, so that alone wouldn't seem to
really stress memory.  So I wonder if there is something that backs up
in RAM when there's a continuous stream of writes, and running out of
memory isn't handled gracefully.
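
As a rough check on the journal-size point, here is a small sketch of
the arithmetic.  The 64M journal and the RAM sizes come from the
descriptions above; reading "M" and "G" as MiB and GiB is an assumption,
and the names in the code are only illustrative.

```python
# Journal size and RAM figures are taken from the message above.
# Treating "M"/"G" as MiB/GiB is an assumption.
JOURNAL_MIB = 64

# RAM per machine, in MiB (labels are illustrative).
systems = {
    "system 1 (net5501)": 512,
    "system 2 (p2020)": 2 * 1024,
}


def journal_share(ram_mib, journal_mib=JOURNAL_MIB):
    """Return the on-disk journal size as a fraction of RAM."""
    return journal_mib / ram_mib


for name, ram_mib in systems.items():
    print(f"{name}: journal is {journal_share(ram_mib):.1%} of RAM")
```

Even on the smaller machine the journal is only an eighth of RAM, which
is why the journal by itself seems an unlikely cause.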
