Re: FS corruption because of bufio_cache pool depletion?

To: Christoph Badura <bad%bsd.de@localhost>
Subject: Re: FS corruption because of bufio_cache pool depletion?
From: Manuel Bouyer <bouyer%antioche.eu.org@localhost>
Date: Tue, 26 Jan 2010 15:32:23 +0100

On Mon, Jan 25, 2010 at 10:52:03PM +0100, Christoph Badura wrote:
> I am seeing FS corruption on my development server in the source trees.
> The server is running Xen on i386 with a 128MB RAM dom0 and 256MB RAM domUs.
> I'm using netbsd-5 in the dom0 and some domUs, and -current in other domUs.
> 
> Typical ways to provoke corruption are rsync'ing a source tree from the
> vnd-backed xbd in a domU to a local partition in the dom0, or running "cvs
> update" in the dom0 on a tree.  The most obvious damage was corrupt CVS/Root
> and directory contents.

Can you give more details on the corruption?
Was it only directory entries that were corrupted, or did you notice
corruption in the data blocks too?
I'm seeing panics like:
bad dir ino 14212602 at offset 0: mangled entry
on NFS servers (a few times a year), and fsck confirms that the directory
is indeed corrupted. I've seen this with both netbsd-3 and netbsd-5.

> Once I got an I/O error in a domU from the xbd with the sources on it during
> a build.sh run.  At that point I noticed the following messages in the
> kernel message buffer:
> 
> raid1: IO failed after 5 retries.
> cgd1: error 5
> xbd IO domain 1: error 5

It seems raidframe doesn't do anything special for memory allocation failure.
It returns EIO for the whole request if it can't get an entry
from bufio_cache for the I/O to one component. Maybe it should wait and
retry the I/O later?
dk(4) does this ...
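
To make the difference between the two policies concrete, here is a
minimal user-space sketch. It is only an illustration under stated
assumptions: toy_pool, io_fail_fast, io_wait_and_retry and
complete_one_io are hypothetical names, not the kernel's pool(9)
interface and not RAIDframe or dk(4) code, and "waiting" is modelled by
a callback rather than a real sleep.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy stand-in for a fixed-size buffer pool; NOT the kernel's pool(9) API. */
struct toy_pool {
	int free;      /* entries currently available */
	int capacity;  /* total number of entries */
};

/* Take one entry; returns false when the pool is depleted. */
bool toy_pool_get(struct toy_pool *p)
{
	if (p->free == 0)
		return false;
	p->free--;
	return true;
}

/* Return one entry to the pool. */
void toy_pool_put(struct toy_pool *p)
{
	assert(p->free < p->capacity);
	p->free++;
}

/* Policy 1: fail the whole request as soon as the allocation fails. */
int io_fail_fast(struct toy_pool *p)
{
	if (!toy_pool_get(p))
		return EIO;
	/* ... the component I/O would be issued here ... */
	toy_pool_put(p);
	return 0;
}

/*
 * Policy 2: on allocation failure, let pending work complete (modelled
 * by the hypothetical drain callback) and try again, giving up with EIO
 * only after max_tries attempts.
 */
int io_wait_and_retry(struct toy_pool *p, int max_tries,
    void (*drain)(struct toy_pool *))
{
	for (int i = 0; i < max_tries; i++) {
		if (toy_pool_get(p)) {
			/* ... the component I/O would be issued here ... */
			toy_pool_put(p);
			return 0;
		}
		drain(p);  /* stand-in for sleeping until an entry is freed */
	}
	return EIO;
}

/* Models one in-flight I/O completing and releasing its pool entry. */
void complete_one_io(struct toy_pool *p)
{
	if (p->free < p->capacity)
		toy_pool_put(p);
}
```

With an empty pool, io_fail_fast() returns EIO immediately, while
io_wait_and_retry() succeeds once complete_one_io() has released an
entry. Whether a real driver can safely sleep at that point depends on
the context it runs in, which this sketch does not model.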

-- 
Manuel Bouyer <bouyer%antioche.eu.org@localhost>
     NetBSD: 26 years of experience will always make the difference