Subject: Re: LFS and Xen3 testing
To: None <current-users@netbsd.org, perseant@netbsd.org>
From: Manuel Bouyer <bouyer@antioche.eu.org>
List: current-users
Date: 09/10/2006 18:13:10
Hi,
Sorry for the delay in replying to this.
On Tue, Sep 05, 2006 at 01:56:44PM +1000, Daniel Carosone wrote:
>
> I reinstated my LFS-testing setup from a while ago. For convenience
> it seemed easier, this time around, to test on a Xen3 domU - but now
> it's not clear to me whether the problems I find are due to LFS or
> Xen. So, sorry, I'm going to mix together both.
>
> * Sometimes, all disk activity will stop, and something (usually the
> cleaner) is stuck in biowait. I suspect this to be a Xen issue.
> Dom0 is linux with LVM2 volumes for the xbd backend, domU is
> -current a day or two old. It seems most easily (or even only?)
> triggered when dom0 is busy with CPU-heavy tasks. I saw a commit
> go by recently that looked promising for something like this, but
> it doesn't seem to have helped this case.
It may indeed be a Xen issue; see PR port-xen/34005.
>
> * if I run screen, the screen process takes 100% of the cpu, in state
> either "lfs sb" or "lfs_ioco", and can't be killed. The cleaner
> and several other things are then in "lfs segl" and the system gets
> generally unhappier from there. The whole system (including /tmp)
> is all on one root lfs, perhaps this is related to screen's socket
> usage in /tmp? It doesn't matter whether screen is run on the xm
> console or in a sshd pty. I probably wouldn't have found this if
> I'd remembered to enable tmpfs in the kernel, and I'll confirm
> whether that affects the issue.
I do use screen on Xen systems and haven't noticed this issue, so I'm tempted
to blame LFS for this one :)
>
> * the kernel prints "lfs_segwrite: loopcount=2" every so often, and
> just once or twice "lfs_writeinode: looping count=2". This happens
> every few minutes as the cleaner is running after a crash (via xm
> destroy) after one of the above. If this is a diagnostic for
> something, it seems to be happening here, in case that's
> interesting.
>
> * resize_lfs produces an almost instant, repeatable panic trying to
> shrink a filesystem:
>
> panic: lfs_rescount
> [...]
> I recall this appearing to work last time I tried it, but I may not
> have had DIAGNOSTIC in that kernel, more fool me :)
Note that the XEN3_DOM0 and XEN3_DOMU kernels are built with DIAGNOSTIC and DEBUG.
--
Manuel Bouyer <bouyer@antioche.eu.org>
NetBSD: 26 ans d'experience feront toujours la difference