Subject: Re: make -j 3 hang on amd64
To: Manuel Bouyer <bouyer@antioche.eu.org>
From: Kurt Schreiner <ks@ub.uni-mainz.de>
List: current-users
Date: 11/29/2005 21:31:06
On Tue, Nov 29, 2005 at 09:22:32PM +0100, Manuel Bouyer wrote:
> On Tue, Nov 29, 2005 at 04:40:56PM +0100, Kurt Schreiner wrote:
> > Hi,
> > 
> > I just tried to build a distribution for vax on my dual-opteron system.
> > This worked for a while and then stalled. /usr/src is a symbolic link
> > to /u/NetBSD/src, which is a union mount:
> > 
> > /dev/sd0a on / type ffs (local)
> > /dev/sd0i on /var type ffs (noatime, local)
> > /dev/sd0h on /usr type ffs (noatime, local)
> > /dev/sd0j on /opt type ffs (noatime, local)
> > /dev/sd0k on /home type ffs (noatime, soft dependencies, local)
> > /dev/sd1h on /u type ffs (noatime, soft dependencies, NFS exported, local)
> > tmpfs on /tmp type tmpfs (nosuid, nodev, local)
> > kernfs on /kern type kernfs (local)
> > <above>:/u/NetBSD/lpkgsrc on /u/NetBSD/pkgsrc type union (nosuid, nodev, local, mounted by ks)
> > <above>:/u/NetBSD/lsrc on /u/NetBSD/src type union (nosuid, nodev, local, mounted by ks)
> > 
> > The process table shows some processes hanging in "D" state:
> > ("df" and "ls" tried after the make didn't "answer" anymore)
> > 
> >  77   692   240     0  18  0 2344  1716 pause    Ss   ttyp0 0:00.06 -tcsh 
> >  77  6335   692     0  28  0  120   844 -        R+   ttyp0 0:00.00 ps axl 
> >  77 14416   692     0  -2  0   60   696 vnlock   D    ttyp0 0:00.00 df -k 
> >  77 17940   692     0  -2  0   72   820 vnlock   D    ttyp0 0:00.00 ls -CF -a -ol /usr/src/ 
> >  77   286 19261 52645  -2  0  204   944 vnlock   D    ttyp1 0:00.00 /u/NetBSD/arch/vax/TOOLS/bin/nb
> >  77   334  6579 52645  -2  0  664  1448 vnlock   D    ttyp1 0:00.01 /u/NetBSD/arch/vax/TOOLS/bin/nb
> >  77   610 15349 52645  -2  0  204   944 vnlock   D    ttyp1 0:00.00 /u/NetBSD/arch/vax/TOOLS/bin/nb
> 
> You don't say what release you're running on this system.
> I've seen this a lot on NetBSD-2.x SMP systems with null mounts. I've not seen
> them since I upgraded to 3.0_RCx.
Hmpf! Sorry! I'm running -current on this machine, using an SMP kernel.
>-87: uname -a 
NetBSD sunopti 3.99.12 NetBSD 3.99.12 (SUNOPTI_MP) #7: Tue Nov 29 21:01:45 MET 2005  ks@sunopti:/u/NetBSD/arch/amd64/obj/sys/arch/amd64/compile/SUNOPTI_MP amd64

This kernel is some hours newer than the one I had the hangs above with.
I'll try to reproduce this tomorrow, but there's no sure method to trigger
these hangs. Before the hang I compiled userland for amd64, i386 and alpha
with the same set of commands w/o any problems...
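
For spotting the stuck processes on the next attempt, here is a rough
sketch that filters "ps axl" output for processes in "D" state and reports
what they are waiting on. It assumes the column layout seen in the listing
above (UID PID PPID CPU PRI NI VSZ RSS WCHAN STAT TTY TIME COMMAND) and
that WCHAN is never blank ("-" when idle); the function name
blocked_processes is made up for illustration.

```python
def blocked_processes(ps_lines):
    """Return (pid, wchan, command) for every process in "D" state.

    Assumes `ps axl` style lines with 13 whitespace-separated columns:
    UID PID PPID CPU PRI NI VSZ RSS WCHAN STAT TTY TIME COMMAND
    """
    blocked = []
    for line in ps_lines:
        # Split into at most 13 fields so COMMAND keeps its own spaces.
        fields = line.split(None, 12)
        if len(fields) < 13 or not fields[1].isdigit():
            continue  # header line or something we cannot parse
        pid = int(fields[1])
        wchan, stat, command = fields[8], fields[9], fields[12].strip()
        if stat.startswith("D"):
            blocked.append((pid, wchan, command))
    return blocked


if __name__ == "__main__":
    import subprocess

    # Run ps and print every process stuck in uninterruptible wait.
    out = subprocess.run(["ps", "axl"], capture_output=True, text=True).stdout
    for pid, wchan, command in blocked_processes(out.splitlines()):
        print(pid, wchan, command)
```

Fed the listing above, this would pick out the df, ls and nb* processes,
all waiting on vnlock, and skip the tcsh and ps entries.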

Kurt