NetBSD-Bugs archive
kern/40466: endless looping in ffs_sync() on WAPBL mounts...
>Number: 40466
>Category: kern
>Synopsis: endless looping in ffs_sync() on WAPBL mounts...
>Confidential: no
>Severity: critical
>Priority: high
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sat Jan 24 17:35:00 +0000 2009
>Originator: Greg Oster
>Release: NetBSD 5.99.7
>Organization:
>Environment:
System: NetBSD thog 5.99.7 NetBSD 5.99.7 (MONOLITHIC) #1: Sat Jan 24 11:06:25
CST 2009 oster@quad:/u1/devel/current2/src/sys/arch/i386/compile/MONOLITHIC
i386
Architecture: i386
Machine: i386
>Description:
Given a MONOLITHIC -current kernel (with COMPAT_50) and a 5.0_BETA
userland, attempt to mount a freshly newfs'ed partition with the
'-o log' option. The machine suddenly stops responding. Instrumentation
added to ffs_sync() shows that it is looping endlessly around the
"loop:" label.
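For readers unfamiliar with the pattern, here is a minimal user-space
sketch of a "restart the scan from the top" loop and of the kind of
restart counter used as instrumentation. It is NOT the ffs_sync() code
and not a diagnosis of this PR: struct node, flush_node(), sync_scan(),
mark_all_dirty() and the "stuck" flag are all hypothetical names, and
"a flush that never clears the dirty state" is only an assumed example
of how such a loop can fail to terminate.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an entry on a per-mount list. */
struct node {
	bool dirty;
	struct node *next;
};

/* Link an array of nodes into a list and mark every node dirty. */
static void
mark_all_dirty(struct node *nodes, size_t count)
{
	for (size_t i = 0; i < count; i++) {
		nodes[i].dirty = true;
		nodes[i].next = (i + 1 < count) ? &nodes[i + 1] : NULL;
	}
}

/*
 * Hypothetical flush. When "stuck" is true the node stays dirty,
 * modelling (as an assumption) a write that never completes.
 */
static void
flush_node(struct node *n, bool stuck)
{
	if (!stuck)
		n->dirty = false;
}

/*
 * Scan the list; after each flush, restart from the head because the
 * list may have changed. Returns the number of restarts, or -1 once
 * max_restarts is exceeded (the instrumentation that exposes the spin).
 */
static int
sync_scan(struct node *head, bool stuck, int max_restarts)
{
	int restarts = 0;
	struct node *n;

	assert(head != NULL);
loop:
	for (n = head; n != NULL; n = n->next) {
		if (!n->dirty)
			continue;
		flush_node(n, stuck);
		if (++restarts > max_restarts)
			return -1;	/* would otherwise loop forever */
		goto loop;
	}
	return restarts;
}
```

With three dirty nodes and a working flush, sync_scan() restarts once
per node and then returns. With stuck set, the same node is found dirty
on every pass and only the restart limit ends the loop.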
>How-To-Repeat:
thog# newfs /dev/rwd1f
/dev/rwd1f: 9765.6MB (20000000 sectors) block size 16384, fragment size 2048
using 53 cylinder groups of 184.27MB, 11793 blks, 23296 inodes.
super-block backups (for fsck_ffs -b #) at:
32, 377408, 754784, 1132160, 1509536, 1886912, 2264288, 2641664, 3019040,
...............................................................................
thog# fsck -f /dev/rwd1f
** /dev/rwd1f
** File system is already clean
** Last Mounted on
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
1 files, 1 used, 4921974 free (14 frags, 615245 blocks, 0.0% fragmentation)
thog# mount -o log /dev/wd1f /u2
Read from remote host thog: Connection reset by peer
At this point we break into ddb and discover that the mount_ffs
process is in various places in ffs_sync(): after telling ddb to
continue and breaking again, it is often in a different function.
Instrumentation added to ffs_sync() confirms that it is indeed
looping hard through the "loop:" label.
In the above case, / was also mounted as a logging filesystem.
Surprisingly, however, I can no longer trigger this bug:
http://gnats.netbsd.org/40361
with this kernel (i.e. having / mounted as non-log and /u2 mounted
as log now works again, so at least I can hack on logging code
without having to worry as much about wrecking / :-/ )
>Fix:
Please.