So, while investigating my WAPBL performance problems, it looks like I can
crash the machine (not reliably, but more often than not) with a simple
seq 1 3000 | xargs mkdir
command. I get the following backtrace in ddb (wetware OCR):
panic: wapbl_register_deallocation: out of resources
fatal breakpoint trap in supervisor mode
trap type 1 code 0 rip ffffffff8016f01d cs 8 rflags 246 cr2 ffff80011fc2d000
cpl 0 rsp fffffe811e0fe6f0
Stopped in pid 12551.1 (mkdir) at netbsd:breakpoint+0x5: leave
breakpoint() at netbsd:breakpoint+0x5
vpanic() at netbsd:vpanic+0x1f2
printf_nolog() at netbsd:printf_nolog
wapbl_register_inode() at netbsd:wapbl_register_inode
ffs_truncate() at netbsd:ffs_truncate+0x917
ufs_direnter() at netbsd:ufs_direnter+0x481
ufs_mkdir() at netbsd:ufs_mkdir+0x617
VOP_MKDIR() at netbsd:VOP_MKDIR+0x3b
do_sys_mkdir() at netbsd:do_sys_mkdir+0x10f
syscall() at netbsd:syscall+0xc4
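
For anyone who wants to repeat the test, the command above can be wrapped so
that the 3000 directories end up in one place and are easy to remove again.
This is only a minimal sketch: the scratch directory created with mktemp and
its name are assumptions of mine, not part of the original test, and it has to
live on the WAPBL-mounted file system for the test to mean anything.

```shell
#!/bin/sh
# Sketch of the reproduction: create 3000 directories in one go.
# Assumption: a scratch directory below the current directory (hypothetical
# name), so that everything can be removed with a single rm -r afterwards.
scratch=$(mktemp -d ./wapbl-repro.XXXXXX) || exit 1
cd "$scratch" || exit 1

# The command from the report, unchanged.
seq 1 3000 | xargs mkdir

# Count the entries that were actually created (strip wc's padding).
count=$(ls | wc -l | tr -d ' ')
echo "created $count directories in $scratch"
```

Run it from a directory on the file system under test. On a machine that
shows the problem, expect the panic rather than the final message.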
It's unreasonable to take a dump because that would take an estimated four
to five hours. Is there any reasonable way to get a dump out of a 16G box?
On reboot, while mounting one file system (NOT the one I was operating on when
the crash happened), the "replaying log to disk" step took several minutes.
I physically walked to the server to see whether the discs were
actually busy, and there was a strange pattern: Out of the five discs that
the RAID was built on, four were blinking at ~7Hz while the fifth was idle.
The position of the idle disc changed on a regular basis (about every two
seconds), but I could not find a pattern in how it moved around. Possibly
two discs were sometimes idle at the same time.
Any idea why that took so long? The file system in question is small.