NetBSD-Bugs archive
kern/47514: Multiple dump -X triggers kernel panic in fss_ioctl
>Number: 47514
>Category: kern
>Synopsis: Multiple dump -X triggers kernel panic in fss_ioctl
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Wed Jan 30 03:00:01 +0000 2013
>Originator: enami tsugutomo
>Release: NetBSD 6.0_STABLE
>Organization:
>Environment:
System: NetBSD rplaca.sm.sony.co.jp 6.0_STABLE NetBSD 6.0_STABLE (GENERIC) #2:
Mon Jan 7 16:53:59 JST 2013
enami%sigfpe.sm.sony.co.jp@localhost:/home/enami/src/netbsd-6/obj.amd64/sys/arch/amd64/compile/GENERIC
amd64
Architecture: x86_64
Machine: amd64
>Description:
I recently updated amanda from pkgsrc (from a version a few years old),
and the kernel has been panicking since then. It looks like the amanda
package in pkgsrc gained the ability to use dump -X, when possible,
last summer.
Here is the panic message and stacktrace (copied by hand):
uvm_fault(0xfffffe80bda3bd40, 0x0, 1) -> e
fatal page fault in supervisor mode
trap type 6 code 0 rip ffffffff804bf1af cs 8 rflags 10283 cr2 8 cpl 0 rsp fffffe8006d59820
kernel: page fault trap, code=0
Stopped in pid 1713.1 (dump) at netbsd:mutex_vector_enter+0x80: movq 18(%r15),%rax
db{0}> bt
mutex_vector_enter() at netbsd:mutex_vector_enter+0x80
fss_ioctl() at netbsd:fss_ioctl+0xed
VOP_IOCTL() at netbsd:VOP_IOCTL+0x3b
vn_ioctl() at netbsd:vn_ioctl+0x76
sys_ioctl() at netbsd:sys_ioctl+0x13c
syscall() at netbsd:syscall+0xc4
db{0}>
The value of %r15 is fffffffffffffff0
With my amanda configuration, up to 8 dumps run in parallel.
The system has two CPUs.
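The register value and the trap line above can be cross-checked with a little arithmetic. The following is only a sketch, assuming the displacement "18" in the hand-copied instruction is hexadecimal; the function names are made up for illustration and are not kernel code.

```c
#include <assert.h>
#include <stdint.h>

/* Effective address of "disp(%reg)"; unsigned arithmetic wraps modulo 2^64. */
static uint64_t effective_address(uint64_t reg, uint64_t disp)
{
	return reg + disp;
}

/* Hypothetical helper: recompute the fault address from the reported values. */
uint64_t fault_address_from_report(void)
{
	const uint64_t r15 = 0xfffffffffffffff0ULL;	/* %r15 as reported */
	const uint64_t disp = 0x18;	/* "18" read as hex (assumption) */
	uint64_t addr = effective_address(r15, disp);

	/* cr2 in the trap line is 8; the sum wraps around to that value. */
	assert(addr == 0x8);
	return addr;
}
```

Reading the displacement as decimal 18 would give address 0x2 instead, which does not match cr2, so the hexadecimal reading appears consistent with the fault being taken on the load through %r15.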
>How-To-Repeat:
Install amanda from pkgsrc and set it up to run multiple dumps in parallel.
>Fix:
I guess there is a race condition between fss_open and fss_close.
Here is a possible scenario:
A process calls fss_open while another process is in
fss_close (possible since the device driver is marked MPSAFE). In
fss_close, no lock is held while control is between
mutex_exit(&sc->slock) and fss_ioctl(dev, FSSIOCCLR...), for
example. So fss_open may return successfully during that window.
fss_close then detaches the device (destroying the mutexes and
freeing the softc) before the process that opened the fss device
issues the FSSIOCSET ioctl.
When that ioctl is issued later, it triggers the kernel panic.
The value of %r15 may indicate a destroyed mutex.
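
The suspected interleaving can be replayed step by step in a small user-space model. This is only a sketch of the scenario described above, not the driver's code: struct model_softc, model_open and replay_close_vs_open are hypothetical names, and a boolean stands in for the real mutex.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-unit driver state. */
struct model_softc {
	bool attached;	/* false once detached (mutexes destroyed, softc freed) */
	bool lock_held;	/* models the lock taken by the closing process */
};

/* In this model, open succeeds only on an attached, unlocked unit. */
static bool model_open(const struct model_softc *sc)
{
	return sc->attached && !sc->lock_held;
}

/*
 * Replay one close racing with one open.  Returns true when the opener
 * ends up holding an open on a unit that has since been detached, i.e.
 * when its later ioctl would touch a destroyed mutex.
 */
bool replay_close_vs_open(bool drop_lock_before_detach)
{
	struct model_softc sc = { .attached = true, .lock_held = true };
	bool opened;

	assert(sc.attached);

	if (drop_lock_before_detach)
		sc.lock_held = false;	/* closer: lock released early */

	opened = model_open(&sc);	/* opener runs inside the window */

	sc.lock_held = false;		/* closer: finishes ... */
	sc.attached = false;		/* ... and detaches the unit */

	return opened && !sc.attached;
}
```

In this model, replay_close_vs_open(true) reports the hazard, while replay_close_vs_open(false) does not, because the open is refused as long as the closer keeps the lock until the detach. A real fix would more likely make the opener wait or retry rather than fail; the model only illustrates why the unlocked window matters.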