Current-Users archive


qemu emulated machine crashes due to disk timeouts



My qemu disk images live on slow spinning-rust host disks. When the host
disk is busy (especially during the daily and security runs), my qemu VMs
see disk timeouts and end up crashing. This isn't great behaviour.

Is anyone else seeing this? Any workarounds? I haven't looked, but I assume
I'd be able to just bump the timeouts and/or retries.

Host is amd64, netbsd-9 from the last few days.
qemu guest is amd64, netbsd-current from the last day or two.

[ 13284.4654620] wd0: soft error (corrected) xfer 38
piixide0:0:0: lost interrupt
[ 13356.4860552]        type: ata tc_bcount: 65536 tc_skip: 0
piixide0:0:0: bus-master DMA error: missing interrupt, status=0x21
[ 13357.4631898] wd0: transfer error, downgrading to PIO mode 4
[ 13357.4631898] wd0b: DMA error writing fsbn 8552 of 8552-8679 (wd0 bn 3940775; cn 1924 tn 13 sn 7), xfer 38, retry 0
piixide0:0:0: lost interrupt
[ 13379.0527863]        type: ata tc_bcount: 16384 tc_skip: 49152
[ 13379.0527863] wd0b: device timeout writing fsbn 8648 of 8552-8679 (wd0 bn 3940871; cn 1924 tn 16 sn 7), xfer 38, retry 1
[ 13390.3089543] piixide0:0:0: timeout waiting for DRQ, st=0xd0, err=0x00
[ 13390.3238208] wd0b: device timeout writing fsbn 8552 of 8552-8679 (wd0 bn 3940775; cn 1924 tn 13 sn 7), xfer 38, retry 2
[ 13400.6328270] piixide0:0:0: timeout waiting for DRQ, st=0xd0, err=0x00
[ 13400.6328270] wd0b: device timeout writing fsbn 8552 of 8552-8679 (wd0 bn 3940775; cn 1924 tn 13 sn 7), xfer 38, retry 3
piixide0:0:0: lost interrupt
[ 13422.9312686]        type: ata tc_bcount: 65024 tc_skip: 49664
[ 13422.9312686] wd0b: device timeout writing fsbn 8553 of 8552-8679 (wd0 bn 3940776; cn 1924 tn 13 sn 8), xfer 38, retry 4
piixide0:0:0: lost interrupt
[ 13443.7290708]        type: ata tc_bcount: 65536 tc_skip: 49664
[ 13443.7290708] wd0b: device timeout writing fsbn 8552 of 8552-8679 (wd0 bn 3940775; cn 1924 tn 13 sn 7)
[ 13443.7290708] wd0b: error writing fsbn 8552 of 8552-8679 (wd0 bn 3940775; cn 1924 tn 13 sn 7)
[ 13461.1443717] piixide0:0:0: timeout waiting for DRQ, st=0x50, err=0x00
[ 13461.1443717] wd0b: device timeout writing fsbn 8720 of 8704-8831 (wd0 bn 3940943; cn 1924 tn 18 sn 15), xfer 38, retry 0
[ 13472.4372217] piixide0:0:0: timeout waiting for DRQ, st=0x50, err=0x00
[ 13472.4469933] wd0b: device timeout writing fsbn 8704 of 8704-8831 (wd0 bn 3940927; cn 1924 tn 17 sn 31), xfer 38, retry 1
[ 13482.7636985] piixide0:0:0: timeout waiting for DRQ, st=0x50, err=0x00
[ 13482.7636985] wd0b: device timeout writing fsbn 8704 of 8704-8831 (wd0 bn 3940927; cn 1924 tn 17 sn 31), xfer 38, retry 2
[ 13493.0950195] piixide0:0:0: timeout waiting for DRQ, st=0x50, err=0x00
[ 13493.1011057] wd0b: device timeout writing fsbn 8704 of 8704-8831 (wd0 bn 3940927; cn 1924 tn 17 sn 31), xfer 38, retry 3
[ 13493.1166960] fatal page fault in supervisor mode
[ 13493.1166960] trap type 6 code 0 rip 0xffffffff8021f05f cs 0x8 rflags 0x246 cr2 0xffffc98002326000 ilevel 0x8 rsp 0xffffffff81ae8b60
[ 13493.1166960] curlwp 0xffffffff8165f7c0 pid 0.1 lowest kstack 0xffffffff81ae42c0
[ 13493.1166960] panic: trap
[ 13493.1166960] cpu0: Begin traceback...
[ 13493.1166960] vpanic() at netbsd:vpanic+0x178
[ 13493.1166960] snprintf() at netbsd:snprintf
[ 13493.1166960] startlwp() at netbsd:startlwp
[ 13493.1166960] alltraps() at netbsd:alltraps+0xc3
[ 13493.1166960] wdc_ata_bio_start() at netbsd:wdc_ata_bio_start+0xcbd
[ 13493.1166960] ata_xfer_start() at netbsd:ata_xfer_start+0x4f
[ 13493.1166960] wdc_ata_bio_intr() at netbsd:wdc_ata_bio_intr+0x3b9
[ 13493.1166960] wdcintr() at netbsd:wdcintr+0x10a
[ 13493.1166960] intr_biglock_wrapper() at netbsd:intr_biglock_wrapper+0x36
[ 13493.1166960] Xhandle_ioapic_edge3() at netbsd:Xhandle_ioapic_edge3+0x6d
[ 13493.1166960] --- interrupt ---
[ 13493.1166960] Xspllower() at netbsd:Xspllower+0xe
[ 13493.1166960] sched_pstats() at netbsd:sched_pstats+0x1c2
[ 13493.1166960] uvm_scheduler() at netbsd:uvm_scheduler+0xb4
[ 13493.1166960] main() at netbsd:main+0x7a5
[ 13493.1166960] cpu0: End traceback...
[ 13493.1166960] Mutex error: mutex_vector_enter,509: assertion failed: !cpu_intr_p()

[ 13493.1166960] lock address : 0xffffb6a0a62590c0 type     :     sleep/adaptive
[ 13493.1166960] initialized  : 0xffffffff809d31d4
[ 13493.1166960] shared holds :                  0 exclusive:                  1
[ 13493.1166960] shares wanted:                  0 exclusive:                  0
[ 13493.1166960] relevant cpu :                  0 last held:                  0
[ 13493.1166960] relevant lwp : 0xffffffff8165f7c0 last held: 0xffffffff8165f7c0
[ 13493.1166960] last locked* : 0xffffffff809ee1a0 unlocked : 0xffffffff809f7182
[ 13493.1166960] owner field  : 0xffffffff8165f7c0 wait/spin:                0/0
[ 13493.1166960] Turnstile: no active turnstile for this lock.

[ 13493.1166960] Skipping crash dump on recursive panic
[ 13493.1166960] panic: LOCKDEBUG: Mutex error: mutex_vector_enter,509: assertion failed: !cpu_intr_p()
[ 13493.1166960] cpu0: Begin traceback...
[ 13493.1166960] vpanic() at netbsd:vpanic+0x178
[ 13493.1166960] snprintf() at netbsd:snprintf
[ 13493.1166960] lockdebug_more() at netbsd:lockdebug_more
[ 13493.1166960] mutex_enter() at netbsd:mutex_enter+0x656
[ 13493.1166960] suspendsched() at netbsd:suspendsched+0x19
[ 13493.1166960] cpu_reboot() at netbsd:cpu_reboot+0x46
[ 13493.1166960] sys_reboot() at netbsd:sys_reboot
[ 13493.1166960] vpanic() at netbsd:vpanic+0x181
[ 13493.1166960] snprintf() at netbsd:snprintf
[ 13493.1166960] startlwp() at netbsd:startlwp
[ 13493.1166960] alltraps() at netbsd:alltraps+0xc3
[ 13493.1166960] wdc_ata_bio_start() at netbsd:wdc_ata_bio_start+0xcbd
[ 13493.1166960] ata_xfer_start() at netbsd:ata_xfer_start+0x4f
[ 13493.1166960] wdc_ata_bio_intr() at netbsd:wdc_ata_bio_intr+0x3b9
[ 13493.1166960] wdcintr() at netbsd:wdcintr+0x10a
[ 13493.1166960] intr_biglock_wrapper() at netbsd:intr_biglock_wrapper+0x36
[ 13493.1166960] Xhandle_ioapic_edge3() at netbsd:Xhandle_ioapic_edge3+0x6d
[ 13493.1166960] --- interrupt ---
[ 13493.1166960] Xspllower() at netbsd:Xspllower+0xe
[ 13493.1166960] sched_pstats() at netbsd:sched_pstats+0x1c2
[ 13493.1166960] uvm_scheduler() at netbsd:uvm_scheduler+0xb4
[ 13493.1166960] main() at netbsd:main+0x7a5
[ 13493.1166960] cpu0: End traceback...
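
To get a feel for how long the guest actually waits before each retry, the
timestamps on the "timeout waiting for DRQ" lines above can be diffed. A
minimal Python sketch, assuming console lines formatted exactly like the ones
in this log; the function name drq_timeout_gaps is made up for illustration
and this only measures what the log shows, it does not identify which driver
timeout is responsible.

```python
import re

# Matches console lines such as:
# [ 13390.3089543] piixide0:0:0: timeout waiting for DRQ, st=0xd0, err=0x00
DRQ_TIMEOUT = re.compile(r"^\[\s*(\d+\.\d+)\]\s+\S+: timeout waiting for DRQ")


def drq_timeout_gaps(lines):
    """Return seconds elapsed between consecutive DRQ timeout messages."""
    stamps = []
    for line in lines:
        m = DRQ_TIMEOUT.match(line)
        if m:
            stamps.append(float(m.group(1)))
    # Pair each timestamp with the next one and take the difference.
    return [round(later - earlier, 3) for earlier, later in zip(stamps, stamps[1:])]


if __name__ == "__main__":
    # Two lines copied from the log above; feed a full console capture instead.
    sample = [
        "[ 13390.3089543] piixide0:0:0: timeout waiting for DRQ, st=0xd0, err=0x00",
        "[ 13400.6328270] piixide0:0:0: timeout waiting for DRQ, st=0xd0, err=0x00",
    ]
    print(drq_timeout_gaps(sample))
```

On the two st=0xd0 lines this comes out at a little over 10 seconds between
attempts, which at least gives a number to compare against how long the host
disk stalls.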

-- 
Paul Ripke
"Great minds discuss ideas, average minds discuss events, small minds
 discuss people."
-- Disputed: Often attributed to Eleanor Roosevelt. 1948.

