Port-xen archive

mpii zfs xen unstable on -current

From time to time, nowadays at least once a week, my dom0 panics.
Unfortunately that also terminates all domUs running on that machine, which is not the desired state of things for me.

[ 236025.4127414] panic: kernel diagnostic assertion "xs->resid == xs->datalen" failed: file "/hurz/src/sys/dev/pci/mpii.c", line 3207 
[ 236025.4127414] cpu0: Begin traceback...
[ 236025.4227374] vpanic() at netbsd:vpanic+0x177
[ 236025.4227374] kern_assert() at netbsd:kern_assert+0x4b
[ 236025.4227374] mpii_scsi_cmd_done() at netbsd:mpii_scsi_cmd_done+0x30b
[ 236025.4227374] mpii_intr() at netbsd:mpii_intr+0x21e
[ 236025.4227374] evtchn_do_event() at netbsd:evtchn_do_event+0x114
[ 236025.4227374] do_hypervisor_callback() at netbsd:do_hypervisor_callback+0x167
[ 236025.4327364] Xhandle_hypervisor_callback() at netbsd:Xhandle_hypervisor_callback+0x19
[ 236025.4327364] --- interrupt ---
[ 236025.4327364] hypercall_page() at netbsd:hypercall_page+0x3aa
[ 236025.4327364] idle_loop() at netbsd:idle_loop+0x146
[ 236025.4327364] cpu0: End traceback...

[ 236025.4327364] dumping to dev 168,9 (offset=33482590, size=0): not possible
[ 236025.4327364] rebooting...
(XEN) Hardware Dom0 shutdown: rebooting machine

I do have another machine with the same controller running -current rock-solid. It runs directly on the hardware, with no hypervisor involved, and it also boots from and runs on zfs.

The machine I want to fix is running NetBSD 9.99.97 (XEN3_DOM0) #4: Thu Jun 16 13:02:43 CEST 2022, built from sources on that same date.  This happened before with older -currents, so I suspect this is not a -current problem, but something with either xen or zfs.
The dom0 is running off a disk on the ciss controller (hardware raid), but all the domUs have their virtual disks in files on a zfs filesystem.  I also happen to run builds for the virtual machines in the dom0, and that is on a separate zfs filesystem.

Is there any way to find out if the crash is caused by a domU or happens in the dom0?
Should I use zvols instead of files?
Should I not use zfs at all?
Is there a better IT-mode controller for the HPE DL380g8?
Would it be fine to use the virtual disks of the ciss0 controller as zfs pool members?
Or should I switch to something else for dom0?

ciss0 at pci5 dev 0 function 0: HP Smart Array 12
ciss0: interrupting at msix5 vec 0
ciss0: 3 LDs, HW rev 1, FW 8.00/8.00, 64bit fifo rro, method perf 0x20000005
scsibus1 at ciss0: 3 targets, 1 lun per target
ciss0: normal state on 'ciss0:0' (online)
ciss0: normal state on 'ciss0:1' (online)
ciss0: normal state on 'ciss0:2' (online)

mpii0 at pci1 dev 0 function 0: Symbios Logic SAS2308 (rev. 0x05)
mpii0: interrupting at msix0 vec 0
mpii0: H220, firmware, MPI 2.0

Any help or pointers appreciated...

