Current-Users archive


ZFS disaster on -current



Hi,

On

NetBSD ymir 9.99.68 NetBSD 9.99.68 (GENERIC) #1: Tue Jun 23 22:53:46
BST 2020  sysbuild@ymir:/home/sysbuild/amd64/obj/home/sysbuild/src/sys/arch/amd64/compile/GENERIC
amd64

I suddenly got a panic in ZFS; it also occurred with the previous
kernel, so it must be something in the module rather than the kernel
itself. In single user I disabled zfs in /etc/rc.conf and was able to
complete the boot, though obviously without my two pools.

'modload solaris' didn't show any problem.

I set aside the contents of /etc/zfs and did 'modload zfs', which resulted in:
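To recap, the session up to this point looked roughly like the
following (the rc.conf setting 'zfs=NO' is my reading of "disabled
zfs in /etc/rc.conf"; the exact commands are from memory):

    # mount -u /                  (in single user, make / writable)
    # vi /etc/rc.conf             set zfs=NO, then continue the boot
    # modload solaris             loads without complaint
    # mv /etc/zfs /etc/zfs.save   set aside the cached pool config
    # modload zfs                 produces the warnings below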

.....

WARNING: ZFS on NetBSD is under development
pool redzone disabled for 'zio_buf_4096'
pool redzone disabled for 'zio_data_buf_4096'
pool redzone disabled for 'zio_buf_8192'
pool redzone disabled for 'zio_data_buf_8192'
pool redzone disabled for 'zio_buf_16384'
pool redzone disabled for 'zio_data_buf_16384'
pool redzone disabled for 'zio_buf_32768'
pool redzone disabled for 'zio_data_buf_32768'
pool redzone disabled for 'zio_buf_65536'
pool redzone disabled for 'zio_data_buf_65536'
pool redzone disabled for 'zio_buf_131072'
pool redzone disabled for 'zio_data_buf_131072'
pool redzone disabled for 'zio_buf_262144'
pool redzone disabled for 'zio_data_buf_262144'
pool redzone disabled for 'zio_buf_524288'
pool redzone disabled for 'zio_data_buf_524288'
pool redzone disabled for 'zio_buf_1048576'
pool redzone disabled for 'zio_data_buf_1048576'
pool redzone disabled for 'zio_buf_2097152'
pool redzone disabled for 'zio_data_buf_2097152'
pool redzone disabled for 'zio_buf_4194304'
pool redzone disabled for 'zio_data_buf_4194304'
pool redzone disabled for 'zio_buf_8388608'
pool redzone disabled for 'zio_data_buf_8388608'
pool redzone disabled for 'zio_buf_16777216'
pool redzone disabled for 'zio_data_buf_16777216'

I have no idea what that means; it is a first for me. ZFS has
otherwise been very reliable on this hardware so far: I keep the
Mercurial repository on a ZFS filesystem and build from it from time
to time. (The panicking kernel is from yesterday's cvs update,
though.)

A subsequent 'zpool import' reproduced the panic (this time without
dropping me into the debugger):


ZFS filesystem version: 5
uvm_fault(0xffffa97e4c3e1610, 0x0, 1) -> e
fatal page fault in supervisor mode
trap type 6 code 0 rip 0xffffffff81d49882 cs 0x8 rflags 0x10286 cr2
0xa0 ilevel 0 rsp 0xffffde819c16d760
curlwp 0xffffa97e3a41e140 pid 17394.17394 lowest kstack 0xffffde819c16a2c0
panic: trap
cpu0: Begin traceback...
vpanic() at netbsd:vpanic+0x152
snprintf() at netbsd:snprintf
startlwp() at netbsd:startlwp
alltraps() at netbsd:alltraps+0xc3
vdev_open() at zfs:vdev_open+0x9e
vdev_open_children() at zfs:vdev_open_children+0x39
vdev_root_open() at zfs:vdev_root_open+0x33
vdev_open() at zfs:vdev_open+0x9e
spa_load() at zfs:spa_load+0x38e
spa_tryimport() at zfs:spa_tryimport+0x86
zfs_ioc_pool_tryimport() at zfs:zfs_ioc_pool_tryimport+0x41
zfsdev_ioctl() at zfs:zfsdev_ioctl+0x8c1
nb_zfsdev_ioctl() at zfs:nb_zfsdev_ioctl+0x38
VOP_IOCTL() at netbsd:VOP_IOCTL+0x44
vn_ioctl() at netbsd:vn_ioctl+0xa5
sys_ioctl() at netbsd:sys_ioctl+0x550
syscall() at netbsd:syscall+0x26e
--- syscall (number 54) ---
netbsd:syscall+0x26e:
cpu0: End traceback...

The above panic did not leave a crash dump.

Earlier, when /etc/zfs was still populated, I had also obtained a
crash dump (with 'reboot 0x104'), as follows:

# crash -M netbsd.18.core -N netbsd.18
Crash version 9.99.68, image version 9.99.68.
crash: _kvm_kvatop(0)
Kernel compiled without options LOCKDEBUG.
System panicked: reboot forced via kernel debugger
Backtrace from time of crash is available.
crash> bt
_KERNEL_OPT_NARCNET() at 0
_KERNEL_OPT_NARCNET() at 0
sys_reboot() at sys_reboot
db_fncall() at db_fncall
db_command() at db_command+0x127
db_command_loop() at db_command_loop+0xa6
db_trap() at db_trap+0xe6
kdb_trap() at kdb_trap+0xe1
trap() at trap+0x2b7
--- trap (number 6) ---
vdev_disk_open.part.4() at vdev_disk_open.part.4+0x49a
vdev_open() at vdev_open+0x9e
vdev_open_children() at vdev_open_children+0x39
vdev_root_open() at vdev_root_open+0x33
vdev_open() at vdev_open+0x9e
spa_load() at spa_load+0x38e
spa_load_best() at spa_load_best+0x58
spa_open_common() at spa_open_common+0xc2
pool_status_check.part.25() at pool_status_check.part.25+0x1e
zfsdev_ioctl() at zfsdev_ioctl+0x80e
nb_zfsdev_ioctl() at nb_zfsdev_ioctl+0x38
VOP_IOCTL() at VOP_IOCTL+0x44
vn_ioctl() at vn_ioctl+0xa5
sys_ioctl() at sys_ioctl+0x550
syscall() at syscall+0x26e
--- syscall (number 54) ---
syscall+0x26e:
.....

Any idea what is going on? I've restarted a build, but the cvs log
doesn't show anything relevant as far as I can see.


Chavdar




