Current-Users archive


Re: file system panic on -current (amd64)



On Sat, 29 Jul 2017, coypu%sdf.org@localhost wrote:

On Fri, Jul 28, 2017 at 07:23:30PM +0800, Paul Goyette wrote:

(manually transcribed - my USB keyboard attached via XHCI doesn't
play nicely with ddb):

Sometimes I disable all uhci, ehci, xhci to get activity in ddb based on
PS/2 emulation. I think it worked once...

Since my keyboard is connected via xhci, it would not be helpful
to disable it.  :)

sysctl -w ddb.onpanic=0 will dump core, which is probably preferable.
And if it fails, it's sometimes visible in dmesg scrollback.

Yeah, I should set onpanic to zero until xhci works in ddb.
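For reference, a minimal sketch of making that setting persist across
reboots, assuming the stock /etc/sysctl.conf mechanism (the "?=" form
tells sysctl to ignore the variable quietly if the kernel does not have
it, e.g. a kernel built without DDB):

```
# /etc/sysctl.conf
# On panic, dump core and reboot instead of entering ddb.
ddb.onpanic?=0
```

The dump can then be retrieved after the reboot with savecore(8).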

	wapbl_stop
	wapbl_begin + 0x5b
	genfs_do_putpages + 0x1014
	VOP_PUTPAGES + 0x3a
	ffs_full_fsync + 0xe5
	ffs_fsync + 0x3a
	VOP_FSYNC + 0x3a
	sched_sync + 0x198

What's the panic reason?

I forgot to write that down.  :(

Do you have a netbsd.gdb for line numbers (especially of the last
part)?

Yeah I have netbsd.gdb but it doesn't help without a dump file.
Both ffs and wapbl are separately-loaded modules, and not part
of the netbsd.gdb image.

Based on an equivalent GENERIC kernel:

Frame 1:
(gdb) list *wapbl_stop
0xffffffff80a05227 is in wapbl_stop (/usr/src/sys/kern/vfs_wapbl.c:845).
840             rw_exit(&wl->wl_rwlock);
841     }
842
843     int
844     wapbl_stop(struct wapbl *wl, int force)
845     {
846             int error;
847
848             WAPBL_PRINTF(WAPBL_PRINT_OPEN, ("wapbl_stop called\n"));
849             error = wapbl_flush(wl, 1);
(gdb)

which indicates that the call frame points to the exit of the
routine _preceding_ wapbl_stop() - according to my sources, that
would be wapbl_discard().

Frame 2:
(gdb) list *wapbl_begin+0x5b
0xffffffff80a05631 is in wapbl_begin (/usr/src/sys/kern/vfs_wapbl.c:1235).
1230                        wl->wl_dealloccnt, wl->wl_dealloclim));
1231            }
1232
1233            if (doflush) {
1234                    int error = wapbl_flush(wl, 0);
1235                    if (error)
1236                            return error;
1237            }
1238
1239            rw_enter(&wl->wl_rwlock, RW_READER);
(gdb)

So it's at the return from wapbl_flush().

Since there are no explicit panic() checks in this path, I'm guessing
that the panic was either a result of dereferencing a NULL pointer for
wl, or some corruption of the contents of the wl->wl_rwlock member.
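
The frame-by-frame lookups above can be scripted. What follows is only a
rough sketch, not something used in this thread: frames_to_gdb is a
made-up helper name, and it assumes one hand-transcribed frame per line
in the "symbol + 0xNN" form shown in the trace, with the offset optional.

```shell
#!/bin/sh
# Hypothetical helper: turn transcribed ddb frames into gdb "list" commands.
# Reads frames on stdin, writes one gdb command per recognised frame.
frames_to_gdb() {
    # \1 = symbol name, \3 = hex offset (empty when the frame has none).
    # Lines that do not match the expected shape are skipped, not guessed at.
    sed -n -E 's/^[[:space:]]*([A-Za-z_][A-Za-z0-9_]*)[[:space:]]*(\+[[:space:]]*(0x[0-9a-fA-F]+))?[[:space:]]*$/list *\1+\3/p' |
    # Drop the dangling "+" left behind for frames without an offset.
    sed 's/+$//'
}
```

Something like "frames_to_gdb < trace.txt > cmds.gdb" followed by
"gdb -batch -x cmds.gdb netbsd.gdb" would then run all the lookups in
one go (trace.txt and cmds.gdb are placeholder names). A frame with a
malformed offset, one missing its 0x prefix for example, is silently
dropped, so compare the output against the trace. The usual caveat
applies: the symbols have to be in the image being examined, so for
modular ffs and wapbl this only works against an equivalent kernel that
has them built in.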


+------------------+--------------------------+----------------------------+
| Paul Goyette     | PGP Key fingerprint:     | E-mail addresses:          |
| (Retired)        | FA29 0E3B 35AF E8AE 6651 | paul at whooppee dot com   |
| Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd dot org |
+------------------+--------------------------+----------------------------+

