NetBSD-Bugs archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: kern/56982: mutex error (locking against myself) in wdc(4) NOIRQ case?



The following reply was made to PR kern/56982; it has been noted by GNATS.

From: Izumi Tsutsui <tsutsui%ceres.dti.ne.jp@localhost>
To: riastradh%NetBSD.org@localhost
Cc: gnats-bugs%NetBSD.org@localhost, tsutsui%ceres.dti.ne.jp@localhost
Subject: Re: kern/56982: mutex error (locking against myself) in wdc(4) NOIRQ
	 case?
Date: Sun, 28 Aug 2022 20:17:24 +0900

 > > [  26.1879870] mutex_spin_enter(a97148,4e92b8,a99f30,0,3ea16) + 6e
 > > [  26.1879870] wddone(a32688,a99f30,98a0a0,a99f30) + 256
 > 
 > Can you get a line number for wddone+0x256?
 
 This was INSTALL kernel without LOCKDEBUG,
 so I've tried NetBSD/mac68k 9.3 GENERIC
 + options DIAGNOSTIC, DEBUG, LOCKDEBUG + makeoptions DEBUG="-g"
 + ata NOIRQ fixes:
  https://github.com/NetBSD/src/commit/d034868f5308ed012da088cd2a6112443061753a
 + wd(4) attach verbose message fix:
  https://github.com/NetBSD/src/commit/4a0ce12e9902cb0dfb2f2d8ca9e2d665028704c0
 + sys/arch/m68k/m68k/m68k_trap.c r1.3 for DEBUG builds:
  https://github.com/NetBSD/src/commit/9fa6e849aae6c36d6840abd3ec9e23a528c914f0
 
 ---
 [   1.0000000] Loaded initial symtab at 0x427c2c, strtab at 0x4854ac, # entries 23929
 [   1.0000000] Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005,
 [   1.0000000]     2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017,
 [   1.0000000]     2018, 2019, 2020, 2021, 2022
 [   1.0000000]     The NetBSD Foundation, Inc.  All rights reserved.
 [   1.0000000] Copyright (c) 1982, 1986, 1989, 1991, 1993
 [   1.0000000]     The Regents of the University of California.  All rights reserved.
 
 [   1.0000000] NetBSD 9.3 (DEBUG) #0: Sun Aug 28 19:39:11 JST 2022
 [   1.0000000]  tsutsui@mirage:/s/netbsd-9/src/sys/arch/mac68k/compile/DEBUG
 [   1.0000000] Apple Macintosh Quadra 630  (68040)
  :
 [   1.0000000] wdc0 at obio0 (Quadra/Performa series IDE interface)
 [   1.0000000] atabus0 at wdc0 channel 0
  :
 [   7.8156563] wd0 at atabus0 drive 0
 [   7.8990036] wd0: <IBM-DALA-3360>
 [   7.9393950] wd0: 348 MB, 929 cyl, 16 head, 48 sec, 512 bytes/sect x 713472 sectors
  :
 # disklabel wd0
 [  30.5354939] Mutex error: mutex_vector_enter,477: locking against myself
 
 [  30.5354939] lock address : 0x0000000000a1b948 type     :               spin
 [  30.5354939] initialized  : 0x0000000000047dac
 [  30.5354939] shared holds :                  0 exclusive:                  1
 [  30.5354939] shares wanted:                  0 exclusive:                  1
 [  30.5354939] current cpu  :                  0 last held:                  0
 [  30.5354939] current lwp  : 0x0000000000a2a600 last held: 0x0000000000a2a600
 [  30.5354939] last locked* : 0x0000000000047592 unlocked : 000000000000000000
 [  30.5354939] owner field  : 0x0000000000058000 wait/spin:                0/1
 
 [  30.5354939] panic: LOCKDEBUG: Mutex error: mutex_vector_enter,477: locking against myself
 [  30.5354939] cpu0: Begin traceback...
 [  30.5354939] ?(?)
 [  30.5354939] db_panic(ffff2600,600,365283,938580,52358ec) at 0
 [  30.5354939] vpanic(36bec8,52358f8,5235928,22ae62,36bec8) + 162
 [  30.5354939] panic(36bec8,365231,365283,1dd,365163) + c
 [  30.5354939] lockdebug_abort1(?)
 [  30.5354939] vmem_alloc(365283,1dd,938580,ffff2600,365163,1) + b0
 [  30.5354939] lockdebug_wantlock(365283,1dd,a1b948,46938,0) + 11a
 [  30.5354939] mutex_enter(a1b948,3fbbb0,a1df30,0,8d60a0) + 68
 [  30.5354939] wddone(99c688,a1df30,8d60a0,a1df30) + 2c4
 [  30.5354939] wdc_ata_bio_done(?)
 [  30.5354939] ncr53c9x_intr(8d6098,a1df30,8d60a0) + 62
 [  30.5354939] wdc_ata_bio_intr(?)
 [  30.5354939] ncr53c9x_intr(8d6098,a1df30,0,1,a1df30) + 1d4
 [  30.5354939] wdc_ata_bio_poll(8d6098,a1df30,0,1,4caf4) + 1c
 [  30.5354939] ata_xfer_start(a1df30) + 112
 [  30.5354939] atastart(?)
 [  30.5354939] ata_exec_xfer(8d6098,a1df30,a1df30,a1b808,5235b14) + 364
 [  30.5354939] wdc_ata_bio(99a008,a1df30) + 62
 [  30.5354939] wdstart1(?)
 [  30.5354939] ncr53c9x_msgin(a1b808,a20aa0,a1df30,a1b8a0,1f675a) + 212
 [  30.5354939] wd_diskstart(99c688,a20aa0,a1b8a0,963990,c) + c4
 [  30.5354939] dk_start(?)
 [  30.5354939] dk_strategy(a1b808,a20aa0) + ec
 [  30.5354939] wdstrategy(a20aa0,4200,2,c,0) + 52
 [  30.5354939] readdisklabel.part.1(?)
 [  30.5354939] readdisklabel(0,3102,473e0,90fbc8,8e21c8) + 62
 [  30.5354939] dk_getdisklabel(a1b808,0,3102) + 7e
 [  30.5354939] dk_open(a1b808,0,3102,1,2000,a2a600) + ce
 [  30.5354939] wdopen(?)
 [  30.5354939] cdev_open(0,3102,1,2000,a2a600) + 8e
 [  30.5354939] spec_open(5235d4c,377f34,a47dd4,1,921ee0) + 13c
 [  30.5354939] VOP_OPEN(a47dd4,1,921ee0) + 2c
 [  30.5354939] vn_open(5235e68,1,0) + 286
 [  30.5354939] do_open(a2a600,0,8df100,0,0,5235efc) + 92
 [  30.5354939] do_sys_openat(?)
 [  30.5354939] sys_fsetxattr(a2a600,ffffff9c,f104,0,0,5235efc) + e2
 [  30.5354939] sys_open(a2a600,5235f38,5235f30,19,0) + 1e
 [  30.5354939] syscall_plain(5,a2a600,5235fb4,f104,0) + d2
 [  30.5354939] syscall(5) + 70
 [  30.5354939] trap0() + e
 [  30.5354939] cpu0: End traceback...
 Stopped in pid 5.1 (disklabel) at       netbsd:cpu_Debugger+0x6:        unlk
 a6
 db>
 ---
 
 > $ gdb ./netbsd.gdb
 > (gdb) info line *(wddone+0x256)
 
 ---
 (gdb) info line *(wddone+0x2c4)
 Line 1017 of "../../../../dev/ata/wd.c" starts at address 0x46934 <wddone+706>
    and ends at 0x46938 <wddone+710>.
 (gdb) 
 ---
 
 This is here:
  https://nxr.netbsd.org/xref/src/sys/dev/ata/wd.c?r=1.452.2.2#1017
 ---
     862 static void
     863 wddone(device_t self, struct ata_xfer *xfer)
     864 {
      :
    1015 	ata_free_xfer(wd->drvp->chnl_softc, xfer);
    1016 
    1017 	mutex_enter(&wd->sc_lock);
    1018 	wd->inflight--;
    1019 	mutex_exit(&wd->sc_lock);
    1020 	dk_done(dksc, bp);
    1021 	dk_start(dksc, NULL);
    1022 }
 ---
 
 > Can you also show the full lockdebug info?  It looks like it got
 > garbled in the console output.
 > 
 > db> show lock 0x0000000000a97148
 
 ---
 db> show lock 0x0000000000a1b948
 lock address : 0x0000000000a1b948 type     :               spin
 initialized  : 0x0000000000047dac
 shared holds :                  0 exclusive:                  1
 shares wanted:                  0 exclusive:                  1
 current cpu  :                  0 last held:                  0
 current lwp  : 0x0000000000a2a600 last held: 0x0000000000a2a600
 last locked* : 0x0000000000047592 unlocked : 000000000000000000
 owner field  : 0x0000000000058000 wait/spin:                0/1
 db>
 ---
 
 > This should display two lines of particular interest (with different
 > addresses, obviously):
 > 
 > initialized  : 0x0000000000abcdef
 > ...
 > last locked* : 0x0000000000123456 unlocked : 0x0000000abcd0123
 > 
 > Can you show line numbers for the `initialized' and `last locked'
 > addresses?
 > 
 > (gdb) info line *(0x0000000000abcdef)
 > (gdb) info line *(0x0000000000123456)
 
 >> initialized  : 0x0000000000047dac
 ---
 (gdb) info line *(0x0000000000047dac)
 Line 323 of "../../../../dev/ata/wd.c" starts at address 0x47dac <wdattach+46>
    and ends at 0x47db0 <wdattach+50>.
 ---
 
 Maybe this one:
  https://nxr.netbsd.org/xref/src/sys/dev/ata/wd.c?r=1.452.2.2#319
 ---
     305 static void
     306 wdattach(device_t parent, device_t self, void *aux)
     307 {
      :
     319 	mutex_init(&wd->sc_lock, MUTEX_DEFAULT, IPL_BIO);
     320 #ifdef WD_SOFTBADSECT
     321 	SLIST_INIT(&wd->sc_bslist);
     322 #endif
     323 	wd->atabus = adev->adev_bustype;
 ---
 
 
 >> last locked* : 0x0000000000047592
 ---
 (gdb) info line *(0x0000000000047592)
 Line 789 of "../../../../dev/ata/wd.c"
    starts at address 0x47592 <wd_diskstart+36>
    and ends at 0x4759a <wd_diskstart+44>.
 ---
 
 Also looks this:
  https://nxr.netbsd.org/xref/src/sys/dev/ata/wd.c?r=1.452.2.2#787
 ---
     775 static int
     776 wd_diskstart(device_t dev, struct buf *bp)
     777 {
     778 	struct wd_softc *wd = device_private(dev);
     779 #ifdef ATADEBUG
     780 	struct dk_softc *dksc = &wd->sc_dksc;
     781 #endif
     782 	struct ata_xfer *xfer;
     783 	struct ata_channel *chp;
     784 	unsigned openings;
     785 	int ticks;
     786 
     787 	mutex_enter(&wd->sc_lock);
     788 
     789 	chp = wd->drvp->chnl_softc;
     790 
     791 	ata_channel_lock(chp);
 ---
 Izumi Tsutsui
 


Home | Main Index | Thread Index | Old Index