NetBSD-Bugs archive


Re: port-amd64/54217: Biostar X370GT5 may fail to boot with SATA controller enabled



The following reply was made to PR port-amd64/54217; it has been noted by GNATS.

From: "John D. Baker" <jdbaker%consolidated.net@localhost>
To: gnats-bugs%NetBSD.org@localhost
Cc: 
Subject: Re: port-amd64/54217: Biostar X370GT5 may fail to boot with SATA
 controller enabled
Date: Fri, 31 May 2019 12:33:23 -0500 (CDT)

 I see what appears to be the same panic on an older HP Pavilion system
 that I sometimes use to netboot -current, mostly for testing "nouveau"
 (it's the one machine on which nouveau has always worked).
 
 (The machine normally boots a recent Linux Mint from local disk.)
 
 NetBSD dpe2850c.technoskunk.fur 8.99.41 NetBSD 8.99.41 (GENERIC) #255: Thu May 23 23:15:52 CDT 2019 sysop%yggdrasil.technoskunk.fur@localhost:/r0/build/current/obj/amd64/sys/arch/amd64/compile/GENERIC amd64
 
 NetBSD 8.99.41 (GENERIC) #255: Thu May 23 23:15:52 CDT 2019
         sysop%yggdrasil.technoskunk.fur@localhost:/r0/build/current/obj/amd64/sys/arch/amd64/compile/GENERIC
 total memory = 8191 MB
 avail memory = 7927 MB
 WARNING: module error: module `nfs' pushed by boot loader already exists
 timecounter: Timecounters tick every 10.000 msec
 Kernelized RAIDframe activated
 running cgd selftest aes-xts-256 aes-xts-512 done
 userconf: configure system autoconfiguration:
 uc> disable nouveau
 nouveau* disabled
 uc> exit
 Continuing...
 timecounter: Timecounter "i8254" frequency 1193182 Hz quality 100
 HP-Pavilion NP218AA-ABA p6142p ( )
 [...]
 cpu0 at mainbus0 apid 0
 cpu0: AMD Phenom(tm) 9650 Quad-Core Processor, id 0x100f23
 cpu0: package 0, core 0, smt 0
 cpu1 at mainbus0 apid 1
 cpu1: AMD Phenom(tm) 9650 Quad-Core Processor, id 0x100f23
 cpu1: package 0, core 1, smt 0
 cpu2 at mainbus0 apid 2
 cpu2: AMD Phenom(tm) 9650 Quad-Core Processor, id 0x100f23
 cpu2: package 0, core 2, smt 0
 cpu3 at mainbus0 apid 3
 cpu3: AMD Phenom(tm) 9650 Quad-Core Processor, id 0x100f23
 cpu3: package 0, core 3, smt 0
 
 I don't remember exactly when this started, but recent -current (possibly
 since about 8.99.35) panics during disk detection with the following
 (hand-transcribed, as the machine has no serial port):
 
 [...]
 wd1 at atabus2 drive 0
 panic: kernel diagnostic assertion "mutex_owned(&chp->ch_lock)" failed: file "/x/current/src/sys/dev/ata/ata_subr.c", line 275
 cpu0: Begin traceback...
 vpanic() at netbsd:vpanic+0x160
 stge_eeprom_wait.isra.4() at netbsd:stge_eeprom_wait.isra.4
 ahci_reset_drive() at netbsd:ahci_reset_drive+0x2b
 wd_get_params.constprop.5() at netbsd:wd_get_params.constprop.5+0x9a
 wdattach() at netbsd:wdattach+0x104
 config_attach_loc() at netbsd:config_attach_loc+0x1a5
 config_found_sm_loc() at netbsd:config_found_sm_loc+0x48
 atabusconfig_thread() at netbsd:atabusconfig_thread+0x2f1
 cpu0: End traceback...
 fatal breakpoint trap in supervisor mode
 trap type 1 code 0 rip 0xffffffff8021ddad cs 0x8 rflags 0x202 cr2 0 ilevel 0 rsp 0xffffba80afa37cb0
 curlwp 0xffff9b4309e29240 pid 0.83 lowest kstack 0xffffba80afa342c0
 Stopped in pid 0.83 (system) at netbsd:breakpoint+0x5:  leave
 db{0}> 
 
 The machine has only one disk, wd0, so the "wd1 at atabus2 drive 0" is
 spurious.  atabus2 should instead get atapibus0 and cd0.  Compare a
 dmesg.boot from a boot that succeeded:
 
 [...]
 ahcisata0 at pci0 dev 9 function 0: NVIDIA nForce MCP77 AHCI Controller (rev. 0xa2)
 LSA0: Picked IRQ 21 with weight 1
 ahcisata0: 64-bit DMA
 ahcisata0: ignoring broken port multiplier support
 ahcisata0: AHCI revision 1.20, 4 ports, 32 slots, CAP 0xe3209f03<PMD,ISS=0x2=Gen2,SCLO,SAL,SSNTF,SNCQ,S64A>
 ahcisata0: interrupting at ioapic0 pin 21
 atabus0 at ahcisata0 channel 0
 atabus1 at ahcisata0 channel 1
 atabus2 at ahcisata0 channel 2
 atabus3 at ahcisata0 channel 3
 [...]
 ahcisata0 port 1: device present, speed: 3.0Gb/s
 ahcisata0 port 2: device present, speed: 1.5Gb/s
 autoconfiguration error: ahcisata0 port 2: clearing WDCTL_RST failed for drive 0
 autoconfiguration error: ahcisata0 port 1: clearing WDCTL_RST failed for drive 0
 ehci1: handing over low speed device on port 2 to ohci1
 wd0 at atabus1 drive 0
 wd0: <WDC WD6400AAKS-65A7B2>
 wd0: drive supports 16-sector PIO transfers, LBA48 addressing
 wd0: 596 GB, 1240341 cyl, 16 head, 63 sec, 512 bytes/sect x 1250263728 sectors
 wd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 6 (Ultra/133), NCQ (32 tags)
 wd0(ahcisata0:1:0): using PIO mode 4, DMA mode 2, Ultra-DMA mode 6 (Ultra/133) (using DMA), NCQ (31 tags)
 atapibus0 at atabus2: 1 targets
 cd0 at atapibus0 drive 0: <ATAPI   DVD A  DH16A6L-C, 249920422616, ZHCH> cdrom removable
 cd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 5 (Ultra/100)
 cd0(ahcisata0:2:0): using PIO mode 4, DMA mode 2, Ultra-DMA mode 5 (Ultra/100) (using DMA)
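
The difference between the two boots can also be checked mechanically. The following is a minimal Python sketch, not part of the original report: the helper name `children_of` and the regular expression are illustrative assumptions, and the sample text is copied from the excerpts quoted in this message. It lists which devices attached directly to a given parent in each dmesg.

```python
import re

# Matches autoconfiguration attach lines such as "wd0 at atabus1 drive 0".
ATTACH_RE = re.compile(r"^\s*(\w+) at (\w+)\b")


def children_of(dmesg_text, parent):
    """Return the device names that attached directly to `parent`."""
    found = []
    for line in dmesg_text.splitlines():
        m = ATTACH_RE.match(line)
        if m and m.group(2) == parent:
            found.append(m.group(1))
    return found


# Excerpts copied from the two boots quoted in this message.
FAILED_BOOT = """\
wd1 at atabus2 drive 0
"""

GOOD_BOOT = """\
atabus2 at ahcisata0 channel 2
wd0 at atabus1 drive 0
atapibus0 at atabus2: 1 targets
cd0 at atapibus0 drive 0: <ATAPI   DVD A  DH16A6L-C, 249920422616, ZHCH> cdrom removable
"""

print(children_of(FAILED_BOOT, "atabus2"))  # child seen in the panicking boot
print(children_of(GOOD_BOOT, "atabus2"))    # child seen in the successful boot
```

On these excerpts the failed boot shows wd1 under atabus2 while the successful boot shows atapibus0 there, matching the observation that the optical drive on port 2 was misdetected as a disk.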
 
 Curiously, stge(4) code always seems to show up in the traceback,
 although the machine does not have an stge(4) interface (it has nfe(4)).
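
One detail visible in the transcribed traceback is that the stge frame is the only one printed without a "+0x..." offset. The following minimal Python sketch (the helper name `frames_without_offset` and the regular expression are hypothetical, and the sample is the first four frames quoted above) picks out such frames. Whether the missing offset explains the stge(4) appearance is not established in this report.

```python
import re

# Matches DDB traceback frames such as
# "ahci_reset_drive() at netbsd:ahci_reset_drive+0x2b".
# The "+0x..." offset is optional so frames printed without one still match.
FRAME_RE = re.compile(r"^(\S+)\(\) at netbsd:(\S+?)(?:\+0x([0-9a-fA-F]+))?$")


def frames_without_offset(traceback_text):
    """Return the symbols of frames that carry no "+0x..." offset."""
    names = []
    for line in traceback_text.splitlines():
        m = FRAME_RE.match(line.strip())
        if m and m.group(3) is None:
            names.append(m.group(2))
    return names


# First four frames, copied from the hand-transcribed traceback above.
TRACEBACK = """\
vpanic() at netbsd:vpanic+0x160
stge_eeprom_wait.isra.4() at netbsd:stge_eeprom_wait.isra.4
ahci_reset_drive() at netbsd:ahci_reset_drive+0x2b
wd_get_params.constprop.5() at netbsd:wd_get_params.constprop.5+0x9a
"""

print(frames_without_offset(TRACEBACK))
```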
 
 The panic occurs during the period when "nouveau" is attaching so the
 screen is blank.  At first I thought it was a "nouveau" regression so
 I disabled "nouveau" via userconf.  That's when I saw the actual panic.
 
 
 The panic seems most likely after a cold power-on.  If "atabus" is
 disabled, the machine boots (via netboot).  After a warm reboot, it may
 boot successfully without disabling any devices.
 
 The machine has a PS/2 keyboard, so even though the screen is black
 (panic occurs while "nouveau" is taking over the display) one can type
 blindly at the DDB prompt to reset the machine and try again.
 
 The "stge(4)" note may be a red herring.  Disabling stge in userconf has
 no effect on the panic.
 
 -- 
 |/"\ John D. Baker, KN5UKS               NetBSD     Darwin/MacOS X
 |\ / jdbaker[snail]consolidated[flyspeck]net  OpenBSD            FreeBSD
 | X  No HTML/proprietary data in email.   BSD just sits there and works!
 |/ \ GPGkeyID:  D703 4A7E 479F 63F8 D3F4  BD99 9572 8F23 E4AD 1645
 

