Current-Users archive


Re: New panic in wdc_ata_bio_intr



Well, that was a good one. Running just fine now:

~ uname -a
NetBSD nt61p.lorien.lan 8.99.4 NetBSD 8.99.4 (GENERIC) #1: Mon Oct 16 20:01:05 BST 2017  sysbuild@nt61p.lorien.lan:/home/sysbuild/src/sys/arch/amd64/compile/GENERIC amd64
~ dmesg | grep wd0
wd0 at atabus0 drive 0
wd0: <Hitachi HTS725032A9A364>
wd0: drive supports 16-sector PIO transfers, LBA48 addressing
wd0: 298 GB, 620181 cyl, 16 head, 63 sec, 512 bytes/sect x 625142448 sectors
wd0: 32-bit data port
wd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 5 (Ultra/100), NCQ (32 tags)
wd0(piixide0:0:0): using PIO mode 4, Ultra-DMA mode 5 (Ultra/100) (using DMA)
boot device: wd0
root on wd0a dumps on wd0b
~ atactl wd0 identify
Model: Hitachi HTS725032A9A364, Rev: PC3OCH0A, Serial #: 110320PCKC04BPJ53MLK
World Wide Name: 5000CCA645DE827F
Device type: ATA, fixed
Capacity 320 Gbytes, 625142448 sectors, 512 bytes/sector
Cylinders: 16383, heads: 16, sec/track: 63
Command queue depth: 32
Device capabilities:
        DMA
        LBA
        IORDY operation
        IORDY disabling
Device supports following standards:
ATA-2 ATA-3 ATA-4 ATA-5 ATA-6 ATA-7 ATA-8
Command set support:
        NOP command (enabled)
        READ BUFFER command (enabled)
        WRITE BUFFER command (enabled)
        Look-ahead (enabled)
        Write cache (enabled)
        Power Management feature set (enabled)
        Security Mode feature set (disabled)
        SMART feature set (enabled)
        FLUSH CACHE EXT command (enabled)
        FLUSH CACHE command (enabled)
        Device Configuration Overlay feature set (enabled)
        48-bit Address feature set (enabled)
        Advanced Power Management feature set (enabled)
        DOWNLOAD MICROCODE command (enabled)
        World Wide Name
        General Purpose Logging feature set
        SMART self-test
        SMART error logging
Serial ATA capabilities:
        1.5Gb/s signaling
        3.0Gb/s signaling
        Native Command Queuing
        PHY Event Counters
Serial ATA features:
        DMA Setup Auto Activate (disabled)
        Device-Initiated Interface Power Managment (disabled)
        Software Settings Preservation (enabled)

Anything else to test? 

Chavdar Ivanov

On Mon, 16 Oct 2017 at 19:07 Jaromír Doleček <jaromir.dolecek%gmail.com@localhost> wrote:
Okay, can you try the following patch? It puts back a flag for IRQ handling. If it works, I might have an idea what's happening. I think there is some rogue interrupt disturbing the state.

If it doesn't work, can you please try compiling a kernel with ATADEBUG and setting atadebug_mask (possibly via ddb during boot) to 0x40?

Jaromir

2017-10-15 23:10 GMT+02:00 Chavdar Ivanov <ci4ic4%gmail.com@localhost>:
Sorry, it still crashes the same way. I made sure everything was updated before trying; I have:

ident /netbsd  | grep wdc
     $NetBSD: atapi_wdc.c,v 1.128 2017/10/10 21:37:49 jdolecek Exp $
     $NetBSD: ata_wdc.c,v 1.108 2017/10/15 11:27:14 jdolecek Exp $
     $NetBSD: wdc_isa.c,v 1.60 2017/10/07 16:05:32 jdolecek Exp $
     $NetBSD: wdc_pcmcia.c,v 1.125 2017/10/07 16:05:33 jdolecek Exp $
     $NetBSD: wdc.c,v 1.285 2017/10/15 18:02:33 jdolecek Exp $

and the panic is exactly the same. 

I am sure I will sort out my problem on this particular machine if I swap the internal SSD and the one in the DVD bay, placing the NetBSD root in the proper place, but nevertheless the panic may indicate some other unfinished work, so I shall keep it as it is for testing. 

Chavdar Ivanov 

On Sun, 15 Oct 2017 at 19:03 Jaromír Doleček <jaromir.dolecek%gmail.com@localhost> wrote:
Hi,

should be fixed in rev. 1.285 of dev/ic/wdc.c, can you please check?

Jaromir

2017-10-14 17:48 GMT+02:00 Chavdar Ivanov <ci4ic4%gmail.com@localhost>:
It still panics the same way, no difference. 

On my other laptop, an HP EliteBook, I don't have the problem at all; it occurs only on the two T61p's (one of which stopped working a week ago, though).

Chavdar Ivanov 


On Sat, 14 Oct 2017 at 15:45 Jaromír Doleček <jaromir.dolecek%gmail.com@localhost> wrote:
Sorry, here is the fixed patch.

2017-10-14 16:23 GMT+02:00 Jaromír Doleček <jaromir.dolecek%gmail.com@localhost>:
Can you try attached patch?

Jaromir

2017-10-11 1:04 GMT+02:00 Chavdar Ivanov <ci4ic4%gmail.com@localhost>:
The timeouts when running under VirtualBox disappeared, but of course the panic on my T61p remains.

Chavdar Ivanov

On Tue, 10 Oct 2017 at 22:40 Jaromír Doleček <jaromir.dolecek%gmail.com@localhost> wrote:
Hey,

can you try with dev/scsipi/atapi_wdc.c 1.128? That should resolve the timeouts for atapi, at least it did for me.

Jaromir

2017-10-10 8:08 GMT+02:00 Rares Aioanei <bsdlisten%gmail.com@localhost>:
I get that also on VBox, except it doesn't try to add cd0a as a swap
device, nor does it show an endless stream of "lost interrupt"
messages; eventually I get a login prompt. This is with yesterday's
latest -CURRENT.

On Sun, Oct 8, 2017 at 5:17 PM, Chavdar Ivanov <ci4ic4%gmail.com@localhost> wrote:
> I tried the same kernel on a VirtualBox guest - it doesn't crash, but one
> gets endless
>
> piixide0:1:0: lost interrupt
>         type: atapi tc_bcount: 0 tc_skip: 0
>
> stream of messages. Also /etc/rc.d/swap2 start hangs while trying to add
> /dev/cd0a as a dump device... as shown by ktruss.
>
> Weird.
>
> Chavdar
>
> On Sun, 8 Oct 2017 at 11:55 Chavdar Ivanov <ci4ic4%gmail.com@localhost> wrote:
>>
>> System updated about two hours ago. I am getting:
>>
>> ....
>> wd0 at atabus0 drive 0
>> wd0: <Hitachi HTS725032A9A364>
>> wd0: drive supports 16-sector PIO transfers, LBA48 addressing
>> wd0: 298 GB, 620181 cyl, 16 head, 63 sec, 512 bytes/sect x 625142448 sectors
>> piixide0:0:0: bad state 0 in wdc_ata_bio_intr
>> panic: wdc_ata_bio_intr: bad state
>> fatal breakpoint trap in supervisor mode
>> trap type 1 code 0 rip 0xffffffff8021c0c5 cs 0x8 rflags 0x246 cr2 0 ilevel 0x8 rsp 0xffffe40040003c38
>> curlwp 0xffffe4013bb27840 pid 0.2 lowest kstack 0xffffe400400002c0
>> Stopped at pid 0.2 (system) at netbsd:breakpoint+0x5: leave
>> db{0}> bt
>> breakpoint() at netbsd:breakpoint+0x5
>> vpanic() at netbsd:vpanic+0x140
>> snprintf() at netbsd:snprintf
>> wdc_ata_bio_poll() at netbsd:wdc_ata_bio_poll
>> intr_biglock_wrapper() at netbsd:intr_biglock_wrapper+0x1d
>> Xintr_ioapic_edge10() at netbsd:Xintr_ioapic_edge10+0xee
>> --- interrupt ---
>> x86_mwait() at netbsd:x86_mwait+0xd
>> acpicpu_cstate_idle_enter() at netbsd:acpicpu_cstate_idle_enter+0xdb
>> acpicpu_cstate_idle() at netbsd:acpicpu_cstate_idle+0xb6
>> idle_loop() at netbsd:idle_loop+0x18c
>> db{0}>
>> ....
>>
>> (that is on my usual ThinkPad T61p).
>>
>> Couldn't get a crash dump.
>>
>> Chavdar Ivanov
>>
>






