NetBSD-Bugs archive


kern/56745: ATA bus interrupt handling causes permanent channel freeze



>Number:         56745
>Category:       kern
>Synopsis:       ATA bus interrupt handling causes permanent channel freeze
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sun Mar 13 15:15:00 +0000 2022
>Originator:     Konrad Schroder
>Release:        9.2
>Organization:
University of Washington
>Environment:
NetBSD apu1c.bastion.coral.washington.edu 9.2_STABLE NetBSD 9.2_STABLE (APU1C) #24: Sat Mar 12 21:45:40 PST 2022  root%backup.coral.washington.edu@localhost:/backups/ovar/src-9/sys/arch/i386/compile/obj/APU1C i386

>Description:
When a read error occurs (and possibly at other times), ata_thread_run() can be called twice with flags=0 (wake up the thread, do not wait) before the worker thread runs.  This leaves an extra freeze on the channel, because the second wakeup is not accounted for.  In a kernel built with

options ATADEBUG
options ATADEBUG_WD_MASK=0x38
options ATADEBUG_MASK=0x38

this manifests as

[ 11284.1587904] ata_channel_freeze_locked(chp=0xc3cf0120) -> 1
[ 11284.1787594] ata_channel_thaw_locked(chp=0xc3cf0120) -> 0
[ 11284.1787594] ata_thread_run flags 0x0 ch_flags 0x0 type 0x8000 arg 0x481
[ 11284.1787594] ata_channel_freeze_locked(chp=0xc3cf0120) -> 1                         -- interrupt is over, channel has flags 0x8000
[ 11284.2252056] ata_channel_freeze_locked(chp=0xc3cf0120) -> 2
[ 11284.2371669] ata_channel_thaw_locked(chp=0xc3cf0120) -> 1
[ 11284.2371669] ata_thread_run flags 0x0 ch_flags 0x8000 type 0x8000 arg 0x8451
[ 11284.2371669] ata_channel_freeze_locked(chp=0xc3cf0120) -> 2                         -- this extra freeze will never be released

At this point wd0 effectively no longer functions.
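
The counting problem can be illustrated with a small user-space model.  This is only a sketch, not the code in sys/dev/ata/ata.c: the struct, the function names and REQ_RESET are hypothetical, and the worker is assumed to clear one pending flag and thaw once for it, which is what the trace above suggests.  The guarded variant mirrors the check added by the patch under >Fix:.

```c
/* Hypothetical stand-in for the request bit shown as type 0x8000 in the log. */
#define REQ_RESET 0x8000u

/* Toy channel: only the freeze counter and the pending-request flags. */
struct toy_chan {
	int      freeze_cnt;
	unsigned flags;
};

/* Unpatched scheduling path: every request freezes the channel again. */
static void
schedule_unguarded(struct toy_chan *c, unsigned type)
{
	c->freeze_cnt++;
	c->flags |= type;
}

/* Patched scheduling path: freeze only if this type is not already pending. */
static void
schedule_guarded(struct toy_chan *c, unsigned type)
{
	if (!(c->flags & type)) {
		c->freeze_cnt++;
		c->flags |= type;
	}
}

/* Assumed worker behaviour: a pending flag is cleared and thawed exactly once. */
static void
worker_service(struct toy_chan *c, unsigned type)
{
	if (c->flags & type) {
		c->flags &= ~type;
		c->freeze_cnt--;
	}
}

/*
 * Two requests of the same type arrive before the worker runs.
 * Returns the freeze count left over after the worker has serviced them.
 */
int
leftover_freezes(int guarded)
{
	struct toy_chan c = { 0, 0 };

	for (int i = 0; i < 2; i++) {
		if (guarded)
			schedule_guarded(&c, REQ_RESET);
		else
			schedule_unguarded(&c, REQ_RESET);
	}
	worker_service(&c, REQ_RESET);
	return c.freeze_cnt;
}
```

In this model leftover_freezes(0) returns 1 (one freeze is never released, as in the trace), while leftover_freezes(1) returns 0.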

>How-To-Repeat:
Get a computer with a flaky wd and exercise it.  There may be other requirements to ensure that you get two interrupts before the thread runs, but I can trigger it easily on an APU1C.
>Fix:
I'll commit this once I'm done testing in -current (today or tomorrow):

--- sys/dev/ata/ata.c   25 May 2019 16:30:18 -0000      1.149
+++ sys/dev/ata/ata.c   13 Mar 2022 05:36:15 -0000
@@ -1588,14 +1625,16 @@
                 * Block execution of other commands while reset is scheduled
                 * to a thread.
                 */
-               ata_channel_freeze_locked(chp);
-               chp->ch_flags |= type;
+               if (!(chp->ch_flags & type)) {
+                       ata_channel_freeze_locked(chp);
+                       chp->ch_flags |= type;
+               }

                cv_signal(&chp->ch_thr_idle);
                return;
        }

        /* Block execution of other commands during reset */
        ata_channel_freeze_locked(chp);

        /*


