tech-kern archive


Re: 6.1_RC3 panic on shutdown: AcpiPsParseLoop ()



On Fri, Apr 05, 2013 at 07:38:12PM +0200, rudolf wrote:
> >can you boot a 6.0 kernel and see if the error messages also show up with
> >that version?  if they do then that would rule out any of my recent changes.
> 
> Yes, the error messages also show up with 6.0 kernel:
> 
> Today I made around 60 attempts with RC3 and only two of them ended
> with acpi error messages (snapshot of one of them:
> http://software.eq.cz/acpi_error_messages_rc3.jpg) and there was no
> panic.
> 
> I had more luck with the 6.0, fourth attempt ended with the errors
> (again, no panic): http://software.eq.cz/acpi_error_messages_60.jpg
> I also tried to provoke a dump at that moment, but it's probably not
> possible after the dump device (wd0) is detached:
> http://software.eq.cz/acpi_error_messages_60_nodump.jpg

ok, the problem isn't new, that's good.

the "Mutex [0x4]" indicates that the error has to do with ACPI_MTX_CACHES.
I looked at all the uses of that mutex and I don't see anything wrong.

I think the problem is this bit in AcpiOsWaitSemaphore() (which is used by
our implementation of an ACPICA mutex):

        if (cold || doing_shutdown || acpi_suspended)
                return AE_OK;

so during shutdown, any thread that tries to take an ACPICA mutex will
think that it succeeded, even if another thread already holds it.
AcpiUtAcquireMutex() will mark the mutex as being held by the thread
that supposedly has the lock, so the second thread will overwrite
the mutex ThreadId with its own ID, then when the first thread goes to
release the mutex, it will report the error that we're seeing.
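
to make that sequence concrete, here is a small single-threaded replay of the
interleaving.  it is only a sketch: sim_mutex, sim_os_wait(), sim_acquire(),
sim_release() and sim_replay() are hypothetical stand-ins, not the real
NetBSD or ACPICA code, and the two "threads" are just integer IDs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins; these are not the NetBSD or ACPICA definitions. */
typedef enum { AE_OK = 0, AE_NOT_ACQUIRED = 1 } status_t;

struct sim_mutex {
        bool locked;    /* real lock state, which the OS layer should enforce */
        int owner;      /* owner bookkeeping, like the recorded ThreadId; 0 = none */
};

static bool doing_shutdown;

/* OS layer: mimics the early return discussed above. */
static status_t sim_os_wait(struct sim_mutex *m)
{
        if (doing_shutdown)
                return AE_OK;           /* claims success without locking */
        if (m->locked)
                return AE_NOT_ACQUIRED; /* real code would sleep here */
        m->locked = true;
        return AE_OK;
}

/* Upper layer: records the caller as owner whenever the OS layer says OK. */
static status_t sim_acquire(struct sim_mutex *m, int tid)
{
        status_t s = sim_os_wait(m);

        if (s == AE_OK)
                m->owner = tid;
        return s;
}

/* Upper layer: refuses to release a mutex recorded as owned by someone else. */
static status_t sim_release(struct sim_mutex *m, int tid)
{
        if (m->owner != tid) {
                printf("thread %d: mutex is recorded as owned by %d\n",
                    tid, m->owner);
                return AE_NOT_ACQUIRED;
        }
        m->owner = 0;
        m->locked = false;
        return AE_OK;
}

/* Replays the interleaving; returns what thread 1 sees when it releases. */
static status_t sim_replay(bool shutdown)
{
        struct sim_mutex m = { false, 0 };
        status_t s;

        doing_shutdown = false;
        s = sim_acquire(&m, 1);         /* thread 1 takes the mutex normally */
        assert(s == AE_OK);
        (void)s;

        doing_shutdown = shutdown;
        (void)sim_acquire(&m, 2);       /* thread 2 tries while thread 1 holds it */

        return sim_release(&m, 1);      /* thread 1 finishes its critical section */
}
```

in this model sim_replay(false) returns AE_OK, because thread 2's attempt is
refused and the owner stays 1.  sim_replay(true) returns AE_NOT_ACQUIRED,
because thread 2 was told it succeeded and overwrote the owner before
thread 1 released.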

the "doing_shutdown" part of that check was added in this commit:

----------------------------
revision 1.12
date: 2009-03-31 10:17:47 -0700;  author: drochner;  state: Exp;  lines: +3 -3;
avoid tsleep also during shutdown (and in particular ACPI poweroff),
should fix PR kern/39141 by Takahiro Kambe and PR port-i386/41110
by Reinoud Zandijk
----------------------------


this change wasn't the right way to fix the earlier problem.
in fact, the whole business of skipping the actual locking while acquiring
an ACPICA mutex is just asking for trouble.  instead of skipping the locking
within ACPICA in these situations, we should instead arrange for all other
threads to be stopped at a point where they don't hold any locks, so that
we can be sure that the thread doing the shutdown (which can't sleep anymore
at that point) will be able to take any ACPICA mutex without sleeping.
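
a toy model of that idea, assuming a cooperative scheduler and a single lock,
might look like the following.  every name here (worker, worker_step(),
try_lock(), shutdown_with_quiesce()) is made up for illustration, and nothing
like this exists in the tree as far as this sketch is concerned.  the point is
only that a worker parks at a place where it holds nothing, so a non-sleeping
acquire by the shutdown thread cannot fail afterwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model; none of these names exist in NetBSD or ACPICA. */
struct worker {
        bool holds_lock;        /* currently inside a critical section */
        bool parked;            /* stopped at a safe point, holding nothing */
};

static bool lock_held;          /* the one modelled mutex */
static bool quiesce_wanted;     /* set by the shutdown thread */

/* One scheduling step of a worker. */
static void worker_step(struct worker *w)
{
        if (w->parked)
                return;
        if (w->holds_lock) {            /* finish the critical section first */
                w->holds_lock = false;
                lock_held = false;
                return;
        }
        if (quiesce_wanted) {           /* safe point: no locks held here */
                w->parked = true;
                return;
        }
        if (!lock_held) {               /* normal operation: take the lock */
                lock_held = true;
                w->holds_lock = true;
        }
}

/* Non-sleeping acquire, the only kind the shutdown thread can use. */
static bool try_lock(void)
{
        if (lock_held)
                return false;
        lock_held = true;
        return true;
}

/*
 * Ask every worker to park, keep stepping them until all have done so,
 * then take the lock.  Returns true if the lock was obtained without
 * having to sleep.
 */
static bool shutdown_with_quiesce(struct worker *ws, size_t n)
{
        bool all_parked;

        quiesce_wanted = true;
        do {
                all_parked = true;
                for (size_t i = 0; i < n; i++) {
                        worker_step(&ws[i]);
                        if (!ws[i].parked)
                                all_parked = false;
                }
        } while (!all_parked);

        assert(!lock_held);             /* parked workers hold nothing */
        return try_lock();
}
```

a worker that is inside its critical section when the request arrives first
releases the lock on its next step and only parks on the step after that,
which is what keeps the final try_lock() from failing in this model.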

the "cold" part of the check seems unnecessary as well.  at the point where
cold is cleared, it looks like no other threads could possibly have run yet,
so the boot thread should always be able to get an ACPICA mutex without
sleeping.

-Chuck

