Subject: Re: kern/35553: azalia hangs an Optiplex 745
To: None <kern-bug-people@netbsd.org, gnats-admin@netbsd.org,>
From: Michael L. Hitch <mhitch@lightning.msu.montana.edu>
List: netbsd-bugs
Date: 08/27/2007 21:05:09
The following reply was made to PR kern/35553; it has been noted by GNATS.

From: "Michael L. Hitch" <mhitch@lightning.msu.montana.edu>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
	netbsd-bugs@netbsd.org, mark@mcs.vuw.ac.nz
Subject: Re: kern/35553: azalia hangs an Optiplex 745
Date: Mon, 27 Aug 2007 13:55:37 -0600 (MDT)

 On Fri, 24 Aug 2007, Michael L. Hitch wrote:
 
 > So all these problems seem to come back to the uhci/ehci interaction with
 > azalia and bge.  One thing to note is that all three of these devices are
 > sharing the same interrupt.
 
    And indeed, the interrupt sharing is causing the hang.  When azalia0 is
 running the codec initialization, interrupts have been enabled and the
 initialization is using interrupts.  Each time azalia0 interrupts, the
 interrupt handlers for azalia0, bge0, and uhci0 are called.  Normally this
 shouldn't cause any undue problems other than extra processing per
 interrupt, but the uhci interrupt handler does not deal intelligently with
 the shared interrupt.  By the time the azalia codec initialization runs,
 something appears to have halted the uhci0 controller, and its status
 register contains UHCI_STS_HCH.  That status doesn't actually indicate an
 interrupt, as best as I have been able to determine, but the interrupt
 handler looks at it, outputs a message about the controller being halted,
 and disables access to the controller.  After the azalia codec
 initialization is complete, all the USB event processes start running.
 Each process has to complete its initial discovery task before the
 autoconfig stuff is completed, and the process for uhci0 never completes
 (presumably because access to uhci0 was disabled by the uhci interrupt
 handler), and the system hangs waiting for it to complete.
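 
    To illustrate the sequence described above, here is a small user-space
 sketch of a shared interrupt line.  It is not the NetBSD driver code:
 struct fake_softc, plain_intr(), naive_uhci_intr() and shared_line_fire()
 are hypothetical names made up for the example, and the 0x0020 value for
 the halted bit is my reading of the UHCI status register layout.  It only
 shows how a handler that treats a stale "halted" status as an interrupt
 cause can give up on its controller when a different device interrupts.

```c
#include <stdbool.h>
#include <stdint.h>

#define STS_HCH 0x0020	/* assumed value of the UHCI "controller halted" bit */

/* Hypothetical stand-in for one driver instance on the shared line. */
struct fake_softc {
	uint16_t status;	/* simulated device status register */
	bool	 dead;		/* driver has stopped touching the device */
	int	 calls;		/* number of times the handler was invoked */
};

/* Well-behaved handler: claims the interrupt only if status is non-zero. */
static int
plain_intr(struct fake_softc *sc)
{
	sc->calls++;
	if (sc->status == 0)
		return 0;	/* not ours */
	sc->status = 0;		/* acknowledge */
	return 1;
}

/* Naive uhci-style handler: treats HCH as if it were an interrupt cause. */
static int
naive_uhci_intr(struct fake_softc *sc)
{
	sc->calls++;
	if (sc->status == 0)
		return 0;	/* not ours */
	if (sc->status & STS_HCH)
		sc->dead = true;	/* "controller halted": give up on it */
	return 1;
}

/* One interrupt on the shared line runs every attached handler. */
static int
shared_line_fire(struct fake_softc *azalia, struct fake_softc *bge,
    struct fake_softc *uhci)
{
	return plain_intr(azalia) + plain_intr(bge) + naive_uhci_intr(uhci);
}
```

    With the simulated uhci status already holding the halted bit, a single
 azalia interrupt is enough for the naive handler to mark the uhci
 controller dead, even though the uhci device never raised the interrupt.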
 
    When azalia is disabled, the system will come up because all the USB
 event processes complete their initial discovery.  However, when bge0
 interrupts, it also causes the uhci interrupt handler to run and detect
 the uhci0 controller halted status.
 
    I've gotten around this problem by changing the uhci interrupt handler
 to not treat UHCI_STS_HCH as a valid interrupt bit, and now my system
 boots and runs normally.
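 
    The idea of that workaround can be sketched as a mask applied to the
 status register before deciding whether the interrupt is ours.  This is
 not the actual change to the driver: uhci_intr_bits() and
 uhci_claims_interrupt() are hypothetical helpers, and the bit values are
 my understanding of the UHCI status register layout, so check them
 against uhcireg.h before relying on them.

```c
#include <stdint.h>

/* Assumed UHCI status register (USBSTS) bits. */
#define UHCI_STS_USBINT	0x0001	/* transfer completion interrupt */
#define UHCI_STS_USBEI	0x0002	/* transfer error interrupt */
#define UHCI_STS_RD	0x0004	/* resume detect */
#define UHCI_STS_HSE	0x0008	/* host system error */
#define UHCI_STS_HCPE	0x0010	/* host controller process error */
#define UHCI_STS_HCH	0x0020	/* host controller halted */

/* Bits treated as real interrupt causes; HCH is deliberately left out. */
#define UHCI_INTR_CAUSES \
	(UHCI_STS_USBINT | UHCI_STS_USBEI | UHCI_STS_RD | \
	 UHCI_STS_HSE | UHCI_STS_HCPE)

/* Hypothetical helper: strip status bits that are not interrupt causes. */
static uint16_t
uhci_intr_bits(uint16_t status)
{
	return status & UHCI_INTR_CAUSES;
}

/*
 * Hypothetical helper: returns 1 if the handler should claim the
 * interrupt, 0 if it belongs to another device on the shared line.
 */
static int
uhci_claims_interrupt(uint16_t status)
{
	return uhci_intr_bits(status) != 0;
}
```

    With this mask, a status of UHCI_STS_HCH alone is reported as "not
 ours", while UHCI_STS_HCH together with a real cause such as
 UHCI_STS_USBINT is still claimed.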
 
    I'm trying to understand how uhci0 (and uhci1 as well) get halted, but 
 haven't figured that out yet.
 
 --
 Michael L. Hitch			mhitch@montana.edu
 Computer Consultant
 Information Technology Center
 Montana State University	Bozeman, MT	USA