Subject: Re: boot hangs at uhci1
To: Manuel Bouyer <bouyer@antioche.eu.org>
From: david l goodrich <dlg@dsrw.org>
List: port-xen
Date: 04/01/2007 12:27:14
Manuel Bouyer wrote:
> On Sun, Apr 01, 2007 at 11:57:27AM -0500, david l goodrich wrote:
>> Manuel Bouyer wrote:
>>> On Sun, Apr 01, 2007 at 10:54:16AM -0500, david l goodrich wrote:
>>>> Manuel Bouyer wrote:
>>>>> On Sat, Mar 31, 2007 at 05:22:20PM -0500, david l goodrich wrote:
>>>>>> I decided it would be a good idea to move a 4.0_BETA2 dom0 from under
>>>>>> my desk to over with my other servers.
>>>>>>
>>>>>> when i powered the machine back on after the move, the XEN3_DOM0
>>>>>> kernel won't boot, it hangs at
>>>>>>
>>>>>> uhci1 at pci0 dev 29 function 1: Intel 82801GB/GR USB UHCI Controller (rev. 0x01)
>>>>>>
>>>>>> but a GENERIC kernel will boot through grub's 'chainloader' just fine.
>>>>>>
>>>>>> i'd just say wipe the hard drives and start over, but this has domUs
>>>>>> and data on it that i'd like to keep.  and the stupid thing /used/ to
>>>>>> boot.
>>>>>>
>>>>>> my / drive is small, of course.
>>>>>>
>>>>>> nialas# df /
>>>>>> Filesystem  1K-blocks      Used     Avail Capacity  Mounted on
>>>>>> /dev/raid0a    508143    188655    294081    39%    /
>>>>>> nialas#
>>>>>>
>>>>>> the dmesg from GENERIC is below.  any ideas?  thanks.
>>>>> If you add -c to the kernel's boot command line, and enter
>>>>> disable uhci
>>>>> quit
>>>>>
>>>>> does it boot ?
>>>> for those reading this later on that are confused like i was, this is
>>>> just to say add "-c" to the end of the "module" line in the appropriate
>>>> /grub/menu.lst entry.
>>>>
>>>> unfortunately, all disabling uhci did was push the freeze back:
>>>>
>>>> piixide1 at pci0 dev 31 function 2
>>>> piixide1: Intel 82801G/GR Serial ATA/Raid Controller (ICH7) (rev. 0x01)
>>>> piixide1: bus-master DMA support present
>>>> piixide1: primary channel configured to native-PCI mode
>>>> [hang]
>>>>
>>>> i should note that the way I have determined it is hung is that 1) it
>>>> obviously doesn't progress any further in the boot process and 2) the
>>>> keyboard interrupts are no longer detected - capslock and numlock no
>>>> longer cause the keyboard lights to change.
>>> it's normal at this point of the boot, interrupts are not yet enabled so
>>> the keyboard won't react anyway
>>>
>>>>> Maybe you have updated this kernel recently, without
>>>>> rebooting ?
>>>>>
>>>> no, this is not something I would do.  well, i hope not.
>>>> Besides, I have also tried booting with a new XEN3_DOM0 kernel from
>>>> ftp.netbsd.org's most recent daily build, and had the same problem.  In
>>>> fact, this most recent hang at piixide1 is using
>>>> <ftp://ftp.netbsd.org/pub/NetBSD-daily/netbsd-4/200703280002Z/i386/binary/kernel/netbsd-XEN3_DOM0.gz>,
>>>> gunzipped, of course.
>>> It seems to have trouble establishing interrupts. Can you try disabling
>>> ACPI ?
>>> disable acpi
>>> in userconf.
>>>
>> and now, hung at
>>
>> Starting xen domains.
>> Using config file "/usr/pkg/etc/xen/router-meus".
>> xvif1.0: Ethernet address 00:16:3e:6d:73:75
>> xvif1.1: Ethernet address 00:16:3e:4c:db:4c
>> xbd backend: attach device vnd0d (size 10485760) for domain 1
>> Started domain router-meus
>> Creating a.out runtime link editor directory cache.
>> xbd backend 0x1 for domain 1 using event channel 14
>>
>> the lines matching /^x/ are green, the others are white.
>>
>> just now it added the line "Checking quotas: done."  There was a 5-10 
>> minute gap between the "xbd" line and the "Checking" line.
>>
>> i've seen slow boots, but this is just about the slowest I've seen.  Not 
>> something i'd expect from a 3ghz computer.
> 
> Hum, you should probably add acpi=off on the kernel line in grub (it's
> an argument for xen.gz) when disabling acpi. This may be the cause of
> the slowness.

i assume this is not supposed to cause a kernel panic.

i added "acpi=off" to the kernel line in menu.lst, entered "disable
acpi" and "quit" in userconf, and then got about fifteen lines further
before it panicked:

uc> quit
Continuing...
mainbus0 (root)
mainbus0: scanning 0x9fc00 to 0x9fff0 for MP signature
mainbus0: scanning 0x9f800 to 0x9fbf0 for MP signature
mainbus0: scanning 0xf0000 to 0xffff0 for MP signature
mainbus0: MP floating pointer found in bios at 0xfe680
mainbus0: MP config table at 0xfe690, 64 bytes long
cpu0 at mainbus0: apid 0 (boot processor)
cpu1 at mainbus0: apid 1 (application processor)
ioapic0 at mainbus0 apid 2 (I/O APIC)
ioapic0: pa 0xfec00000PHYSDEVOP_APIC_READ ret -22
panic: PHYSDEVOP_APIC_READ
Stopped in pid 0.1 (swapper) at netbsd:cpu_Debugger+0x4     popl  %ebp
db>

i'll leave it at the db> prompt for now, let me know if there's any 
information that could be helpful, but 'ps' just listed swapper.
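
for anyone reading this in the archives: the two changes discussed so far
would sit in a /grub/menu.lst entry roughly like the sketch below.  the
title, the (hd0,0) root device and the file names are assumptions for
illustration only, not copied from this machine.  only "acpi=off" on the
xen.gz line and "-c" on the module line come from this thread.

```
# hypothetical entry: title, root device and file names are placeholders
title Xen 3 / NetBSD XEN3_DOM0 (userconf, ACPI off)
root (hd0,0)
# acpi=off is an argument to the hypervisor (xen.gz), not to NetBSD
kernel /xen.gz acpi=off
# -c stops the dom0 kernel at the userconf prompt (uc>)
module /netbsd-XEN3_DOM0 -c
```

at the uc> prompt, "disable acpi" followed by "quit" then continues the
boot with acpi disabled in the dom0 kernel as well.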

> 
> what's interesting is that it can't register some interrupts with APCI.
> Did you change something, like adding a PCI adapter ?
> 

scout's honor, all I did was shut the machine down, unplug it, move it
15 feet, and plug it in again.  i haven't added hardware in a few weeks.
The only PCI adapter in there is an fxp(4) network card.
   --david