Current-Users archive


amd64/i386 SMP lossage



With -current as of half an hour ago, a dual quad-core box
can't boot.  I see this sort of thing:

        cpu0 at mainbus0 apid 0: Intel 686-class, 2500MHz, id 0x10676
        cpu1 at mainbus0 apid 4: Intel 686-class, 2504MHz, id 0x10676
        cpu2 at mainbus0 apid 1: Intel 686-class, 3878MHz, id 0x10676
        cpu3 at mainbus0 apid 5cpu3: failed to become ready
        cpu4 at mainbus0 apid 2: Intel 686-class, 2504MHz, id 0x10676
        cpu5 at mainbus0 apid 6cpu5: failed to become ready
        cpu6 at mainbus0 apid 3: Intel 686-class, 2505MHz, id 0x10676
        cpu7 at mainbus0 apid 7cpu7: failed to become ready
         ... 
        cpu3: failed to start 
        cpu5: failed to start 
        cpu7: failed to start
         [ hang ]
        db{0}> bt
        breakpoint() at netbsd:breakpoint+0x5
        comintr() at netbsd:comintr+0x53a
        Xintr_ioapic_edge24() at netbsd:Xintr_ioapic_edge24+0xef
        --- interrupt ---
        x86_pause() at netbsd:x86_pause+0x2
        pmap_do_remove() at netbsd:pmap_do_remove+0x143
        setredzone() at netbsd:setredzone+0x22
        cpu_lwp_fork() at netbsd:cpu_lwp_fork+0xcb
        lwp_create() at netbsd:lwp_create+0x250
        kthread_create() at netbsd:kthread_create+0xf5
        configure() at netbsd:configure+0xc7
        main() at netbsd:main+0x19a

All other (functioning) CPUs are idle at this stage.  I see the same
sorts of failures with both amd64 and i386 kernels.

A kernel built from sources updated on 11 May 2008 16:17 UTC booted OK,
but got the "free 2: inuse 0, probable double free" message on reboot
that has already been reported here.

The CPUs that fail seem rather random.  Here are a few boot attempts:

        cpu0 at mainbus0 apid 0: Intel 686-class, 2500MHz, id 0x10676
        cpu1 at mainbus0 apid 4: Intel 686-class, 2504MHz, id 0x10676
        cpu2 at mainbus0 apid 1: Intel 686-class, 3878MHz, id 0x10676
        cpu3 at mainbus0 apid 5cpu3: failed to become ready
        cpu4 at mainbus0 apid 2: Intel 686-class, 2504MHz, id 0x10676
        cpu5 at mainbus0 apid 6cpu5: failed to become ready
        cpu6 at mainbus0 apid 3: Intel 686-class, 2505MHz, id 0x10676
        cpu7 at mainbus0 apid 7cpu7: failed to become ready

        cpu0 at mainbus0 apid 0: Intel 686-class, 2500MHz, id 0x10676
        cpu1 at mainbus0 apid 4cpu1: failed to become ready
        cpu2 at mainbus0 apid 1: Intel 686-class, 10743MHz, id 0x10676
        cpu3 at mainbus0 apid 5cpu3: failed to become ready
        cpu4 at mainbus0 apid 2: Intel 686-class, 2500MHz, id 0x10676
        cpu5 at mainbus0 apid 6cpu5: failed to become ready
        cpu6 at mainbus0 apid 3: Intel 686-class, 2504MHz, id 0x10676
        cpu7 at mainbus0 apid 7cpu7: failed to become ready

        cpu0 at mainbus0 apid 0: Intel 686-class, 2500MHz, id 0x10676
        cpu1 at mainbus0 apid 4: Intel 686-class, 3878MHz, id 0x10676
        cpu2 at mainbus0 apid 1: Intel 686-class, 2505MHz, id 0x10676
        cpu3 at mainbus0 apid 5cpu3: failed to become ready
        cpu4 at mainbus0 apid 2: Intel 686-class, 2500MHz, id 0x10676
        cpu5 at mainbus0 apid 6: Intel 686-class, 3878MHz, id 0x10676
        cpu6 at mainbus0 apid 3: Intel 686-class, 2504MHz, id 0x10676
        cpu7 at mainbus0 apid 7: Intel 686-class, 5251MHz, id 0x10676

        cpu0 at mainbus0 apid 0: Intel 686-class, 2500MHz, id 0x10676
        cpu1 at mainbus0 apid 4cpu1: failed to become ready
        cpu2 at mainbus0 apid 1: Intel 686-class, 34087MHz, id 0x10676
        cpu3 at mainbus0 apid 5cpu3: failed to become ready
        cpu4 at mainbus0 apid 2: Intel 686-class, 2505MHz, id 0x10676
        cpu5 at mainbus0 apid 6: Intel 686-class, 4405MHz, id 0x10676
        cpu6 at mainbus0 apid 3: Intel 686-class, 2505MHz, id 0x10676
        cpu7 at mainbus0 apid 7cpu7: failed to become ready

The 34 GHz for cpu2 above is impressive :)

        cpu0 at mainbus0 apid 0: Intel 686-class, 2500MHz, id 0x10676
        cpu1 at mainbus0 apid 4cpu1: failed to become ready
        cpu2 at mainbus0 apid 1cpu2: failed to become ready
        cpu3 at mainbus0 apid 5cpu3: failed to become ready
        cpu4 at mainbus0 apid 2: Intel 686-class, 2504MHz, id 0x10676
        cpu5 at mainbus0 apid 6cpu5: failed to become ready
        cpu6 at mainbus0 apid 3: Intel 686-class, 2500MHz, id 0x10676
        cpu7 at mainbus0 apid 7cpu7: failed to become ready

Most often it's one or more of the odd-numbered CPUs that fail, but the
last one above had an even-numbered CPU fail too.
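
To compare attempts without eyeballing the logs, something like the
following could tally the failures per CPU.  This is only a rough sketch:
the function names (failed_cpus, tally) are made up for illustration, and
it assumes each failure shows up as "cpuN: failed to become ready" in the
captured console output, as in the listings above.

```python
import re
from collections import Counter

# Matches the failure text even when it is glued onto the attach line,
# e.g. "cpu3 at mainbus0 apid 5cpu3: failed to become ready".
FAILED = re.compile(r"cpu(\d+): failed to become ready")

def failed_cpus(boot_log):
    """Return the sorted CPU numbers reported as failing in one boot log."""
    return sorted({int(n) for n in FAILED.findall(boot_log)})

def tally(boot_logs):
    """Count, per CPU number, how many boot attempts it failed in."""
    counts = Counter()
    for log in boot_logs:
        counts.update(failed_cpus(log))
    return counts

# The last boot attempt quoted above, with the indentation removed.
last_attempt = """\
cpu0 at mainbus0 apid 0: Intel 686-class, 2500MHz, id 0x10676
cpu1 at mainbus0 apid 4cpu1: failed to become ready
cpu2 at mainbus0 apid 1cpu2: failed to become ready
cpu3 at mainbus0 apid 5cpu3: failed to become ready
cpu4 at mainbus0 apid 2: Intel 686-class, 2504MHz, id 0x10676
cpu5 at mainbus0 apid 6cpu5: failed to become ready
cpu6 at mainbus0 apid 3: Intel 686-class, 2500MHz, id 0x10676
cpu7 at mainbus0 apid 7cpu7: failed to become ready
"""

print(failed_cpus(last_attempt))  # [1, 2, 3, 5, 7]
```

Passing a list of such captures to tally() gives a per-CPU failure count,
which makes the odd/even pattern easier to check over many boots.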

Simon.

