Port-sparc64 archive


Re: Ultrasparc III+ kernel panic



BERTRAND Joël wrote:
Julian Coleman wrote:
Hi,

Is it possible to swap the graphics cards in these two?

First server (stable):
- 2 * USIII/750
- XVR-500 (no screen, no keyboard)

Second server (unstable):
- 2 * USIII+/900
- Creator-3D (text console)

The reason that I ask is that I'm seeing a number of "SIR Reset"s happening
on my U60, and it's a lot worse with 2 CPUs installed, or with a Creator-3D
as the console.  With 1 CPU, I haven't seen one, and with serial console,
they happen less frequently.  I haven't been able to match the resets to
system load - they are more likely when the machine is busy (e.g. running
/etc/daily), but can also happen when it's idle.  On the other hand, with 2
CPUs and a serial console, it managed 10 days of continuous pkgsrc building
before resetting.

I'm not sure if this is related (different hardware) or not.  However, as
the problem is worse on the U60 with the C-3D as console, there might be
something related to UPA.

For comparison, I have an SB2000 with:

   501-6230 system board
   2 * 501-6485 1200MHz US III Cu
   501-4788 Creator-3D (console)
   375-3181 XVR-100
   2 * Fujitsu MAW3300FC in RAID 1

as my desktop, and this is stable.

     Julian,

     I have run some tests. The first server (XVR500, 2*US-III/750) remains
in the same configuration and is stable.

     I don't understand why I cannot use an XVR500 in the second one. The
system starts, but the screen is unusable (the display comes up in a wrong
resolution and is scrambled). At first, I thought my second XVR500 was dying.
But I have a third Blade 2000 in the same place, and this XVR500 runs fine in
the third Blade 2000. Why? I don't know. And all PCI slots work fine in the
server where the XVR500 doesn't work.

     So I swapped the servers. I tested the third one with a Creator 3D
(new Creator 3D, new memory, one new CPU, one taken from the second
server...) and it hangs like the second one. Last night, I installed the
XVR500 and removed the Creator 3D. Now the server is stable enough to build
NetBSD from source and pkgsrc:

load averages:  7.36,  7.20,  7.02;   up 0+11:33:42        11:54:04
145 processes: 5 runnable, 137 sleeping, 1 zombie, 2 on CPU
CPU states: 79.4% user, 0.0% nice, 20.6% system, 0.0% int, 0.0% idle
Memory: 1078M Act, 543M In, 9192K Wired, 71M Exec, 1264M File, 42M Free
Swap: 8050M Total, 143M Used, 7907M Free

     I suspect a bug somewhere in UPA support, as you said.

I don't know whether this is good or bad news. Since I replaced my Creator3D with the XVR500, I haven't seen a panic. But sometimes the system deadlocks: the console no longer responds and the system is locked up.

	Here is my dmesg:

Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005,
    2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015
    The NetBSD Foundation, Inc.  All rights reserved.
Copyright (c) 1982, 1986, 1989, 1991, 1993
    The Regents of the University of California.  All rights reserved.

NetBSD 7.99.5 (CUSTOM) #0: Sun Feb 22 10:19:28 CET 2015
        root%legendre.systella.fr@localhost:/usr/obj/sys/arch/sparc64/compile/CUSTOM
total memory = 2048 MB
avail memory = 1992 MB
timecounter: Timecounters tick every 10.000 msec
mainbus0 (root): SUNW,Sun-Blade-1000 (SUNW,Sun-Blade-2000): hostid 832fa91c
cpu0 at mainbus0: SUNW,UltraSPARC-III+ @ 900 MHz, CPU id 0
cpu0: manuf 3e, impl 15, mask 23
cpu0: system tick frequency 5 MHz
cpu0: 32K instruction (32 b/l), 64K data (32 b/l), 8192K external (512 b/l)
cpu1 at mainbus0: SUNW,UltraSPARC-III+ @ 900 MHz, CPU id 1
cpu1: manuf 3e, impl 15, mask 23
cpu1: system tick frequency 5 MHz
cpu1: 32K instruction (32 b/l), 64K data (32 b/l), 8192K external (512 b/l)
memory-controller at mainbus0 not configured
memory-controller at mainbus0 not configured
schizo0 at mainbus0: addr 40004700000: "Schizo", version 0, ign 200, bus B 0 to 1
schizo0:  pci0 at schizo0
pci0: i/o space, memory space enabled
ebus0 at pci0 dev 5 function 0
ebus0: vendor 108e product 1100, revision 0x01
flashprom at ebus0 addr 0-1fffff not configured
pcfiic0 at ebus0 addr 2e-2f, 2d-2d ipl 23: iic mux present
iic0 at pcfiic0: I2C bus
seeprom0 at iic0 addr 0x50: nvram: size 8192
seeprom1 at iic0 addr 0xd0: dimm-fru: size 8192
seeprom2 at iic0 addr 0xd1: dimm-fru: size 8192
seeprom3 at iic0 addr 0xd2: dimm-fru: size 8192
seeprom4 at iic0 addr 0xd3: dimm-fru: size 8192
seeprom5 at iic0 addr 0xd4: dimm-fru: size 8192
seeprom6 at iic0 addr 0xd5: dimm-fru: size 8192
seeprom7 at iic0 addr 0xd6: dimm-fru: size 8192
seeprom8 at iic0 addr 0xd7: dimm-fru: size 8192
bbc at ebus0 addr 0-fffff not configured
ppm at ebus0 addr e-28, 728000-728003, 30002e-30002f, 300600-300607 not configured
pcfiic1 at ebus0 addr 30-31 ipl 23
iic1 at pcfiic1: I2C bus
seeprom9 at iic1 addr 0x50: cpu-fru: size 8192
admtemp0 at iic1 addr 0x18: ADM1021 or compatible environmental sensor
seeprom10 at iic1 addr 0x51: cpu-fru: size 8192
admtemp1 at iic1 addr 0x4c: ADM1021 or compatible environmental sensor
tda0 at iic1 addr 0x24: fan-control
card-reader at iic1 addr 0x20 not configured
seeprom11 at iic1 addr 0x54: motherboard-fru: size 8192
i2c-bridge at iic1 addr 0x30 not configured
beep at ebus0 addr 32-37 not configured
audiocs0 at ebus0 addr 200000-2000ff, 702000-70200f, 704000-70400f, 722000-722003 ipl 20 ipl 21: CS4231A
audio0 at audiocs0: full duplex, playback, capture
rtc0 at ebus0 addr 300070-300071 ipl 24: mc146818 compatible time-of-day clock: ds1287
gpio at ebus0 addr 300600-300607 not configured
pmc at ebus0 addr 300700-300701 not configured
floppy at ebus0 addr 3023f0-3023f7, 706000-70600f, 720000-720003 ipl 25 not configured
lpt0 at ebus0 addr 300278-300287, 30002e-30002f, 700000-70000f ipl 1c
sab0 at ebus0 addr 400000-40007f ipl 22: rev 3.2
sabtty0 at sab0 port 0
sabtty1 at sab0 port 1
gem0 at pci0 dev 5 function 1: vendor 108e product 1101 (rev. 0x01)
gem0: interrupting at ivec 321d
ukphy0 at gem0 phy 1: OUI 0x0006b8, model 0x000c, rev. 1
ukphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
gem0: Ethernet address 00:03:ba:2f:a9:1c, 2KB RX fifo, 2KB TX fifo
fwohci0 at pci0 dev 5 function 2: vendor 108e product 1102 (rev. 0x01)
fwohci0: interrupting at ivec 21e
fwohci0: OHCI version 1.0 (ROM=0)
fwohci0: No. of Isochronous channels is 4.
fwohci0: EUI64 00:03:ba:ff:fe:2f:a9:1c
fwohci0: Phy 1394a available S400, 4 ports.
fwohci0: Link S400, max_rec 2048 bytes.
ieee1394if0 at fwohci0: IEEE1394 bus
fwip0 at ieee1394if0: IP over IEEE1394
fwohci0: Initiate bus reset
ohci0 at pci0 dev 5 function 3: vendor 108e product 1103 (rev. 0x01)
ohci0: interrupting at ivec 21f
ohci0: OHCI version 1.0, legacy support
usb0 at ohci0: USB revision 1.0
esiop0 at pci0 dev 6 function 0: Symbios Logic 53c875 (ultra-wide scsi)
esiop0: using on-board RAM
esiop0: interrupting at ivec 1a18
scsibus0 at esiop0: 16 targets, 8 luns per target
esiop1 at pci0 dev 6 function 1: Symbios Logic 53c875 (ultra-wide scsi)
esiop1: using on-board RAM
esiop1: interrupting at ivec 1a19
scsibus1 at esiop1: 16 targets, 8 luns per target
ohci1 at pci0 dev 1 function 0: vendor 1033 product 0035 (rev. 0x43)
ohci1: interrupting at ivec 20c
ohci1: OHCI version 1.0
usb1 at ohci1: USB revision 1.0
ohci2 at pci0 dev 1 function 1: vendor 1033 product 0035 (rev. 0x43)
ohci2: interrupting at ivec 20d
ohci2: OHCI version 1.0
usb2 at ohci2: USB revision 1.0
ehci0 at pci0 dev 1 function 2: vendor 1033 product 00e0 (rev. 0x04)
ehci0: interrupting at ivec 20e
ehci0: EHCI version 1.0
ehci0: companion controllers, 3 ports each: ohci1 ohci2
usb3 at ehci0: USB revision 2.0
ppb0 at pci0 dev 2 function 0: vendor 1011 product 0025 (rev. 0x04)
pci1 at ppb0 bus 1
pci1: i/o space, memory space enabled
vendor 108e product 1000 (miscellaneous bridge, revision 0x01) at pci1 dev 0 function 0 not configured
hme0 at pci1 dev 0 function 1: Sun Happy Meal Ethernet, rev. 1
hme0: interrupting at ivec 3211
hme0: Ethernet address 00:03:ba:2f:a9:1c
qsphy0 at hme0 phy 1: QS6612 10/100 media interface, rev. 1
qsphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
vendor 108e product 1000 (miscellaneous bridge, revision 0x01) at pci1 dev 1 function 0 not configured
hme1 at pci1 dev 1 function 1: Sun Happy Meal Ethernet, rev. 1
hme1: interrupting at ivec 3212
hme1: Ethernet address 00:03:ba:2f:a9:1c
qsphy1 at hme1 phy 1: QS6612 10/100 media interface, rev. 1
qsphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
vendor 108e product 1000 (miscellaneous bridge, revision 0x01) at pci1 dev 2 function 0 not configured
hme2 at pci1 dev 2 function 1: Sun Happy Meal Ethernet, rev. 1
hme2: interrupting at ivec 3213
hme2: Ethernet address 00:03:ba:2f:a9:1c
qsphy2 at hme2 phy 1: QS6612 10/100 media interface, rev. 1
qsphy2: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
vendor 108e product 1000 (miscellaneous bridge, revision 0x01) at pci1 dev 3 function 0 not configured
hme3 at pci1 dev 3 function 1: Sun Happy Meal Ethernet, rev. 1
hme3: interrupting at ivec 3210
hme3: Ethernet address 00:03:ba:2f:a9:1c
qsphy3 at hme3 phy 1: QS6612 10/100 media interface, rev. 1
qsphy3: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
wcfb0 at pci0 dev 3 function 0: vendor 3d3d product 07a2 (rev. 0x01)
subsys: 10243d3d
wcfb0: 1280 x 1024, 2048
0040: 010f0101 00000005 33f01000 00a000a0
0050: 003b02b3 8ef3cffe 00000000 83ff2800
0060: 00005067 0a0076e7 00000000 03ff04ff
0070: 03ff04ff 1a0b0088 12000000 12800000
0080: 13000000 13200000 13400000 13c00000
0090: 13e00000 80148094 a0039e9b 00160400
00a0: 00160551 00160792 00160550 f6031f1f
00b0: 8bfb0c7f 47e6da8d 83be9f0b 000000ff
00c0: 3fffffff 00370000 bfffffff 90206c20
00d0: 97131814 3c173c18 4a2b0037 0000003b
00e0: 00000000 00000003 00000c75 000d0509
00f0: 92552a8b 01000005 03400280 02b00290
wsdisplay0 at wcfb0 kbdmux 1: console (default, vt100 emulation)
wsmux1: connecting to wsdisplay0
wsdisplay0: screen 1-3 added (default, vt100 emulation)
schizo1 at mainbus0: addr 40004600000: "Schizo", version 0, ign 200, bus A 0 to 0
schizo1:  pci2 at schizo1
pci2: i/o space, memory space enabled
isp0 at pci2 dev 4 function 0: QLogic FC-AL and Fabric HBA
isp0: interrupting at ivec 204
isp0: invalid NVRAM header
isp0: invalid NVRAM header
isp0: bad frame length (0) from NVRAM- using 1024
isp0: bad execution throttle of 0- using 16
mpt0 at pci2 dev 1 function 0: vendor 1000 product 0030 (rev. 0x07)
mpt0: applying 1030 quirk
mpt0: interrupting at ivec 200
scsibus2 at mpt0: 16 targets, 8 luns per target
mpt1 at pci2 dev 1 function 1: vendor 1000 product 0030 (rev. 0x07)
mpt1: applying 1030 quirk
mpt1: interrupting at ivec 201
scsibus3 at mpt1: 16 targets, 8 luns per target
upa0 at mainbus0
ppm at mainbus0 not configured
pcons at mainbus0 not configured
fwohci0: BUS reset
fwohci0: node_id=0xc800ffc0, gen=1, CYCLEMASTER mode
ieee1394if0: 1 nodes, maxhop <= 0 cable IRM irm(0) (me)
ieee1394if0: bus manager 0
timecounter: Timecounter "clockinterrupt" frequency 100 Hz quality 0
timecounter: Timecounter "tick-counter" frequency 900000000 Hz quality 100
timecounter: Timecounter "stick-counter" frequency 5000000 Hz quality 200
No counter-timer -- using %stick at 5MHz as system clock.
scsibus0: waiting 2 seconds for devices to settle...
scsibus1: waiting 2 seconds for devices to settle...
scsibus2: waiting 2 seconds for devices to settle...
scsibus3: waiting 2 seconds for devices to settle...
uhub0 at usb0: vendor 108e OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 4 ports with 4 removable, self powered
uhub1 at usb1: vendor 1033 OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 3 ports with 3 removable, self powered
scsibus4 at isp0: 256 targets, 8 luns per target
uhub2 at usb2: vendor 1033 OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub2: 2 ports with 2 removable, self powered
uhub3 at usb3: vendor 1033 EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
uhub3: 5 ports with 5 removable, self powered
scsibus4: waiting 2 seconds for devices to settle...
ugen0 at uhub0 port 3
cd0 at scsibus0 target 6 lun 0: <TOSHIBA, DVD-ROM SD-M1401, 1007> cdrom removable
cd0: sync (50.00ns offset 16), 8-bit (20.000MB/s) transfers
ugen0: American Power Conversion Back-UPS BR 800 FW:9.o2 .I USB FW:o2, rev 1.10/1.06, addr 2
uhidev0 at uhub0 port 4 configuration 1 interface 0
uhidev0: vendor 0430 product 0005, rev 1.10/2.00, addr 3, iclass 3/1
ukbd0 at uhidev0: 8 modifier keys, 6 key codes
wskbd1 at ukbd0: console keyboard, using wsdisplay0
sd2 at scsibus2 target 0 lun 0: <FUJITSU, MAW3073NC, 0104> disk fixed
sd2: 70136 MB, 78753 cyl, 2 head, 911 sec, 512 bytes/sect x 143638992 sectors
sd2: sync (6.25ns offset 127), 16-bit (320.000MB/s) transfers, tagged queueing
sd3 at scsibus2 target 1 lun 0: <FUJITSU, MAW3073NC, 0104> disk fixed
sd3: 70136 MB, 78753 cyl, 2 head, 911 sec, 512 bytes/sect x 143638992 sectors
sd3: sync (6.25ns offset 127), 16-bit (320.000MB/s) transfers, tagged queueing
sd4 at scsibus2 target 2 lun 0: <FUJITSU, MAW3073NC, 0104> disk fixed
sd4: 70136 MB, 78753 cyl, 2 head, 911 sec, 512 bytes/sect x 143638992 sectors
sd4: sync (6.25ns offset 127), 16-bit (320.000MB/s) transfers, tagged queueing
sd5 at scsibus2 target 3 lun 0: <FUJITSU, MAW3073NC, 0104> disk fixed
sd5: 70136 MB, 78753 cyl, 2 head, 911 sec, 512 bytes/sect x 143638992 sectors
sd5: sync (6.25ns offset 127), 16-bit (320.000MB/s) transfers, tagged queueing
sd6 at scsibus2 target 4 lun 0: <FUJITSU, MAW3073NC, 0104> disk fixed
sd6: 70136 MB, 78753 cyl, 2 head, 911 sec, 512 bytes/sect x 143638992 sectors
sd6: sync (6.25ns offset 127), 16-bit (320.000MB/s) transfers, tagged queueing
sd7 at scsibus2 target 5 lun 0: <FUJITSU, MAW3073NC, 0104> disk fixed
sd7: 70136 MB, 78753 cyl, 2 head, 911 sec, 512 bytes/sect x 143638992 sectors
sd7: sync (6.25ns offset 127), 16-bit (320.000MB/s) transfers, tagged queueing
sd8 at scsibus2 target 6 lun 0: <FUJITSU, MAW3073NC, 0104> disk fixed
sd8: 70136 MB, 78753 cyl, 2 head, 911 sec, 512 bytes/sect x 143638992 sectors
sd8: sync (6.25ns offset 127), 16-bit (320.000MB/s) transfers, tagged queueing
cd1 at scsibus2 target 15 lun 0: <PLEXTOR, DVDR PX-760A, 1.03> cdrom removable
sd0 at scsibus4 target 0 lun 0: <SEAGATE, ST3300655FC, 0003> disk fixed
sd0: 279 GB, 74340 cyl, 8 head, 985 sec, 512 bytes/sect x 585937500 sectors
sd1 at scsibus4 target 1 lun 0: <SEAGATE, ST3300655FC, 0003> disk fixed
sd1: 279 GB, 74340 cyl, 8 head, 985 sec, 512 bytes/sect x 585937500 sectors
Kernelized RAIDframe activated
raid1: RAID Level 5
raid1: Components: /dev/sd2a /dev/sd3a /dev/sd4a /dev/sd5a /dev/sd6a /dev/sd7a /dev/sd8a
raid1: Total Sectors: 861833472 (420817 MB)
raid1: GPT GUID: f0748003-fce1-11e2-8381-0003ba29c43a
dk0 at raid1: f074807d-fce1-11e2-8381-0003ba29c43a
dk0: 861833405 blocks at 34, type: ffs
raid0: RAID Level 1
raid0: Components: /dev/sd0a /dev/sd1a
raid0: Total Sectors: 585937408 (286102 MB)
root on raid0a dumps on raid0b
root file system type: ffs
kern.module.path=/stand/sparc64/7.99.5/modules
raid0: Device already configured!
raid1: Device already configured!
wsdisplay0: screen 4 added (default, vt100 emulation)

With the same kernel, another Blade 2000 is stable, but it does not contain USB2, quad hme, or LSI U320 (mpt) hardware.
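
To narrow down which devices differ between the stable and the unstable machine, the two dmesg outputs could be compared by driver name. What follows is only a rough sketch in Python, not something from the kernel tree: the function names (attached_drivers, only_in_first) are made up for illustration, and it assumes the usual autoconfiguration line shape "name<unit> at parent ...", skipping lines marked "not configured".

```python
import re

# Autoconfiguration lines look like "mpt0 at pci2 dev 1 function 0: ...".
# Group 1 is the driver name, group 2 the unit number.
ATTACH_RE = re.compile(r"^([a-z][a-z0-9_-]*?)(\d+) at \S+")


def attached_drivers(dmesg_text):
    """Return the set of driver names (unit numbers stripped) that attached."""
    drivers = set()
    for line in dmesg_text.splitlines():
        # Devices the kernel found but has no driver for are not of interest.
        if "not configured" in line:
            continue
        match = ATTACH_RE.match(line.strip())
        if match:
            drivers.add(match.group(1))
    return drivers


def only_in_first(dmesg_a, dmesg_b):
    """Return the sorted driver names present in dmesg_a but not in dmesg_b."""
    return sorted(attached_drivers(dmesg_a) - attached_drivers(dmesg_b))
```

With the dmesg of each machine saved to a file (the file names here are placeholders), only_in_first(open("unstable.dmesg").read(), open("stable.dmesg").read()) would list the drivers that attached only on the unstable machine. It says nothing about which of them is at fault; it only shortens the list of candidates.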

	Regards,

	JKB

