Good try (new BIOS), but it didn't help. After N+1 fscks on a 1 TB RAID I
decided to downgrade (sigh), but now I have an empty playbox which still crashes!

I was able to isolate the problem: only non-raw (block device) reads with a
block size somewhere above 1 MB crash the system. The threshold lies between
1040k and 1200k:

    dd if=/dev/wd0d of=/dev/null bs=1040k count=100    --- DOES NOT CORE
    dd if=/dev/wd0d of=/dev/null bs=1200k count=100    --- DOES CORE !!!!
    dd if=/dev/rwd0d of=/dev/null bs=1200k count=100   --- is fine too (raw device)
    dd if=/dev/wd1d of=/dev/null bs=2000k count=100    --- is fine (P-ATA drive)

wd0 is a 1 TB SATA drive (ffs, no RAID).

Any expert got any ideas? (I used the 5_0 ISO from ftp.netbsd)

I have a core dump of the crashed system (ask and I will put it somewhere).
Below is a dmesg from before the crash. The system very reliably produces
"ohci4: 8 scheduling overruns" messages before crashing.

So I disagree with the flaky memory timing hypothesis; I think it is a bug in
the SATA/kernel area.

(If someone wants access, I can open an ssh port to the box, and I can install
a serial port redirector for the console plus a CVS checkout.)

thanks for any help.

thilo

Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008
    The NetBSD Foundation, Inc. All rights reserved.
Copyright (c) 1982, 1986, 1989, 1991, 1993
    The Regents of the University of California. All rights reserved.
NetBSD 5.0_STABLE (GENERIC) #0: Tue Jun 2 21:21:05 UTC 2009
    builds%b3.netbsd.org@localhost:/home/builds/ab/netbsd-5/i386/200906020000Z-obj/home/builds/ab/netbsd-5/src/sys/arch/i386/compile/GENERIC
total memory = 1791 MB
avail memory = 1748 MB
timecounter: Timecounters tick every 10.000 msec
timecounter: Timecounter "i8254" frequency 1193182 Hz quality 100
System manufacturer System Product Name (System Version)
mainbus0 (root)
mainbus0: Intel MP Specification (Version 1.4) (ASUS )
cpu0 at mainbus0 apid 0: AMD 686-class, 2707MHz, id 0x100f23
cpu1 at mainbus0 apid 1: AMD 686-class, 2707MHz, id 0x100f23
mpbios: bus 0 is type PCI
mpbios: bus 1 is type PCI
mpbios: bus 2 is type PCI
mpbios: bus 3 is type PCI
mpbios: bus 4 is type ISA
ioapic0 at mainbus0 apid 2: pa 0xfec00000, version 21, 24 pins
pci0 at mainbus0 bus 0: configuration mode 1
pci0: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
pchb0 at pci0 dev 0 function 0
pchb0: vendor 0x1022 product 0x9600 (rev. 0x00)
ppb0 at pci0 dev 1 function 0: vendor 0x1043 product 0x9602 (rev. 0x00)
pci1 at ppb0 bus 1
pci1: i/o space, memory space enabled
vga1 at pci1 dev 5 function 0: vendor 0x1002 product 0x9610 (rev. 0x00)
wsdisplay0 at vga1 kbdmux 1: console (80x25, vt100 emulation)
wsmux1: connecting to wsdisplay0
drm at vga1 not configured
azalia0 at pci1 dev 5 function 1: Generic High Definition Audio Controller
azalia0: interrupting at ioapic0 pin 19
azalia0: host: 0x1002/0x960f (rev. 0), HDA rev. 1.0
ppb1 at pci0 dev 6 function 0: vendor 0x1022 product 0x9606 (rev. 0x00)
ppb1: unsupported PCI Express version
pci2 at ppb1 bus 2
pci2: i/o space, memory space enabled, rd/line, wr/inv ok
re0 at pci2 dev 0 function 0: RealTek 8168B/8111B PCIe Gigabit Ethernet (rev. 0x01)
re0: interrupting at ioapic0 pin 18
re0: Ethernet address 00:24:8c:c8:7f:9e
re0: using 256 tx descriptors
rgephy0 at re0 phy 7: RTL8169S/8110S/8211 1000BASE-T media interface, rev. 2
rgephy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
ixpide0 at pci0 dev 17 function 0
ixpide0: ATI Technologies IXP IDE Controller (rev. 0x00)
ixpide0: bus-master DMA support present
ixpide0: primary channel configured to native-PCI mode
ixpide0: using ioapic0 pin 22 for native-PCI interrupt
atabus0 at ixpide0 channel 0
ixpide0: secondary channel configured to native-PCI mode
atabus1 at ixpide0 channel 1
ohci0 at pci0 dev 18 function 0: vendor 0x1002 product 0x4397 (rev. 0x00)
ohci0: interrupting at ioapic0 pin 16
ohci0: OHCI version 1.0, legacy support
usb0 at ohci0: USB revision 1.0
ohci1 at pci0 dev 18 function 1: vendor 0x1002 product 0x4398 (rev. 0x00)
ohci1: interrupting at ioapic0 pin 16
ohci1: OHCI version 1.0, legacy support
usb1 at ohci1: USB revision 1.0
ehci0 at pci0 dev 18 function 2: vendor 0x1002 product 0x4396 (rev. 0x00)
ehci0: interrupting at ioapic0 pin 17
ehci0: dropped intr workaround enabled
ehci0: EHCI version 1.0
ehci0: companion controllers, 3 ports each: ohci0 ohci1
usb2 at ehci0: USB revision 2.0
ohci2 at pci0 dev 19 function 0: vendor 0x1002 product 0x4397 (rev. 0x00)
ohci2: interrupting at ioapic0 pin 18
ohci2: OHCI version 1.0, legacy support
usb3 at ohci2: USB revision 1.0
ohci3 at pci0 dev 19 function 1: vendor 0x1002 product 0x4398 (rev. 0x00)
ohci3: interrupting at ioapic0 pin 18
ohci3: OHCI version 1.0, legacy support
usb4 at ohci3: USB revision 1.0
ehci1 at pci0 dev 19 function 2: vendor 0x1002 product 0x4396 (rev. 0x00)
ehci1: interrupting at ioapic0 pin 19
ehci1: dropped intr workaround enabled
ehci1: EHCI version 1.0
ehci1: companion controllers, 3 ports each: ohci2 ohci3
usb5 at ehci1: USB revision 2.0
piixpm0 at pci0 dev 20 function 0
piixpm0: vendor 0x1002 product 0x4385 (rev. 0x3a)
piixpm0: interrupting at SMIpiixpm0: polling
iic0 at piixpm0: I2C bus
ixpide1 at pci0 dev 20 function 1
ixpide1: ATI Technologies IXP IDE Controller (rev. 0x00)
ixpide1: bus-master DMA support present
ixpide1: primary channel configured to compatibility mode
ixpide1: primary channel interrupting at ioapic0 pin 14
atabus2 at ixpide1 channel 0
ixpide1: secondary channel configured to compatibility mode
ixpide1: secondary channel interrupting at ioapic0 pin 15
atabus3 at ixpide1 channel 1
azalia1 at pci0 dev 20 function 2: Generic High Definition Audio Controller
azalia1: interrupting at ioapic0 pin 16
azalia1: host: 0x1002/0x4383 (rev. 0), HDA rev. 1.0
pcib0 at pci0 dev 20 function 3
pcib0: vendor 0x1002 product 0x439d (rev. 0x00)
ppb2 at pci0 dev 20 function 4: vendor 0x1002 product 0x4384 (rev. 0x00)
pci3 at ppb2 bus 3
pci3: i/o space enabled
ohci4 at pci0 dev 20 function 5: vendor 0x1002 product 0x4399 (rev. 0x00)
ohci4: interrupting at ioapic0 pin 18
ohci4: OHCI version 1.0, legacy support
usb6 at ohci4: USB revision 1.0
pchb1 at pci0 dev 24 function 0
pchb1: vendor 0x1022 product 0x1200 (rev. 0x00)
pchb2 at pci0 dev 24 function 1
pchb2: vendor 0x1022 product 0x1201 (rev. 0x00)
pchb3 at pci0 dev 24 function 2
pchb3: vendor 0x1022 product 0x1202 (rev. 0x00)
amdtemp0 at pci0 dev 24 function 3
amdtemp0: AMD CPU Temperature Sensors (Family10h / Family11h)
pchb4 at pci0 dev 24 function 4
pchb4: vendor 0x1022 product 0x1204 (rev. 0x00)
isa0 at pcib0
lpt0 at isa0 port 0x378-0x37b irq 7
com0 at isa0 port 0x3f8-0x3ff irq 4: ns16550a, working fifo
pckbc0 at isa0 port 0x60-0x64
pckbd0 at pckbc0 (kbd slot)
pckbc0: using irq 1 for kbd slot
wskbd0 at pckbd0: console keyboard, using wsdisplay0
attimer0 at isa0 port 0x40-0x43: AT Timer
pcppi0 at isa0 port 0x61
midi0 at pcppi0: PC speaker (CPU-intensive output)
sysbeep0 at pcppi0
npx0 at isa0 port 0xf0-0xff
npx0: reported by CPUID; using exception 16
attimer0: attached to pcppi0
timecounter: Timecounter "clockinterrupt" frequency 100 Hz quality 0
timecounter: Timecounter "TSC" frequency 2707877010 Hz quality 3000
azalia0: codec[0]: ATI RS690/780 HDMI (rev. 0.0), HDA rev. 1.0
audio0 at azalia0: full duplex, independent
azalia1: codec[0]: 0x10ec/0x0887 (rev. 2.2), HDA rev. 1.0
audio1 at azalia1: full duplex, independent
uhub0 at usb0: vendor 0x1002 OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 3 ports with 3 removable, self powered
uhub1 at usb1: vendor 0x1002 OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 3 ports with 3 removable, self powered
uhub2 at usb2: vendor 0x1002 EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
uhub2: 6 ports with 6 removable, self powered
uhub3 at usb3: vendor 0x1002 OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub3: 3 ports with 3 removable, self powered
uhub4 at usb4: vendor 0x1002 OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub4: 3 ports with 3 removable, self powered
uhub5 at usb5: vendor 0x1002 EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
uhub5: 6 ports with 6 removable, self powered
uhub6 at usb6: vendor 0x1002 OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub6: 2 ports with 2 removable, self powered
wd0 at atabus0 drive 0: <ST31000333AS>
wd0: quirks 2<FORCE_LBA48>
wd0: drive supports 16-sector PIO transfers, LBA48 addressing
wd0: 931 GB, 1938021 cyl, 16 head, 63 sec, 512 bytes/sect x 1953525168 sectors
wd0: 32-bit data port
wd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 6 (Ultra/133)
wd0(ixpide0:0:0): using PIO mode 4, Ultra-DMA mode 6 (Ultra/133) (using DMA)
wd1 at atabus2 drive 0: <FUJITSU MPF3102AT>
wd1: drive supports 16-sector PIO transfers, LBA addressing
wd1: 9773 MB, 19857 cyl, 16 head, 63 sec, 512 bytes/sect x 20015856 sectors
wd1: 32-bit data port
wd1: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 4 (Ultra/66)
wd1(ixpide1:0:0): using PIO mode 4, Ultra-DMA mode 4 (Ultra/66) (using DMA)
Kernelized RAIDframe activated
pad0: outputs: 44100Hz, 16-bit, stereo
audio2 at pad0: half duplex
boot device: wd1
root on wd1a dumps on wd1b
root file system type: ffs
wsdisplay0: screen 1 added (80x25, vt100 emulation)
wsdisplay0: screen 2 added (80x25, vt100 emulation)
wsdisplay0: screen 3 added (80x25, vt100 emulation)
wsdisplay0: screen 4 added (80x25, vt100 emulation)

Miles Nordin wrote:

"tj" == Thilo Jeremias <thilo%nispuk.com@localhost> writes:

tj> Asus M4A78-VM

for these Phenom boards, always run a BIOS that was released a couple
months after the CPU you are trying to use. I've had weird
intermittent problems on two different boards both fixed by doing
that. The software burned onto the board is always stale, even if
it's willing to boot the newer CPU. (problems like, crash at
inconsistent places during bootup, or randomly powerdown for no reason
after 1 - 7 days irrespective of load)

Other people on wikipedia and AOLeet gamerz forums may advise you that
the newest BIOS's on old boards tend to be the work of junior
developers and accumulate regressions, so it may be a balancing act
between rare intermittent problems and subtle regressions.

Another thing about which I've been trying to spread the word for AMD
users:

http://hyvatti.iki.fi/~jaakko/sw/
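
For anyone who wants to narrow down the dd result reported at the top of this message (buffered reads of wd0 are fine at bs=1040k but crash at bs=1200k), the following is a minimal sketch, not something that was run on the affected box. It steps the block size upward and writes each size to a log before the read starts, so that after a crash the last log line should name the size that was in flight. The function name `probe_bs`, the step size and the log path are all assumptions made for illustration; the log should live on a disk other than the one under test (the dmesg shows root on wd1a), and `sync` is only a best-effort flush.

```shell
#!/bin/sh
# Sketch only: probe_bs is a hypothetical helper, not part of NetBSD.
#
# probe_bs DEV FIRST_K LAST_K STEP_K LOG
#   DEV      device (or file) to read through the buffered interface
#   FIRST_K  first block size in KiB (for example 1040, reported as safe)
#   LAST_K   last block size in KiB (for example 1200, reported as crashing)
#   STEP_K   increment in KiB between attempts
#   LOG      log file, kept on a different disk than DEV
probe_bs() {
    dev=$1
    bs=$2
    last=$3
    step=$4
    log=$5
    while [ "$bs" -le "$last" ]; do
        # Record the size first, then flush, so the entry can survive a crash.
        echo "trying bs=${bs}k" >> "$log"
        sync
        # Same read as in the original tests; stop if dd itself reports an error.
        dd if="$dev" of=/dev/null bs="${bs}k" count=100 2>/dev/null || return 1
        bs=$((bs + step))
    done
}

# Example invocation (hypothetical log path, 16 KiB steps):
#   probe_bs /dev/wd0d 1040 1200 16 /root/probe_bs.log
```

If the machine goes down during the run, the last "trying bs=" line in the log gives an upper bound for the failing size, and the line before it gives the largest size that still completed.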