Current-Users archive

Re: Help needed - NetBSD 5.0 - generic Apr-26 fails in fsck ..



Good try (new BIOS), but it didn't help. After N+1 fscks on a 1 TB RAID, I decided to downgrade... (sigh)

But now I have an empty playbox which still crashes!

I was able to isolate the problem: only non-raw (block device) reads with a block size somewhere above 1 MB crash the system. The threshold lies between 1040k and 1200k:

dd if=/dev/wd0d  of=/dev/null bs=1040k count=100   --- DOES NOT CORE
dd if=/dev/wd0d  of=/dev/null bs=1200k count=100   --- DOES CORE !!!!
dd if=/dev/rwd0d of=/dev/null bs=1200k count=100   --- is fine too (raw device)
dd if=/dev/wd1d  of=/dev/null bs=2000k count=100   --- is fine (P-ATA drive)

wd0 is a 1 TB SATA drive (ffs, no RAID).
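
To narrow the threshold between the 1040k and 1200k cases above, something like the following might help. It is only a dry-run sketch: the function name gen_dd_cmds and the 40k step are made up for illustration, and it prints the dd commands rather than running them, since the failing one takes the machine down.

```shell
#!/bin/sh
# Dry-run helper (hypothetical name): print one dd command per block size
# between LO and HI (in units of k), stepping by STEP. Nothing is executed.
gen_dd_cmds() {
    dev=$1
    lo=$2
    hi=$3
    step=$4
    bs=$lo
    while [ "$bs" -le "$hi" ]; do
        # Same form as the tests above: buffered read, 100 blocks, discard data.
        echo "dd if=$dev of=/dev/null bs=${bs}k count=100"
        bs=$((bs + step))
    done
}

# Assumed range and step: 1040k is known good, 1200k is known bad.
gen_dd_cmds /dev/wd0d 1040 1200 40
```

This prints five commands (1040k, 1080k, 1120k, 1160k, 1200k). Running them one at a time, smallest first, with a sync beforehand, should show where the crash starts; the range can then be narrowed and the step reduced.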

Does any expert have any ideas?
(I used the 5_0 ISO from ftp.netbsd)
I have a core dump of the crashed system (ask and I will put it somewhere). Below is a dmesg from before the crash.
It very reliably produces "ohci4: 8 scheduling overruns" messages before crashing.

So I disagree with the flaky memory timing hypothesis; I think it is a bug in the SATA/kernel area.
(If someone wants access, I can open an ssh port to the box, and I can install a serial port redirector for the console plus a CVS checkout.)

Thanks for any help.

thilo

Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005,
    2006, 2007, 2008
    The NetBSD Foundation, Inc.  All rights reserved.
Copyright (c) 1982, 1986, 1989, 1991, 1993
    The Regents of the University of California.  All rights reserved.

NetBSD 5.0_STABLE (GENERIC) #0: Tue Jun  2 21:21:05 UTC 2009
    builds%b3.netbsd.org@localhost:/home/builds/ab/netbsd-5/i386/200906020000Z-obj/home/builds/ab/netbsd-5/src/sys/arch/i386/compile/GENERIC
total memory = 1791 MB
avail memory = 1748 MB
timecounter: Timecounters tick every 10.000 msec
timecounter: Timecounter "i8254" frequency 1193182 Hz quality 100
System manufacturer System Product Name (System Version)
mainbus0 (root)
mainbus0: Intel MP Specification (Version 1.4) (ASUS                 )
cpu0 at mainbus0 apid 0: AMD 686-class, 2707MHz, id 0x100f23
cpu1 at mainbus0 apid 1: AMD 686-class, 2707MHz, id 0x100f23
mpbios: bus 0 is type PCI  
mpbios: bus 1 is type PCI  
mpbios: bus 2 is type PCI  
mpbios: bus 3 is type PCI  
mpbios: bus 4 is type ISA  
ioapic0 at mainbus0 apid 2: pa 0xfec00000, version 21, 24 pins
pci0 at mainbus0 bus 0: configuration mode 1
pci0: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
pchb0 at pci0 dev 0 function 0
pchb0: vendor 0x1022 product 0x9600 (rev. 0x00)
ppb0 at pci0 dev 1 function 0: vendor 0x1043 product 0x9602 (rev. 0x00)
pci1 at ppb0 bus 1
pci1: i/o space, memory space enabled
vga1 at pci1 dev 5 function 0: vendor 0x1002 product 0x9610 (rev. 0x00)
wsdisplay0 at vga1 kbdmux 1: console (80x25, vt100 emulation)
wsmux1: connecting to wsdisplay0
drm at vga1 not configured
azalia0 at pci1 dev 5 function 1: Generic High Definition Audio Controller
azalia0: interrupting at ioapic0 pin 19
azalia0: host: 0x1002/0x960f (rev. 0), HDA rev. 1.0
ppb1 at pci0 dev 6 function 0: vendor 0x1022 product 0x9606 (rev. 0x00)
ppb1: unsupported PCI Express version
pci2 at ppb1 bus 2
pci2: i/o space, memory space enabled, rd/line, wr/inv ok
re0 at pci2 dev 0 function 0: RealTek 8168B/8111B PCIe Gigabit Ethernet (rev. 0x01)
re0: interrupting at ioapic0 pin 18
re0: Ethernet address 00:24:8c:c8:7f:9e
re0: using 256 tx descriptors
rgephy0 at re0 phy 7: RTL8169S/8110S/8211 1000BASE-T media interface, rev. 2
rgephy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
ixpide0 at pci0 dev 17 function 0
ixpide0: ATI Technologies IXP IDE Controller (rev. 0x00)
ixpide0: bus-master DMA support present
ixpide0: primary channel configured to native-PCI mode
ixpide0: using ioapic0 pin 22 for native-PCI interrupt
atabus0 at ixpide0 channel 0
ixpide0: secondary channel configured to native-PCI mode
atabus1 at ixpide0 channel 1
ohci0 at pci0 dev 18 function 0: vendor 0x1002 product 0x4397 (rev. 0x00)
ohci0: interrupting at ioapic0 pin 16
ohci0: OHCI version 1.0, legacy support
usb0 at ohci0: USB revision 1.0
ohci1 at pci0 dev 18 function 1: vendor 0x1002 product 0x4398 (rev. 0x00)
ohci1: interrupting at ioapic0 pin 16
ohci1: OHCI version 1.0, legacy support
usb1 at ohci1: USB revision 1.0
ehci0 at pci0 dev 18 function 2: vendor 0x1002 product 0x4396 (rev. 0x00)
ehci0: interrupting at ioapic0 pin 17
ehci0: dropped intr workaround enabled
ehci0: EHCI version 1.0
ehci0: companion controllers, 3 ports each: ohci0 ohci1
usb2 at ehci0: USB revision 2.0
ohci2 at pci0 dev 19 function 0: vendor 0x1002 product 0x4397 (rev. 0x00)
ohci2: interrupting at ioapic0 pin 18
ohci2: OHCI version 1.0, legacy support
usb3 at ohci2: USB revision 1.0
ohci3 at pci0 dev 19 function 1: vendor 0x1002 product 0x4398 (rev. 0x00)
ohci3: interrupting at ioapic0 pin 18
ohci3: OHCI version 1.0, legacy support
usb4 at ohci3: USB revision 1.0
ehci1 at pci0 dev 19 function 2: vendor 0x1002 product 0x4396 (rev. 0x00)
ehci1: interrupting at ioapic0 pin 19
ehci1: dropped intr workaround enabled
ehci1: EHCI version 1.0
ehci1: companion controllers, 3 ports each: ohci2 ohci3
usb5 at ehci1: USB revision 2.0
piixpm0 at pci0 dev 20 function 0
piixpm0: vendor 0x1002 product 0x4385 (rev. 0x3a)
piixpm0: interrupting at SMIpiixpm0: polling
iic0 at piixpm0: I2C bus
ixpide1 at pci0 dev 20 function 1
ixpide1: ATI Technologies IXP IDE Controller (rev. 0x00)
ixpide1: bus-master DMA support present
ixpide1: primary channel configured to compatibility mode
ixpide1: primary channel interrupting at ioapic0 pin 14
atabus2 at ixpide1 channel 0
ixpide1: secondary channel configured to compatibility mode
ixpide1: secondary channel interrupting at ioapic0 pin 15
atabus3 at ixpide1 channel 1
azalia1 at pci0 dev 20 function 2: Generic High Definition Audio Controller
azalia1: interrupting at ioapic0 pin 16
azalia1: host: 0x1002/0x4383 (rev. 0), HDA rev. 1.0
pcib0 at pci0 dev 20 function 3
pcib0: vendor 0x1002 product 0x439d (rev. 0x00)
ppb2 at pci0 dev 20 function 4: vendor 0x1002 product 0x4384 (rev. 0x00)
pci3 at ppb2 bus 3
pci3: i/o space enabled
ohci4 at pci0 dev 20 function 5: vendor 0x1002 product 0x4399 (rev. 0x00)
ohci4: interrupting at ioapic0 pin 18
ohci4: OHCI version 1.0, legacy support
usb6 at ohci4: USB revision 1.0
pchb1 at pci0 dev 24 function 0
pchb1: vendor 0x1022 product 0x1200 (rev. 0x00)
pchb2 at pci0 dev 24 function 1
pchb2: vendor 0x1022 product 0x1201 (rev. 0x00)
pchb3 at pci0 dev 24 function 2
pchb3: vendor 0x1022 product 0x1202 (rev. 0x00)
amdtemp0 at pci0 dev 24 function 3
amdtemp0: AMD CPU Temperature Sensors (Family10h / Family11h)
pchb4 at pci0 dev 24 function 4
pchb4: vendor 0x1022 product 0x1204 (rev. 0x00)
isa0 at pcib0
lpt0 at isa0 port 0x378-0x37b irq 7
com0 at isa0 port 0x3f8-0x3ff irq 4: ns16550a, working fifo
pckbc0 at isa0 port 0x60-0x64
pckbd0 at pckbc0 (kbd slot)
pckbc0: using irq 1 for kbd slot
wskbd0 at pckbd0: console keyboard, using wsdisplay0
attimer0 at isa0 port 0x40-0x43: AT Timer
pcppi0 at isa0 port 0x61
midi0 at pcppi0: PC speaker (CPU-intensive output)
sysbeep0 at pcppi0
npx0 at isa0 port 0xf0-0xff
npx0: reported by CPUID; using exception 16
attimer0: attached to pcppi0
timecounter: Timecounter "clockinterrupt" frequency 100 Hz quality 0
timecounter: Timecounter "TSC" frequency 2707877010 Hz quality 3000
azalia0: codec[0]: ATI RS690/780 HDMI (rev. 0.0), HDA rev. 1.0
audio0 at azalia0: full duplex, independent
azalia1: codec[0]: 0x10ec/0x0887 (rev. 2.2), HDA rev. 1.0
audio1 at azalia1: full duplex, independent
uhub0 at usb0: vendor 0x1002 OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 3 ports with 3 removable, self powered
uhub1 at usb1: vendor 0x1002 OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 3 ports with 3 removable, self powered
uhub2 at usb2: vendor 0x1002 EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
uhub2: 6 ports with 6 removable, self powered
uhub3 at usb3: vendor 0x1002 OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub3: 3 ports with 3 removable, self powered
uhub4 at usb4: vendor 0x1002 OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub4: 3 ports with 3 removable, self powered
uhub5 at usb5: vendor 0x1002 EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
uhub5: 6 ports with 6 removable, self powered
uhub6 at usb6: vendor 0x1002 OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub6: 2 ports with 2 removable, self powered
wd0 at atabus0 drive 0: <ST31000333AS>
wd0: quirks 2<FORCE_LBA48>
wd0: drive supports 16-sector PIO transfers, LBA48 addressing
wd0: 931 GB, 1938021 cyl, 16 head, 63 sec, 512 bytes/sect x 1953525168 sectors
wd0: 32-bit data port
wd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 6 (Ultra/133)
wd0(ixpide0:0:0): using PIO mode 4, Ultra-DMA mode 6 (Ultra/133) (using DMA)
wd1 at atabus2 drive 0: <FUJITSU MPF3102AT>
wd1: drive supports 16-sector PIO transfers, LBA addressing
wd1: 9773 MB, 19857 cyl, 16 head, 63 sec, 512 bytes/sect x 20015856 sectors
wd1: 32-bit data port
wd1: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 4 (Ultra/66)
wd1(ixpide1:0:0): using PIO mode 4, Ultra-DMA mode 4 (Ultra/66) (using DMA)
Kernelized RAIDframe activated
pad0: outputs: 44100Hz, 16-bit, stereo
audio2 at pad0: half duplex
boot device: wd1
root on wd1a dumps on wd1b
root file system type: ffs
wsdisplay0: screen 1 added (80x25, vt100 emulation)
wsdisplay0: screen 2 added (80x25, vt100 emulation)
wsdisplay0: screen 3 added (80x25, vt100 emulation)
wsdisplay0: screen 4 added (80x25, vt100 emulation)

Miles Nordin wrote:
"tj" == Thilo Jeremias <thilo%nispuk.com@localhost> writes:
            

    tj> Asus M4A78-VM 

for these Phenom boards, always run a BIOS that was released a couple
months after the CPU you are trying to use.  I've had weird
intermittent problems on two different boards both fixed by doing
that.  The software burned onto the board is always stale, even if
it's willing to boot the newer CPU.

(problems like, crash at inconsistent places during bootup, or
randomly powerdown for no reason after 1 - 7 days irrespective of
load)

Other people on wikipedia and AOLeet gamerz forums may advise you that
the newest BIOS's on old boards tend to be the work of junior
developers and accumulate regressions, so it may be a balancing act
between rare intermittent problems and subtle regressions.

Another thing about which I've been trying to spread the word for AMD
users:

  http://hyvatti.iki.fi/~jaakko/sw/
  

