NetBSD-Bugs archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

kern/49186: Network hangs when using write cache in amr(4) RAID controller



>Number:         49186
>Category:       kern
>Synopsis:       Network hangs when using write cache in amr(4) RAID controller
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Mon Sep 08 17:00:00 +0000 2014
>Originator:     Tom Ivar Helbekkmo
>Release:        NetBSD 7.99.1 (2014-08-29)
>Organization:
>Environment:
NetBSD 7.99.1 (BARSOOM) #29: Sat Aug 30 17:02:35 CEST 2014
        
root%barsoom.hamartun.priv.no@localhost:/usr/obj/sys/arch/amd64/compile.amd64/BARSOOM
total memory = 4095 MB
avail memory = 3956 MB
kern.module.path=/stand/amd64/7.99.1/modules
timecounter: Timecounters tick every 10.000 msec
timecounter: Timecounter "i8254" frequency 1193182 Hz quality 100
Dell Computer Corporation PowerEdge 2850
mainbus0 (root)
ACPI: RSDP 0xfd5b0 000014 (v00 DELL  )
ACPI: RSDT 0xfd5c4 000038 (v01 DELL   PE BKC   00000001 MSFT 0100000A)
ACPI: FACP 0xfd620 000074 (v01 DELL   PE BKC   00000001 MSFT 0100000A)
ACPI: DSDT 0xbffc0000 003CCD (v01 DELL   PE BKC   00000001 MSFT 0100000E)
ACPI: FACS 0xbffcfc00 000040
ACPI: APIC 0xfd694 0000E0 (v01 DELL   PE BKC   00000001 MSFT 0100000A)
ACPI: SPCR 0xfd774 000050 (v01 DELL   PE BKC   00000001 MSFT 0100000A)
ACPI: HPET 0xfd7c4 000038 (v01 DELL   PE BKC   00000001 MSFT 0100000A)
ACPI: MCFG 0xfd7fc 00003C (v01 DELL   PE BKC   00000001 MSFT 0100000A)
ACPI: All ACPI Tables successfully acquired
cpu0 at mainbus0 apid 0: Intel(R) Xeon(TM) CPU 3.00GHz, id 0xf43
cpu1 at mainbus0 apid 6: Intel(R) Xeon(TM) CPU 3.00GHz, id 0xf43
cpu2 at mainbus0 apid 1: Intel(R) Xeon(TM) CPU 3.00GHz, id 0xf43
cpu3 at mainbus0 apid 7: Intel(R) Xeon(TM) CPU 3.00GHz, id 0xf43
ioapic0 at mainbus0 apid 8: pa 0xfec00000, version 0x20, 24 pins
ioapic1 at mainbus0 apid 9: pa 0xfec80000, version 0x20, 24 pins
ioapic2 at mainbus0 apid 10: pa 0xfec83000, version 0x20, 24 pins
ioapic3 at mainbus0 apid 11: pa 0xfec84000, version 0x20, 24 pins
acpi0 at mainbus0: Intel ACPICA 20131218
acpi0: X/RSDT: OemId <DELL  ,PE BKC  ,00000001>, AslId <MSFT,0100000a>
acpi0: SCI interrupting at int 9
timecounter: Timecounter "ACPI-Fast" frequency 3579545 Hz quality 1000
hpet0 at acpi0: high precision event timer (mem 0xfed00000-0xfed00400)
timecounter: Timecounter "hpet0" frequency 14318180 Hz quality 2000
pcppi1 at acpi0 (SPK, PNP0800): io 0x61
midi0 at pcppi1: PC speaker
sysbeep0 at pcppi1
attimer1 at acpi0 (TMR, PNP0100): io 0x40-0x5f irq 0
FDC (PNP0700) at acpi0 not configured
COMA (PNP0501) at acpi0 not configured
MBIO (PNP0C01) at acpi0 not configured
NIPM (IPI0001) at acpi0 not configured
acpivga0 at acpi0 (EVGA): ACPI Display Adapter
PEHB (PNP0C02) at acpi0 not configured
ACPI: Enabled 1 GPEs in block 00 to 1F
ACPI Exception: AE_NOT_FOUND, While evaluating Sleep State [\_S1_] 
(20131218/hwxface-646)
ACPI Exception: AE_NOT_FOUND, While evaluating Sleep State [\_S2_] 
(20131218/hwxface-646)
ACPI Exception: AE_NOT_FOUND, While evaluating Sleep State [\_S3_] 
(20131218/hwxface-646)
attimer1: attached to pcppi1
ipmi0 at mainbus0
pci0 at mainbus0 bus 0: configuration mode 1
pci0: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
pchb0 at pci0 dev 0 function 0: vendor 0x8086 product 0x3590 (rev. 0x09)
ppb0 at pci0 dev 2 function 0: vendor 0x8086 product 0x3595 (rev. 0x09)
ppb0: PCI Express 1.0 <Root Port of PCI-E Root Complex> x8 @ 2.5Gb/s
pci1 at ppb0 bus 1
pci1: i/o space, memory space enabled, rd/line, wr/inv ok
ppb1 at pci1 dev 0 function 0: vendor 0x8086 product 0x0330 (rev. 0x06)
ppb1: PCI Express 1.0 <PCI-E to PCI/PCI-X Bridge>
pci2 at ppb1 bus 2
pci2: i/o space, memory space enabled, rd/line, wr/inv ok
amr0 at pci2 dev 14 function 0: AMI RAID <PERC 4e/Di>
amr0: interrupting at ioapic1 pin 14
amr0: firmware 5B2D, BIOS H435, 256MB RAM
ld0 at amr0 unit 0: RAID 1, optimal
ld0: 69880 MB, 8908 cyl, 255 head, 63 sec, 512 bytes/sect x 143114240 sectors
ld1 at amr0 unit 1: RAID 1, optimal
ld1: 69880 MB, 8908 cyl, 255 head, 63 sec, 512 bytes/sect x 143114240 sectors
ld2 at amr0 unit 2: RAID 1, degraded
ld2: 136 GB, 17834 cyl, 255 head, 63 sec, 512 bytes/sect x 286515200 sectors
ppb2 at pci1 dev 0 function 2: vendor 0x8086 product 0x0332 (rev. 0x06)
ppb2: PCI Express 1.0 <PCI-E to PCI/PCI-X Bridge>
pci3 at ppb2 bus 3
pci3: i/o space, memory space enabled, rd/line, wr/inv ok
ppb3 at pci0 dev 4 function 0: vendor 0x8086 product 0x3597 (rev. 0x09)
ppb3: PCI Express 1.0 <Root Port of PCI-E Root Complex> x8 @ 2.5Gb/s
pci4 at ppb3 bus 4
pci4: i/o space, memory space enabled, rd/line, wr/inv ok
ppb4 at pci0 dev 5 function 0: vendor 0x8086 product 0x3598 (rev. 0x09)
ppb4: PCI Express 1.0 <Root Port of PCI-E Root Complex> x4 @ 2.5Gb/s
pci5 at ppb4 bus 5
pci5: i/o space, memory space enabled, rd/line, wr/inv ok
ppb5 at pci5 dev 0 function 0: vendor 0x8086 product 0x0329 (rev. 0x09)
ppb5: PCI Express 1.0 <PCI-E to PCI/PCI-X Bridge>
pci6 at ppb5 bus 6
pci6: i/o space, memory space enabled, rd/line, wr/inv ok
wm0 at pci6 dev 7 function 0: Intel i82541GI 1000BASE-T Ethernet (rev. 0x05)
wm0: interrupting at ioapic2 pin 0
wm0: 32-bit 66MHz PCI bus
wm0: 65536 word (16 address bits) SPI EEPROM
wm0: Ethernet address 00:13:72:f7:00:06
igphy0 at wm0 phy 1: Intel IGP01E1000 Gigabit PHY, rev. 0
igphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 
1000baseT-FDX, auto
ppb6 at pci5 dev 0 function 2: vendor 0x8086 product 0x032a (rev. 0x09)
ppb6: PCI Express 1.0 <PCI-E to PCI/PCI-X Bridge>
pci7 at ppb6 bus 7
pci7: i/o space, memory space enabled, rd/line, wr/inv ok
wm1 at pci7 dev 8 function 0: Intel i82541GI 1000BASE-T Ethernet (rev. 0x05)
wm1: interrupting at ioapic2 pin 1
wm1: 32-bit 66MHz PCI bus
wm1: 65536 word (16 address bits) SPI EEPROM
wm1: Ethernet address 00:13:72:f7:00:07
igphy1 at wm1 phy 1: Intel IGP01E1000 Gigabit PHY, rev. 0
igphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 
1000baseT-FDX, auto
ppb7 at pci0 dev 6 function 0: vendor 0x8086 product 0x3599 (rev. 0x09)
ppb7: PCI Express 1.0 <Root Port of PCI-E Root Complex> x8 @ 2.5Gb/s
pci8 at ppb7 bus 8
pci8: i/o space, memory space enabled, rd/line, wr/inv ok
ppb8 at pci8 dev 0 function 0: vendor 0x8086 product 0x0329 (rev. 0x09)
ppb8: PCI Express 1.0 <PCI-E to PCI/PCI-X Bridge>
pci9 at ppb8 bus 9
pci9: i/o space, memory space enabled, rd/line, wr/inv ok
ppb9 at pci8 dev 0 function 2: vendor 0x8086 product 0x032a (rev. 0x09)
ppb9: PCI Express 1.0 <PCI-E to PCI/PCI-X Bridge>
pci10 at ppb9 bus 10
pci10: i/o space, memory space enabled, rd/line, wr/inv ok
skc0 at pci10 dev 3 function 0: ioapic3 pin 5
skc0: interrupt moderation is 0 us
skc0: 3Com Gigabit NIC (3C2000) rev. (0x1)
sk0 at skc0 port A: Ethernet address 00:0a:5e:1a:a0:c3
makphy0 at sk0 phy 0: Marvell 88E1011 Gigabit PHY, rev. 3
makphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 
1000baseT-FDX, auto
uhci0 at pci0 dev 29 function 0: vendor 0x8086 product 0x24d2 (rev. 0x02)
uhci0: interrupting at ioapic0 pin 16
usb0 at uhci0: USB revision 1.0
uhci1 at pci0 dev 29 function 1: vendor 0x8086 product 0x24d4 (rev. 0x02)
uhci1: interrupting at ioapic0 pin 19
usb1 at uhci1: USB revision 1.0
uhci2 at pci0 dev 29 function 2: vendor 0x8086 product 0x24d7 (rev. 0x02)
uhci2: interrupting at ioapic0 pin 18
usb2 at uhci2: USB revision 1.0
ehci0 at pci0 dev 29 function 7: vendor 0x8086 product 0x24dd (rev. 0x02)
ehci0: interrupting at ioapic0 pin 23
ehci0: EHCI version 1.0
ehci0: companion controllers, 2 ports each: uhci0 uhci1 uhci2
usb3 at ehci0: USB revision 2.0
ppb10 at pci0 dev 30 function 0: vendor 0x8086 product 0x244e (rev. 0xc2)
pci11 at ppb10 bus 11
pci11: i/o space, memory space enabled
vendor 0x1028 product 0x0011 (undefined, subclass 0x00) at pci11 dev 5 function 
0 not configured
vendor 0x1028 product 0x0012 (undefined, subclass 0x00) at pci11 dev 5 function 
1 not configured
vendor 0x1028 product 0x0014 (undefined, subclass 0x00) at pci11 dev 5 function 
2 not configured
cmdide0 at pci11 dev 6 function 0: Silicon Image 0680 (rev. 0x02)
cmdide0: bus-master DMA support present
cmdide0: primary channel wired to native-PCI mode
cmdide0: using ioapic0 pin 23 for native-PCI interrupt
atabus0 at cmdide0 channel 0
cmdide0: secondary channel wired to native-PCI mode
atabus1 at cmdide0 channel 1
vga0 at pci11 dev 13 function 0: vendor 0x1002 product 0x5159 (rev. 0x00)
wsdisplay0 at vga0 kbdmux 1
wsmux1: connecting to wsdisplay0
radeondrm0 at vga0: ATI Radeon QY RV100 7000/VE
radeondrm0: Initialized radeon 1.29.0 20080613
ichlpcib0 at pci0 dev 31 function 0: vendor 0x8086 product 0x24d0 (rev. 0x02)
timecounter: Timecounter "ichlpcib0" frequency 3579545 Hz quality 1000
ichlpcib0: 24-bit timer
ichlpcib0: TCO (watchdog) timer configured.
piixide0 at pci0 dev 31 function 1: Intel 82801EB IDE Controller (ICH5) (rev. 
0x02)
piixide0: I/O access disabled at device
isa0 at ichlpcib0
com0 at isa0 port 0x3f8-0x3ff irq 4: ns16550a, working fifo
com0: console
pckbc0 at isa0 port 0x60-0x64
fdc0 at isa0 port 0x3f0-0x3f7 irq 6 drq 2
acpicpu0 at cpu0: ACPI CPU
acpicpu0: C1: HLT, lat   0 us, pow     0 mW
acpicpu1 at cpu1: ACPI CPU
acpicpu2 at cpu2: ACPI CPU
acpicpu3 at cpu3: ACPI CPU
timecounter: Timecounter "clockinterrupt" frequency 100 Hz quality 0
IPsec: Initialized Security Association Processing.
uhub0 at usb0: vendor 0x8086 UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhub1 at usb2: vendor 0x8086 UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
uhub2 at usb3: vendor 0x8086 EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
uhub2: 6 ports with 6 removable, self powered
atapibus0 at atabus0: 2 targets
uhub3 at usb1: vendor 0x8086 UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub3: 2 ports with 2 removable, self powered
sd0 at atapibus0 drive 0: <VIRTUALFLOPPY DRIVE               Flopp, , > disk 
removable
dk0 at ld2: Local/e
dk0: 286515137 blocks at 63, type: 4.2BSD
dk1 at ld0: System/a
dk1: 2104515 blocks at 63, type: 4.2BSD
dk2 at ld0: System/b
dk2: 16787925 blocks at 2104578, type: swap
dk3 at ld0: System/e
dk3: 90654795 blocks at 18892503, type: 4.2BSD
dk4 at ld0: System/f
dk4: 33566942 blocks at 109547298, type: 4.2BSD
sd0: drive offline
dk5 at ld1: User/e
dk5: 71557088 blocks at 63, type: 4.2BSD
dk6 at ld1: User/f
dk6: 71557089 blocks at 71557151, type: 4.2BSD
sd0: 32-bit data port
sd0: drive supports PIO mode 3
cd0 at atapibus0 drive 1: <VIRTUALCDROM DRIVE, , > cdrom removable
cd0: 32-bit data port
cd0: drive supports PIO mode 3
sd0(cmdide0:0:0): using PIO mode 3
cd0(cmdide0:0:1): using PIO mode 3
ehci0: handing over full speed device on port 1 to uhci0
uhub4 at uhub2 port 3: vendor 0x413c product 0xa001, class 9/0, rev 2.00/0.00, 
addr 2
uhub4: multiple transaction translators
uhub4: 2 ports with 2 removable, self powered
Kernelized RAIDframe activated
pad0: outputs: 44100Hz, 16-bit, stereo
audio0 at pad0: half duplex, playback, capture
boot device: ld0
root on dk1 dumps on dk2
root file system type: ffs
uhidev0 at uhub0 port 1 configuration 1 interface 0
uhidev0: Dell DRAC4, rev 1.10/0.00, addr 2, iclass 3/1
ukbd0 at uhidev0: 8 modifier keys, 6 key codes
wskbd0 at ukbd0 mux 1
wskbd0: connecting to wsdisplay0
uhidev1 at uhub0 port 1 configuration 1 interface 1
uhidev1: Dell DRAC4, rev 1.10/0.00, addr 2, iclass 3/1
ums0 at uhidev1: 3 buttons and Z dir
wsmouse0 at ums0 mux 0
ipmi0: version 1.5 interface KCS iobase 0xca8/8 spacing 4
wsdisplay0: screen 1 added (80x25, vt100 emulation)
wsdisplay0: screen 2 added (80x25, vt100 emulation)
wsdisplay0: screen 3 added (80x25, vt100 emulation)
wsdisplay0: screen 4 added (80x25, vt100 emulation)
>Description:
I'm running NetBSD/amd64-current on a Dell 2850 with an AMI PERC 4e/Di
RAID controller with 256MB of RAM, and three logical volumes, each
comprising two mirrored disks.  The above dmesg output shows three
network interfaces, but only wm0 is in use.  It also shows a degraded
logical disk (ld2), but the problem was present before that happened.
That particular volume is, in any case, mostly idle.

When enabling background writes in the cache configuration, heavy disk
activity leads to random network hangups, where all network activity
on the system will pause for anything from a few seconds to a couple
of minutes.  A moderate load (say, running a "-j 1" NetBSD build job)
will give less frequent, shorter hangs.  A heavier load (the same
build, "-j 4") will give more frequent hangups of varying, but
generally longer lengths.  The problem is sufficiently bad that the
system is, in practice, not acceptable for interactive work during a
"-j 4" build.
>How-To-Repeat:
Enable write-back cache, give system enough work, observe hangups.
>Fix:
Disabling write-back cache seems to be a reliable workaround.



Home | Main Index | Thread Index | Old Index