Current-Users archive

Re: mfi(4) locking bug?



On Jul 12,  5:17am, Manuel Bouyer wrote:
} On Sun, Feb 13, 2011 at 12:09:49AM -0800, John Nemeth wrote:
} >      I'm working on a multicore machine with an mfi(4) in it.  If I try
} > to boot the installation CD normally, it typically hangs when running
} > newfs.  If I 'boot -12' (i.e. no SMP {or ACPI}) I can install the
} > system.  However, if I try to perform heavy disk operations (e.g.
} > pkgsrc checkout) on a system booted normally, it will hang at some
} > point.  Right now, the system is running with a drive connected to a
} > different controller where it has been doing things like checking out
} > and building packages, running build -j 30, etc.  for a week without
} > any issues.  Only difference between system that works and system that
} > doesn't is the disk controller in use.  Details on the mfi(4)
} > controller are:
} > 
} > # pcictl /dev/pci1 list
} > 004:00:0: Symbios Logic SAS1078 PCI (RAID mass storage, revision 0x04)
} > # grep mfi /var/run/dmesg.boot
} > mfi0 at pci1 dev 0 function 0
} > mfi0: interrupting at ioapic1 pin 20
} > mfi0: logical drives 1, version 8.0.1-0038, 256MB RAM
} > scsibus0 at mfi0: 64 targets, 8 luns per target
} > 
} > Is anybody aware of locking issues in mfi(4) or have any suggestions
} > for debugging this problem?
} 
} Which version of NetBSD is it?
} I have some SMP (4 or 8 cores) Dell servers with MFI controllers,
} no issues so far with 5.0_STABLE or 5.1_STABLE.
} There were some issues in the 5.0 release which have been fixed since then.

     Started with 5.1, then moved to a -current build from nyftp:

NetBSD 5.99.45 (GENERIC) #0: Sun Feb  6 02:33:30 UTC 2011

Currently has:

NetBSD fibrenetvm 5.99.45 NetBSD 5.99.45 (GENERIC) #0: Sun Feb 13 23:15:35 PST 2011  jnemeth@fibrenetvm:/usr/src/sys/arch/amd64/compile/GENERIC amd64

Not currently running off the mfi, but a simple 'disklabel sd0' results
in the process hanging somewhere inside the driver.

     Another possibility that came to mind is that the issue is PCIe
related.  The relevant devices are:

pci0 at mainbus0 bus 0: configuration mode 1
pci0: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
pchb0 at pci0 dev 0 function 0: ATI Technologies RD890 North Bridge Dual Slot 2x16 GFX (rev. 0x02)
ppb0 at pci0 dev 4 function 0: ATI Technologies RD890 PCI Express Bridge GPP Port D (rev. 0x00)
ppb0: PCI Express 2.0 <Root Port of PCI-E Root Complex>
pci1 at ppb0 bus 4
pci1: i/o space, memory space enabled, rd/line, wr/inv ok
mfi0 at pci1 dev 0 function 0
mfi0: interrupting at ioapic1 pin 20
mfi0: logical drives 1, version 8.0.1-0038, 256MB RAM
scsibus0 at mfi0: 64 targets, 8 luns per target

     Hrmm, I also see this:

ioapic0 at mainbus0 apid 28: pa 0xfec00000, version 21, 24 pins
ioapic0: can't remap to apid 28
ioapic1 at mainbus0 apid 29: pa 0xfec20000, version 21, 32 pins
ioapic1: can't remap to apid 29

     I don't know whether that's an issue or not, but the ethernet
controllers and ixpide are working fine.  Actually, come to think of it,
I did have some trouble with one of the ixpides:

ixpide0 at pci0 dev 17 function 0: ATI Technologies IXP IDE Controller (rev. 0x00)
ixpide0: bus-master DMA support present
ixpide0: primary channel configured to native-PCI mode
ixpide0: using ioapic0 pin 22 for native-PCI interrupt
atabus0 at ixpide0 channel 0
ixpide0: secondary channel configured to native-PCI mode
atabus1 at ixpide0 channel 1

Currently running with an SSD attached to atabus2:

ixpide1 at pci0 dev 20 function 1: ATI Technologies IXP IDE Controller (rev. 0x00)
ixpide1: bus-master DMA support present
ixpide1: primary channel configured to compatibility mode
ixpide1: primary channel interrupting at ioapic0 pin 14
atabus2 at ixpide1 channel 0
ixpide1: secondary channel configured to compatibility mode
ixpide1: secondary channel interrupting at ioapic0 pin 15
atabus3 at ixpide1 channel 1
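
     To compare the interrupt routing in the excerpts above at a
glance, something like the following could be used.  This is only a
rough sketch: the function names group_by_pin and shared_pins are made
up for illustration, and the regular expression assumes dmesg lines of
the form "devN: ... ioapicM pin K", as seen above.  It shows which
devices land on the same ioapic pin and says nothing about whether the
driver's locking is at fault.

```python
import re
from collections import defaultdict

# Matches lines such as "mfi0: interrupting at ioapic1 pin 20" or
# "ixpide0: using ioapic0 pin 22 for native-PCI interrupt".
PIN_RE = re.compile(r"^(\w+):.*\b(ioapic\d+) pin (\d+)")


def group_by_pin(dmesg_text):
    """Map (ioapic, pin) to the list of device names reported on that pin."""
    pins = defaultdict(list)
    for line in dmesg_text.splitlines():
        m = PIN_RE.match(line.strip())
        if m:
            dev, apic, pin = m.group(1), m.group(2), int(m.group(3))
            if dev not in pins[(apic, pin)]:
                pins[(apic, pin)].append(dev)
    return dict(pins)


def shared_pins(pins):
    """Keep only the pins that more than one device reports using."""
    return {key: devs for key, devs in pins.items() if len(devs) > 1}


# Lines copied from the dmesg excerpts in this message.
SAMPLE = """\
mfi0: interrupting at ioapic1 pin 20
ixpide0: using ioapic0 pin 22 for native-PCI interrupt
ixpide1: primary channel interrupting at ioapic0 pin 14
ixpide1: secondary channel interrupting at ioapic0 pin 15
"""

if __name__ == "__main__":
    pins = group_by_pin(SAMPLE)
    for (apic, pin), devs in sorted(pins.items()):
        print(apic, "pin", pin, "->", ", ".join(devs))
    print("shared:", shared_pins(pins))
```

     Feeding it the full contents of /var/run/dmesg.boot instead of
SAMPLE would cover devices not quoted here.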

}-- End of excerpt from Manuel Bouyer

