tech-kern archive


RE: ixg(4) performances



 
> -----Original Message-----
> From: tech-kern-owner%NetBSD.org@localhost 
> [mailto:tech-kern-owner%NetBSD.org@localhost] On
> Behalf Of Emmanuel Dreyfus
> Sent: Thursday, August 28, 2014 23:55
> To: Terry Moore; 'Christos Zoulas'
> Cc: tech-kern%netbsd.org@localhost
> Subject: Re: ixg(4) performances
> 
> Terry Moore <tmm%mcci.com@localhost> wrote:
> 
> > There are several possibilities, all revolving around differences
> > between the blog poster's base system and yours.
> 
> Do I have a way to investigate for appropriate PCI setup? Here is what
> dmesg says about it:
> 
> pci0 at mainbus0 bus 0: configuration mode 1
> pci0: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
> ppb4 at pci0 dev 14 function 0: vendor 0x10de product 0x005d (rev. 0xa3)
> ppb4: PCI Express 1.0 <Root Port of PCI-E Root Complex>
> pci5 at ppb4 bus 5
> pci5: i/o space, memory space enabled, rd/line, wr/inv ok
> ixg1 at pci5 dev 0 function 1: Intel(R) PRO/10GbE PCI-Express Network
> Driver, Version - 2.3.10

I don't do PCIe on NetBSD -- these days we use NetBSD exclusively as a VM
guest -- so I don't know what tools are available. Normally when doing this
kind of thing I poke around with a debugger or the equivalent of pcictl.

The dmesg output tells us that your ixg is directly connected to an Nvidia
root complex. So there are no bridges, but this may be a relevant
difference from the benchmark system: it's more common to be connected to
an Intel southbridge chip of some kind.

Next step would be to check the documentation on, and the configuration of,
the root complex -- it must also be configured for 4K reads (the read will
be launched by the ixg, buffered in the root complex, forwarded to the
memory controller, and then the answers will come back).

(PCIe is very fast at the bit transfer level, but pretty slow in terms of
read transfers per second. Read transfer latency is on the order of 2.5
microseconds / operation. This is why 4K transfers are so important in this
application.)
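A back-of-the-envelope calculation makes the point, taking the 2.5
microsecond per-operation figure above and ignoring all other overhead
(a sketch, not a bus model -- real links pipeline multiple outstanding
reads):

```python
# Effective read throughput when each read request costs a fixed
# round-trip latency and reads are issued one at a time, back to back.
# The 2.5 us figure is the per-operation estimate quoted above.
LATENCY_S = 2.5e-6  # seconds per read operation

def read_throughput_bytes_per_s(request_size_bytes: int) -> float:
    """Bytes per second moved by serialized reads of the given size."""
    return request_size_bytes / LATENCY_S

# 4K reads give the link a fighting chance at 10GbE rates; 256-byte
# reads at the same latency cap out around 0.8 Gbit/s.
for size in (256, 512, 4096):
    gbps = read_throughput_bytes_per_s(size) * 8 / 1e9
    print(f"{size:5d}-byte reads -> {gbps:6.2f} Gbit/s")
```

With 4096-byte reads that works out to roughly 1.6 GB/s (about 13
Gbit/s), which is why the read request size matters so much here.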

Anyway, there are multiple vendors involved (Intel, Nvidia, and your BIOS
maker -- the BIOS is responsible for setting things like the maximum read
size across the bus; I'm speaking loosely, but basically configuration
software has to set these things up, because the individual devices don't
have enough knowledge on their own). So that alone may explain things.
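For reference, the sizes the BIOS negotiates live in the Device Control
register (offset 0x08 in the PCI Express capability structure); both
fields encode a size as 128 << n bytes per the PCIe spec. A sketch of
decoding a raw 16-bit value (the sample value below is hypothetical):

```python
# Decode the Max_Payload_Size and Max_Read_Request_Size fields of the
# PCI Express Device Control register (offset 0x08 in the PCIe
# capability structure). Each 3-bit field encodes a size of 128 << n.

def decode_device_control(devctl: int) -> dict:
    max_payload = 128 << ((devctl >> 5) & 0x7)     # bits 7:5
    max_read_req = 128 << ((devctl >> 12) & 0x7)   # bits 14:12
    return {"max_payload": max_payload, "max_read_request": max_read_req}

# Hypothetical readout: Max_Read_Request_Size encoded as 101b (4096
# bytes) and Max_Payload_Size as 001b (256 bytes).
print(decode_device_control(0x5020))
```

If a dump of the ixg's or root complex's Device Control register shows a
Max_Read_Request_Size below 4096, that would line up with the theory above.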

Still, you should check whether you have the right number of the right
generation of PCIe lanes connected to the ixg. If you look at the manual,
normally there's an obscure register that tells you how many lanes are
connected, and what generation. On the motherboards we use, each slot is
different, and it's not always obvious how the slots differ. Rather than
depending on documentation and the good intentions of the motherboard
designers, I always feel better checking what the chip in question thinks
about the number of lanes and the link speed.
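The register in question is the Link Status register (offset 0x12 in the
PCIe capability): bits 3:0 hold the current link speed encoding and bits
9:4 the negotiated link width. A sketch of decoding it from a raw value
(the sample readout is hypothetical):

```python
# Decode the PCI Express Link Status register (offset 0x12 in the PCIe
# capability): bits 3:0 are the current link speed encoding, bits 9:4
# the negotiated link width in lanes.
SPEEDS = {1: "2.5 GT/s (gen1)", 2: "5.0 GT/s (gen2)", 3: "8.0 GT/s (gen3)"}

def decode_link_status(lnksta: int):
    speed = lnksta & 0xF            # bits 3:0
    width = (lnksta >> 4) & 0x3F    # bits 9:4
    return SPEEDS.get(speed, f"encoding {speed}"), width

# Hypothetical readout: a card that trained to gen1 x4 in a slot that is
# physically x8 would report speed encoding 1, width 4.
speed, width = decode_link_status(0x0041)
print(f"link: x{width} at {speed}")
```

If the ixg reports fewer lanes or a lower generation than the slot is
supposed to provide, that by itself can account for a large throughput gap.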

Hope this helps,
--Terry



