-----Original Message-----
From: Hisashi T Fujinaka [mailto:htodd%twofifty.com@localhost]
Sent: Saturday, August 30, 2014 21:29
To: Terry Moore
Cc: tech-kern%netbsd.org@localhost
Subject: Re: FW: ixg(4) performances
> Doesn't anyone read my posts or, more important, the PCIe spec?
> 2.5 Giga TRANSFERS per second.
I'm not sure I understand what you're saying.
From the PCIe spec, page 40:
"Signaling rate - Once initialized, each Link must only operate at one of
the supported signaling levels. For the first generation of PCI Express
technology, there is only one signaling rate defined, which provides an
effective 2.5 Gigabits/second/Lane/direction of raw bandwidth. The second
generation provides an effective 5.0 Gigabits/second/Lane/direction of raw
bandwidth. The third generation provides an effective 8.0
Gigabits/second/Lane/direction of raw bandwidth. The data rate is expected
to increase with technology advances in the future."
This is not 2.5G Transfers per second. PCIe talks about transactions rather
than transfers; one transaction requires either 12 bytes (for 32-bit
systems) or 16 bytes (for 64-bit systems) of overhead at the transaction
layer, plus 7 bytes at the link layer.
Paradoxically, maximizing the number of transactions per second moves the
fewest bytes. On one gen1 lane (250,000,000 bytes per second after 8b/10b
encoding), a 4K write takes 16+4096+5+2 byte times, so only about 60,000
such transactions are possible per second (moving about 248,000,000 bytes).
[Real systems don't quite see this -- Wikipedia claims, for example, that
95% efficiency is typical for storage controllers.]
A 4-byte write takes 16+4+5+2 byte times, and so roughly 9 million
transactions are possible per second, but those 9 million transactions can
only move 36 million bytes.
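The arithmetic above can be checked with a short sketch. It assumes one
gen1 lane with 8b/10b encoding (so 2.5 Gbit/s carries 250,000,000 bytes per
second) and the 64-bit overhead figures quoted above; the names are made up
for illustration only.

```python
# Assumption (not stated explicitly in this message): 8b/10b line coding,
# so a 2.5 Gbit/s gen1 lane carries 250,000,000 bytes per second.
LANE_BYTES_PER_SEC = 2_500_000_000 // 10

TLP_OVERHEAD = 16       # transaction-layer overhead, 64-bit systems
LINK_OVERHEAD = 5 + 2   # link-layer overhead (7 bytes)


def write_rate(payload_bytes):
    """Return (transactions/s, payload bytes/s) for writes on one gen1 lane."""
    wire_bytes = TLP_OVERHEAD + payload_bytes + LINK_OVERHEAD
    tps = LANE_BYTES_PER_SEC / wire_bytes
    return tps, tps * payload_bytes


for size in (4096, 4):
    tps, bps = write_rate(size)
    print(f"{size:5d}-byte write: {tps:12,.0f} transactions/s, "
          f"{bps:14,.0f} payload bytes/s")
```

Unrounded, the 4-byte case comes out nearer 9.26 million transactions and
37 million bytes per second; either way it is nowhere near 2.5 G.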
Multiple lanes scale things fairly linearly. But there has to be one byte
per lane; in a x8 configuration, physical transfers are padded, so each
4-byte write (which takes 27 bytes on the bus) has to take 32 bytes.
Instead of getting 72 million transactions per second, you get 62.5
million transactions/second, so it doesn't scale as nicely.
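A sketch of the padding effect, modeling "one byte per lane" as rounding
each packet up to a multiple of the lane count. That model is my reading of
the paragraph above, and the 250,000,000 bytes/second/lane figure again
assumes gen1 with 8b/10b encoding; the function name is hypothetical.

```python
import math

# Assumption: gen1 lane after 8b/10b coding.
LANE_BYTES_PER_SEC = 250_000_000


def padded_write_rate(payload_bytes, lanes, overhead=16 + 5 + 2):
    """Return (padded packet size, transactions/s) on a link of `lanes` lanes.

    Simplified model: every packet is rounded up to a whole multiple of
    the lane count, so each lane carries the same number of bytes.
    """
    wire_bytes = overhead + payload_bytes
    padded = math.ceil(wire_bytes / lanes) * lanes
    return padded, lanes * LANE_BYTES_PER_SEC / padded


padded, tps = padded_write_rate(4, 8)
print(f"x8, 4-byte write: {padded} bytes on the bus, {tps:,.0f} transactions/s")
```

With these inputs the 27-byte packet pads to 32 bytes and the x8 link tops
out at 62.5 million transactions per second, matching the figure above.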
Reads are harder to analyze, because they depend on the speed and design of
both ends of the link. The reader sends a read request packet, and the
read-responder (some time later) sends back the response.
As far as I can see, even at gen3 with lots of lanes, PCIe doesn't scale to
2.5 G transfers per second.
Best regards,
--Terry