tech-kern archive
Re: Where is the component queue depth actually used in the raidframe system?
On Sat, 16 Mar 2013 21:56:43 -0700
Brian Buhrow <buhrow%nfbcal.org@localhost> wrote:
> On Mar 14, 8:47am, Greg Oster wrote:
> } Subject: Re: Where is the component queue depth actually used in the raidf
> } On Thu, 14 Mar 2013 10:32:26 -0400
> } Thor Lancelot Simon <tls%panix.com@localhost> wrote:
> }
> } > On Wed, Mar 13, 2013 at 09:36:07PM -0400, Thor Lancelot Simon wrote:
> } > > On Wed, Mar 13, 2013 at 03:32:02PM -0700, Brian Buhrow wrote:
> } > > > hello. What I'm seeing is that the underlying disks under both a
> } > > > raid1 set and a raid5 set are not seeing any more than 8 active
> } > > > requests at once across the entire bus of disks. This leaves a
> } > > > lot of disk bandwidth unused, not to mention less than stellar
> } > > > disk performance. I see that RAIDOUTSTANDING is defined as 6 if
> } > > > not otherwise defined, and this suggests that this is the
> } > > > limiting factor, rather than the actual number of requests
> } > > > allowed to be sent to a component's queue.
> } > >
> } > > It should be the sum of the number of openings on the underlying
> } > > components, divided by the number of data disks in the set. Well,
> } > > roughly. Getting it just right is a little harder than that, but
> } > > I think it's obvious how.
> } >
> } > Actually, I think the simplest correct answer is that it should be
> } > the minimum number of openings presented by any individual
> } > underlying component. I cannot see any good reason why it should be
> } > either more or less than that value.
> }
> } Consider the case when a read spans two stripes... Unfortunately,
> } each of those reads will be done independently, requiring two IOs for
> } a given disk, even though there is only one request.
> }
> } The reason '6' was picked back in the day was that it seemed to offer
> } reasonable performance while not requiring a huge amount of memory to
> } be reserved for the kernel. And part of the issue there was that
> } RAIDframe had no way to stop new requests from coming in and consuming
> } all kernel resources :( '6' is probably a reasonable hack for older
> } machines, but if we can come up with something self-tuning I'm all for
> } it... (Having this self-tuning is going to be even more critical when
> } MAXPHYS gets sent to the bitbucket and the amount of memory needed for
> } a given IO increases...)
> }
> } Later...
> }
> } Greg Oster
>
> Hello. If I understand Thor's formula right, then for a raid set
> I have (raid5) with 4 components, each on a wd(ata) disk, the
> correct number of outstanding requests should be limited to 4, because
> it looks like our ata drivers only present 1 opening per channel.
> However, increasing the outstanding requests on this box from 6,
> which is already too high according to the formula as I understand
> it, to 20, increases the disk throughput on this machine by almost
> 50% for many of the workloads I put on it.
Yum! :)
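(For the archives, here's a rough sketch of the two heuristics Thor
describes above, in C. None of this is RAIDframe code -- the struct and
helper names are invented for the example.)

/* Illustration only -- not RAIDframe code. */
struct comp {
        int openings;           /* openings advertised by the underlying disk */
};

/* Heuristic 1: sum of component openings / number of data disks. */
static int
openings_by_sum(const struct comp *c, int ncomp, int ndata)
{
        int i, total = 0;

        for (i = 0; i < ncomp; i++)
                total += c[i].openings;
        return total / ndata;   /* "roughly" -- see above */
}

/* Heuristic 2: minimum openings presented by any single component. */
static int
openings_by_min(const struct comp *c, int ncomp)
{
        int i, min = c[0].openings;

        for (i = 1; i < ncomp; i++)
                if (c[i].openings < min)
                        min = c[i].openings;
        return min;
}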
> I imagine there is a
> point of diminishing returns in terms of how large a queue I should
> allow for the outstanding requests limit,
Yes...
> but right now, it's unclear
> to me how to figure out what the optimal setting is for this number
> based on any underlying capacity indicators there may be. It seems
> like a better heuristic might be to be able to specify a maximum
> amount of memory the raidframe driver would be allowed to use, and
> then have it set the outstanding request count accordingly.
I think that is the preferred approach. At least, that is where the
'6' number came from back in the day...
> In the
> case of the machine I refer to above, I have 2 raid sets; the stripe
> size is set to 64 blocks (32K), with 4 stripes per raid set. With one
> of the raid sets running in degraded mode, the maximum amount of
> memory used by the raidframe subsystem is 10.4MB. That's not an
> insignificant amount of memory, but it's certainly not a profligate
> amount. Further thoughts?
10MB is reasonable today, but not so much on a 32MB or 64MB machine :)
I'm not sure what the magic number should be... perhaps 5% of kernel
memory per RAID set, scaled by the size of the RAID set to produce the
number of openings (with the minimum remaining at 6?).
An alternative to self-tuning would be to introduce a sysctl to allow
setting the value on-the-fly...
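Something along these lines, perhaps (a sketch only -- the node names
and rf_maxoutstanding are invented here, and the plumbing to make the
driver actually honor the value isn't shown):

#include <sys/param.h>
#include <sys/sysctl.h>

static int rf_maxoutstanding = 6;       /* current RAIDOUTSTANDING default */

SYSCTL_SETUP(sysctl_raidframe_setup, "raidframe sysctl subtree setup")
{
        const struct sysctlnode *node = NULL;

        /* kern.raidframe (hypothetical parent node) */
        sysctl_createv(clog, 0, NULL, &node,
            CTLFLAG_PERMANENT,
            CTLTYPE_NODE, "raidframe",
            SYSCTL_DESCR("RAIDframe settings"),
            NULL, 0, NULL, 0,
            CTL_KERN, CTL_CREATE, CTL_EOL);

        if (node == NULL)
                return;

        /* kern.raidframe.maxoutstanding, adjustable at run time */
        sysctl_createv(clog, 0, &node, NULL,
            CTLFLAG_PERMANENT | CTLFLAG_READWRITE,
            CTLTYPE_INT, "maxoutstanding",
            SYSCTL_DESCR("Maximum outstanding requests per RAID set"),
            NULL, 0, &rf_maxoutstanding, 0,
            CTL_CREATE, CTL_EOL);
}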
According to my notes, I was attempting to do memory calculations on
this back in 2003/2004, but it doesn't look like I came up with a firm
formula back then either... Those notes say the number of nodes in the
IO graph is bounded by:

  (2 * layoutPtr->numDataCol) + (1 * layoutPtr->numParityCol)
    + (1 * 2 * layoutPtr->numParityCol) + 3
Multiplying that by the stripe width gives a bound on the memory
requirements for the data -- I think it overestimates the requirement
per IO, but that's fine. For a 5-disk RAID 5 set with a stripe width
of 32 (16K/component, 64K of data for the entire stripe), what we end
up with is a memory requirement of:

  (2*4 + 1*1 + 1*2*1 + 3) * 16K = 224K per IO
It's just a matter of scaling the number of openings to match some
reasonable use of kernel memory...
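Roughly like so (a sketch only; the function names are invented, the
per-IO bound is the 224K example worked out above, and the floor of 6
is today's RAIDOUTSTANDING default):

#include <sys/types.h>

/* Upper bound on IO-graph nodes for a single IO, from the notes above. */
static int
rf_node_bound(int ndatacol, int nparitycol)
{
        return 2 * ndatacol + 1 * nparitycol + 1 * 2 * nparitycol + 3;
}

/*
 * openings = budget / (node bound * bytes per stripe unit), floor of 6.
 * For the 5-disk RAID 5 example above (4 data + 1 parity, 16K stripe
 * units) each IO is bounded by 14 * 16K = 224K, so a 4MB budget would
 * allow about 18 openings.
 */
static int
rf_openings_from_budget(size_t budget, int ndatacol, int nparitycol,
    size_t stripe_unit_bytes)
{
        size_t per_io = rf_node_bound(ndatacol, nparitycol) * stripe_unit_bytes;
        size_t openings = budget / per_io;

        return openings < 6 ? 6 : (int)openings;
}

Whether the budget comes from a fixed fraction of kernel memory or a
sysctl is then a separate question.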
Later...
Greg Oster