NetBSD-Users archive


Re: Prepping to install



On 13 May 2015 at 16:03, William A. Mahaffey III <wam%hiwaay.net@localhost> wrote:
> On 05/13/15 08:48, David Brownlee wrote:
>>
>> On 12 May 2015 at 16:01, William A. Mahaffey III <wam%hiwaay.net@localhost> wrote:
>>>
>>> On 05/12/15 02:32, David Brownlee wrote:
>>>>
>>>> On 11 May 2015 at 23:46, William A. Mahaffey III <wam%hiwaay.net@localhost> wrote:
>>>>
>>>> If you are using RAID5 I would strongly recommend keeping to
>>>> "power-of-two + 1" components, to keep the stripe size as a nice power
>>>> of two, otherwise performance is... significantly impaired.
>>>
>>> Hmmmm .... Could you amplify on that point a bit ? I am intending to
>>> maximize available storage & have already procured the mbd & 6 drives,
>>> but I
>>> could rethink things if my possibly hasty choices would be too burdensome
>>
>> For RAID5 to perform efficiently, data should be written in units which
>> are aligned with the RAID stripes and are a multiple of the stripe size;
>> otherwise a simple write turns into a read of the stripe, modification
>> of the affected part, and then a write back.
>>
>> Filesystems tend to have sectors and blocks which are powers of two,
>> so the easiest way to arrange this for ffs is for the filesystem block
>> size to be a multiple of the stripe size ("1" is a fine multiple in
>> this case).
>>
>> This is similar to the issue with drives which have 4K physical sectors
>> but present them as 512-byte sectors - if a filesystem is not 4K aligned
>> then write performance suffers horribly.
>
> Hmmmmm .... OK, I think I have it. The stripe size is N * <some underlying
> (disk|RAID) block size>, & if N (the # of active drives) is odd or prime (or
> both, as in my case), we would have/need bizarre filesystem block sizes (for
> alignment w/ RAID stripes) or unaligned FS blocks/sectors, which give crappy
> performance, right ? Could you estimate how crappy crappy really is ? 25%
> slower ? 50%, 100%, more ? Me scots sensibilities hate having almost 1 TiB
> of drive sitting around idle (although I do crave speed enough to override)
> :-/ ....

If you manage 25% of the performance (that is "only" a 75% hit) I would
be surprised. I'd also be curious to see what numbers you do get :) -
I'm quite fond of pkgsrc/benchmarks/bonnie++ for getting simple,
comparable numbers. If you are testing, some things to vary:
- Number of drives (5, 6)
- Stripe size, eg 4K per drive or 8K per drive
- Filesystem block size 32K, 64K (you may not be able to use 64K for boot
partitions)
- Mounting with '-o log' or not (generally you want this :)
Remember to ensure you have good (at least 4K) alignment on the base
partitions. If you have a modern '4K under the covers' drive and start
at sector 63... it's not a good place to be.
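To make the stripe and alignment arithmetic concrete, here is a small
sketch (the component counts, stripe-unit size and block size are purely
illustrative; sectPerSU is the raidctl(8) layout value for sectors per
stripe unit):

```shell
# Illustrative RAID5 stripe arithmetic - example values, not a recommendation
sector=512          # bytes per sector
sect_per_su=16      # sectors per stripe unit (raidctl layout value)
fs_block=32768      # candidate ffs block size: 32K

for ncomp in 5 6; do
    # one component's worth of each stripe holds parity, so the
    # data stripe is (ncomp - 1) stripe units
    stripe=$(( (ncomp - 1) * sect_per_su * sector ))
    if [ $(( fs_block % stripe )) -eq 0 ]; then
        status="aligned"
    else
        status="misaligned (read-modify-write on most writes)"
    fi
    echo "$ncomp components: data stripe ${stripe}B, 32K fs block $status"
done

# Partition alignment for '4K under the covers' drives:
# a start sector must be a multiple of 8 (4096 / 512) to be 4K-aligned.
for start in 63 2048; do
    if [ $(( start % 8 )) -eq 0 ]; then
        echo "start sector $start: 4K-aligned"
    else
        echo "start sector $start: NOT 4K-aligned"
    fi
done
```

With five components the 4 * 16 * 512 = 32K data stripe matches a 32K ffs
block exactly; with six, the 40K data stripe can never line up with any
power-of-two block size, which is the "power-of-two + 1" rule above.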

>>>> If you want to maximise space with some redundancy then as you say,
>>>> RAID5 is the way to go for the bulk of the storage.
>>>>
>>>> A while back I setup a machine with 5 * 2TB disks with netbsd-6, with
>>>> small RAID1 partitions for root and the bulk as RAID5
>>>>
>>>> http://abs0d.blogspot.co.uk/2011/08/setting-up-8tb-netbsd-file-server.html
>>>> (wow, was that really four years ago) - in your position I might keep
>>>> one 1TB as a scratch/build space and then RAID up the rest.
>>>>
>>>> If you have time definitely experiment, get a feel for the different
>>>> performance available from the different options.
>>>
>>> *Wow*, another fabulous resource. Your blog documents almost verbatim
>>> what I
>>> have in mind. I am going w/ 6 drives (already procured, 6 SATA3 slots on
>>> the
>>> mbd, done deal), but philosophically very close to what you describe. 1
>>> question: if you were doing this again today, would it be fdisk or GPT ?
>>
>> If I had >2TB drives it would be GPT :) If not, I would still stick
>> with fdisk. The complexity of gpt setup and wedge autoconfiguration is
>> still greater than fdisk and disklabel. I know I'm going to have to
>> move to it at some point, but I'm going to hold off until I need to.
>>
>>> I think I am looking at 4 partitions per drive, ~16 GB for / (RAID1, 2
>>> drives)
>>> & /usr (4 drives, RAID10), 16 GB for swap (kernel driver, all 6 drives),
>>> 16
>>> - 32 GB for /var (RAID5, all 6 drives), & the rest for /home (RAID5, all
>>> 6
>>> drives). TIA & thanks again.
>>
>> I would definitely hold off on RAID5 for everything except the large
>> /home. RAID1 is much simpler and gives better write performance. I would
>> also try to avoid configuring multiple RAID5s across overlapping sets
>> of disks: while that theoretically provides more IO bandwidth, the
>> bandwidth will have to compete with all the other filesystems and
>> swap usage on the system.
>>
>> If you wanted to use all six disks:
>> - 32G(RAID1 root+usr) 910G(non raid scratch space)
>> - 32G(RAID1 root+usr) 910G(RAID5 home)
>> - 32G(RAID1 var) 910G(RAID5 home)
>> - 32G(RAID1 var) 910G(RAID5 home)
>> - 32G(RAID1 swap+spare) 910G(RAID5 home)
>> - 32G(RAID1 swap+spare) 910G(RAID5 home)
>>
>> 32GB space notes:
>> - This gives you three 32GB RAID1 'pools' to allocate everything
>> outside of /home
>> - Can adjust the 32G up or down before partitioning, but all should be the
>> same
>> - In the suggestion, root+usr are kept on the same RAID (and could be
>> a single partition), so that the system can have all of the userland
>> available with only one disk attached, and a 'spare' partition is left
>> in case of later moderate additional space needs - maybe an extra
>> partition for /usr/pkg?, or for /var/pgsql, etc
>> - Obviously allocate usage within pools to taste - could put /usr on a
>> separate raid to provide more IO bandwidth for root & usr
>
> This is interesting. I kinda wanted swap spread out over all 6 drives for
> better swap I/O performance, an issue I am having with another box which is
> laid out sorta like this, with swap 'on top of' a RAID0 block (admittedly
> under Linux, not *BSD, but still), swap performance is horrible, several
> min. to page in 200-300 MB worth of paged out VM. I was planning on as much
> parallelization of each RAID as possible for max performance, & swap handled
> by the kernel driver. Others have suggested swap on a RAID 'partition'; is
> that more de rigueur for NetBSD, or the other BSDs for that matter ? This
> box, under FreeBSD 9.3R-p13, has 4 swap partitions under straight kernel
> management, & seems very spry, although it also has a lot of RAM & doesn't
> swap much (on purpose, BTW) ....

Separate swap devices will give the best performance; RAID1 or 5 will
give robustness in the face of a single component failure. You pays
your money... Of course, if you have dedicated partitions on the disks
which you could RAID, then you can even change your mind after install:
swapctl off the swap, mess with the partitions and away you go (nerves
of steel advised, though not required :)
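For the "swap on RAID" option, a RAIDframe config for a small RAID1 pair
might look something like this - the raid2 unit number, device names and
partition letters are made up for illustration, so check raidctl(8) and
your own disklabels before trusting any of it:

```
# Hypothetical /etc/raid2.conf - RAID1 pair for swap (names illustrative)
START array
# numRow numCol numSpare
1 2 0

START disks
/dev/wd4b
/dev/wd5b

START layout
# sectPerSU SUsPerParityUnit SUsPerReconUnit RAID_level
128 1 1 1

START queue
fifo 100
```

Roughly: `raidctl -C /etc/raid2.conf raid2`, `raidctl -I <serial> raid2`,
disklabel the raid device, then `swapctl -a /dev/raid2b` - sketched from
memory, so verify against the man pages before running anything.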

>> 910GB space notes:
>> - This gives 5* 910GB RAID5, which provides 4*910G (or 3640G) of space
>> - One disk is not included in the RAID5. This could be saved as a
>> spare for a RAID5 component failure (though a better approach might be
>> to have a disk on the desk next to the machine :), or used as non
>> raided scratch space. If it will not be active, then probably best to
>> put it on one of the components for the heaviest used 32G, or the most
>> important 32G
>
> Your assessments are quite persuasive, I think I now like the 5 drive RAID5
> for home, with that last partition as nebulous scratch space.
>
>>
>> Note in the above that IO to /home will hit (almost) all disks, and
>> will affect all of the 32GB pools, so if you have heavy IO to /home do
>> not expect blistering performance from any filesystem. On the other
>> hand when /home has very light IO then you should have relatively nice
>> multi spindle performance from the other filesystems.
>
> Yeah, but it would speed up access to *just* /home, right ? This box will be
> backing up other boxen on my LAN, initially to a directory under /home, so I
> want that I/O to be as swift as possible. I am maxing out the RAM (also
> already procured), so I hope I don't have too much contention between I/O to
> /home & swap ....

If you want home to be as fast as possible, then you really want to
prefer RAID1 to RAID5 (which conflicts with the space... I know). I run
dirvish overnight from some machines to a RAID5 setup pretty much
identical to the one in my post, and it works well enough.
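For reference, a five-component RAID5 for /home with a power-of-two data
stripe might be configured along these lines (hypothetical device names
and partition letters; a sectPerSU of 32 gives 4 data components * 32
sectors * 512 bytes = a 64K data stripe, matching a 64K ffs block):

```
# Hypothetical /etc/raid1.conf - five-component RAID5 for /home
START array
# numRow numCol numSpare
1 5 0

START disks
/dev/wd0e
/dev/wd1e
/dev/wd2e
/dev/wd3e
/dev/wd4e

START layout
# sectPerSU SUsPerParityUnit SUsPerReconUnit RAID_level
32 1 1 5

START queue
fifo 100
```

Again a sketch, not a recipe - adjust sectPerSU and the newfs block size
together so the filesystem block is a multiple of the data stripe.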

>> Having said all that, if I had the time to play I would install onto a
>> USB key, then script up the building and partitioning of the system in
>> many different forms and then chroot into the result and run some
>> tests to see how it performs.
>
> I *definitely* want to script the partitioning both for repeatability in the
> event of drive failure (or setting up another box) & to avoid fat-fingered
> screw-ups !!!!
>
> Thanks again for a fabulously informative reply.

Having discussed all this RAID5 goodness I feel obliged to comment
that when I finally run out of space and need a new build I'm probably
going to go for 4TB disks with six of them in RAID1 pairs for 12TB
with flexibility for adding more space (in 4TB units). If I *needed*
to get 16TB out of them I'd go the RAID5 route again, but I'm willing
to trade off the extra space for speed and simplicity. Of course I'm
really holding off having to actually *buy* six 4TB disks for as long
as humanly possible (by which point they may be 6TB disks, but who can
tell :)

