NetBSD-Users archive


Re: NetBSD Raid5, slow write speeds, using big disks?!



Thanks for the replies,

There were clear errors in there, but that, together with Stephen's response about aligning to 4k, gave about a 4x write speed increase. I did not think it would be necessary to mention that i'm NOT a fan of ZFS; if i wanted that, i would have kept FreeBSD on it.

If someone has done a raid5 on netbsd with big disks, please let me know.
br Nicklas

On 6/21/26 10:58 PM, Aryabhata wrote:
Hello,

To answer your most important question first: Are your expectations wrong? Yes, slightly.

When you tested ZFS on FreeBSD using dd if=/dev/zero, you weren't actually testing your hard drives. You were testing your RAM and CPU.

 * ZFS is a highly intelligent, Copy-on-Write filesystem. By default, it uses compression. When you feed it a stream of zeroes, ZFS compresses that data into virtually nothing and caches it in RAM (the ARC).

 * The 4.03 GB/sec write speed you saw is physically impossible for four Seagate IronWolf NAS drives. Those drives max out at around 210–250 MB/s each. Even in a perfect RAID0 (no parity), the theoretical maximum physical write speed would be around 1 GB/s.
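
The ceiling above is plain arithmetic. A minimal sketch, assuming the 250 MB/s per-drive figure quoted above and that every disk streams at the same rate (the function names are made up for illustration, they are not part of any tool):

```shell
#!/bin/sh
# Upper bound on sequential write throughput, in MB/s.
# Assumes every disk carrying data streams at the same per-drive rate.
# $1 = per-drive MB/s, $2 = number of disks carrying data
max_write_mbs() {
    echo $(( $1 * $2 ))
}

# dd reports bytes/sec; convert to decimal MB/s for comparison.
# $1 = bytes per second as printed by dd
bytes_to_mbs() {
    echo $(( $1 / 1000000 ))
}

echo "RAID0 ceiling over 4 disks: $(max_write_mbs 250 4) MB/s"
echo "ZFS dd result:              $(bytes_to_mbs 4037123661) MB/s"
```

The dd figure comes out roughly four times higher than the best case for the physical disks, which is why it cannot be a disk measurement.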

What should you expect? On a perfectly tuned 4-disk RAID5 array of modern 7TB spinning drives, you should expect sustained sequential write speeds somewhere between 300 MB/s and 500 MB/s.

Your NetBSD speeds are genuinely terrible, and there are two main culprits working together to sabotage your performance: Stripe Misalignment and the Read-Modify-Write (RMW) penalty.

When you configure RAID5, data is striped across the disks in chunks, with one block reserved for parity. If you write a file that is smaller than the stripe size, or misaligned with the stripe, the RAID controller (or RAIDframe in this case) cannot just write the data. It has to:
 1. Read the old data and the old parity from the disks.
 2. Modify the data in memory and recompute the parity.
 3. Write the new data and the new parity back to the disks.

This turns one write operation into four separate I/O operations.
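
The parity update behind those steps can be sketched with shell arithmetic. This is a toy model, assuming one small integer per disk instead of a whole stripe unit; it shows the general RAID5 XOR rule, not RAIDframe's actual code, and the function name is invented:

```shell
#!/bin/sh
# RAID5 small-write parity update:
#   new_parity = old_parity XOR old_data XOR new_data
# $1 = old data, $2 = old parity, $3 = new data
rmw_parity() {
    echo $(( $2 ^ $1 ^ $3 ))
}

# Three data "disks" holding one value each, plus their parity.
d0=165; d1=60; d2=15
parity=$(( d0 ^ d1 ^ d2 ))

# Overwrite d1 only: 2 reads (old d1, old parity) + 2 writes (new d1, new parity).
new_d1=255
new_parity=$(rmw_parity "$d1" "$parity" "$new_d1")

# Cross-check against recomputing parity from the whole stripe.
full=$(( d0 ^ new_d1 ^ d2 ))
echo "rmw=$new_parity full=$full"
```

Both values agree, which is why the array can get away with reading only the old data and old parity rather than the whole stripe.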

In your /etc/raid5.conf, you defined:
sectPerSU 32
RAIDframe counts this in 512-byte sectors:
32 x 512 bytes = 16 KB per disk.
Because you have 4 disks in RAID5, each stripe holds 3 data units and 1 parity unit.
Your total data stripe width is:
3 data disks x 16 KB = 48 KB stripe width
However, when you created your filesystem, you used:
newfs -b 65536 (which dictates a 64 KB block size).

Because your filesystem blocks (64 KB) don't match your RAID stripes (48 KB), every single write you make overlaps multiple stripes. This guarantees a massive Read-Modify-Write penalty for every single block of data. The initial 1.6 GB/s burst you saw was NetBSD writing to your system RAM (buffer cache). The drop to 1.3 MB/s happened when the RAM filled up and the system desperately tried to flush those misaligned blocks to the physical disks.
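
To see the overlap in numbers, here is a rough sketch. It assumes the filesystem starts exactly on a stripe boundary, which is the friendliest case; the helper names are hypothetical:

```shell
#!/bin/sh
SECTOR=512   # RAIDframe counts sectPerSU in 512-byte sectors

# Bytes of data in one full stripe.
# $1 = sectPerSU, $2 = number of data units per stripe (disks - 1 for RAID5)
stripe_bytes() {
    echo $(( $1 * SECTOR * $2 ))
}

# Number of stripes touched by one write.
# $1 = byte offset, $2 = length in bytes, $3 = full-stripe size in bytes
stripes_touched() {
    first=$(( $1 / $3 ))
    last=$(( ($1 + $2 - 1) / $3 ))
    echo $(( last - first + 1 ))
}

stripe=$(stripe_bytes 32 3)
echo "stripe width: $(( stripe / 1024 )) KB"

# The first few 64 KB filesystem blocks.
for blk in 0 1 2; do
    echo "block $blk touches $(stripes_touched $(( blk * 65536 )) 65536 "$stripe") stripes"
done
```

Even in this best case each 64 KB block lands on two 48 KB stripes and fills at most one of them completely, so a partial-stripe update is always involved.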

To fix this, we need to ensure your filesystem block size fits neatly into your RAID stripe units. Since you have 4 disks (3 data), it is mathematically impossible to make the total stripe width a perfect power-of-two (like 64KB) because 3 is not a power-of-two. The best workaround is to make the *individual disk stripe unit* match your filesystem block size.
Change your /etc/raid5.conf to use 128 sectors per SU:

START array
1 4 0

START disks
/dev/dk3
/dev/dk4
/dev/dk5
/dev/dk6

START layout
# sectPerSU SUsPerParityUnit SUsPerReconUnit RAID_level_5
128 1 1 5

START queue
fifo 256

 * Why 128? 128 x 512 bytes = 64 KB.
 * Now, a 64 KB filesystem block from FFS will fit onto a *single* disk's chunk, provided the filesystem itself starts on a stripe-unit boundary. RAIDframe will only need to update parity against that specific chunk, rather than spanning multiple chunks and fragmenting the I/O.
 * I also bumped your queue size to 256 to give the disks a bit more breathing room for optimization.
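
The same arithmetic for the new layout, as a small sketch (the function names are illustrative only, and the 512-byte sector size is the one RAIDframe uses for sectPerSU as noted earlier):

```shell
#!/bin/sh
SECTOR=512

# sectPerSU needed for one stripe unit to hold exactly one filesystem block.
# $1 = filesystem block size in bytes (the value given to newfs -b)
sect_per_su() {
    echo $(( $1 / SECTOR ))
}

# Full-stripe data width in KB.
# $1 = sectPerSU, $2 = number of data units per stripe
stripe_kb() {
    echo $(( $1 * SECTOR * $2 / 1024 ))
}

su=$(sect_per_su 65536)
echo "sectPerSU=$su full stripe=$(stripe_kb "$su" 3) KB"
```

Note that the full stripe is now 192 KB, so a single 64 KB block is still a partial-stripe write; the gain is that it touches one data disk plus parity instead of straddling several units.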

You enabled WAPBL using tunefs -l 1g. WAPBL is great for crash consistency, but it works by writing small, synchronous metadata updates to the disk. On RAID5, small synchronous writes trigger the Read-Modify-Write penalty constantly.

Once you rebuild the RAID with sectPerSU 128, test the write speeds before enabling the WAPBL log. You'll likely see a massive improvement. If enabling the log slows things down again, consider placing the WAPBL log on your SSD instead of the RAID array.


You aren't crazy, and NetBSD isn't inherently 200x slower. ZFS was just tricking you with RAM speeds, and your NetBSD array was mathematically fighting against itself due to a stripe-size mismatch. And yes, the raidctl man page absolutely needs a modern "large disk" section!

Regards,

Arya

On Sat, 20 Jun, 2026, 10:26 pm smurfd, <smurf.daemon%mail.smurfd.me@localhost> wrote:

    Hey,

    This will be a long one, sorry for this.
    I have created a RAID5 NAS using NetBSD 10.1, and i'm experiencing slow
    write speeds. Or are my expectations all wrong? :)
    I earlier had FreeBSD and ZFS raidz on the same Ugreen DXP4800 Plus
    (https://nas-eu.ugreen.com/products/ugreen-nasync-dxp4800-plus-nas-storage).
    The NAS drives are 4x7TB Seagate IronWolf NAS 3.5", less than a year old.
    NetBSD is installed first on the SSD, using a base install. Then the
    RAID is created manually...

    Have asked here:
    https://www.unitedbsd.com/d/1646-raid5-slow-write-on-big-disks/8
    Just now redid the raid, for the Xth time, to compare against the speeds
    i got using FreeBSD (because i didn't remember them or have them written
    down).
    I know, comparing raidz vs raid5 is like comparing apples to oranges,
    right?...

    What write speeds should i expect?! is probably the most important
    question...
    Or where did i go wrong, because user error is VERY likely.

    First, i tried to follow the Summary section in
    https://man.netbsd.org/raidctl.8, realizing that disklabel supports
    disks no larger than 2TB.
    So i found this:
    https://wiki.netbsd.org/users/mlelstv/using-large-disks/
    and tried to follow the RAIDframe section.
    Then i got the correct size i expected...

    The speed though, is slow! (again compared to ZFS raidz)
    When i had FreeBSD installed, i did the following. Installed the system
    to the SSD.
    Then did this, where **** means repeated 4 times, once per disk...
    # gpart destroy -F ada0 ****
    # gpart create -s gpt ada0 ****
    # gpart add -t freebsd-zfs ada0 ****
    # zpool create -f island raidz /dev/ada0p1 /dev/ada1p1 /dev/ada2p1
    /dev/ada3p1 (this was instant)
    island/scumm           21T    140K     21T     0% /zstorage/scumm

    This is the speed...
    # dd if=/dev/zero bs=1024k count=1000 of=/zstorage/scumm/test.txt
    1048576000 bytes transferred in 0.259733 secs (4037123661 bytes/sec)
    # dd if=/dev/zero bs=4096k count=1000 of=/zstorage/scumm/test.txt
    4194304000 bytes transferred in 1.534966 secs (2732505282 bytes/sec)
    # dd if=/dev/zero bs=1024 count=1000 of=/zstorage/scumm/test.txt
    1024000 bytes transferred in 0.004667 secs (219403356 bytes/sec)

    Then on NetBSD, installed on the SSD i did this:

    # gpt destroy wd0 ****
    # gpt create -Af wd0 ****
    # gpt add -t raid -l raid5@wd0 -b $(( 2048 )) -s 15628051053 wd0 ****

    # raidctl -C /etc/raid5.conf raid5
    # raidctl -I 13371337 raid5
    # raidctl -iv raid5 (This took around 72+ hours)
    # gpt create -Af raid5
    # gpt add -a 1024 -t ffs -l island raid5
    /dev/rraid5: Partition 1 added: 49f48d5a-b10e-11dc-b99b-0019d1879648 34 46884152860
    # dkctl raid5 addwedge island 34 46884152860 ffs
    # newfs -O2 -b 65536 -s -1g /dev/dk7 (trying to enable
    https://man.netbsd.org/wapbl.4)
    # tunefs -l 1g /dev/dk7
    # mount -o log /dev/dk7 /mnt
    # umount /mnt
    # mount /dev/dk7 /mnt/island/
    /dev/dk7        22T   8.0K    21T   0% /mnt/island

    from /etc/fstab
    /dev/dk7        /mnt/island     ffs     rw,noatime,log   1 1

    This one, was done before initializing the raid, on one disk
    nas1# dd if=/dev/zero bs=1024k count=1000 of=/mnt/test.txt
    1048576000 bytes transferred in 2.201 secs (476408905 bytes/sec)

    Writing to the raid:
    nas1# dd if=/dev/zero bs=1024k count=1000 of=/mnt/island/test.txt
    1048576000 bytes transferred in 68.984 secs (15200278 bytes/sec)
    nas1# dd if=/dev/zero bs=4096k count=1000 of=/mnt/island/test.txt
    4194304000 bytes transferred in 478.951 secs (8757271 bytes/sec)
    nas1# dd if=/dev/zero bs=1024 count=1000 of=/mnt/island/test.txt
    1024000 bytes transferred in 0.054 secs (18962962 bytes/sec)

    If i tried bs=512k and count=1000, i get around 1.6 billion bytes per sec
    write, woohoo...
    Then try the same with count=2000 and get 1.3 million bytes per sec write
    hmmm ....

    This is /etc/raid5.conf
    START array
    # numRow numCol numSpare
    1 4 0

    START disks
    /dev/dk3
    /dev/dk4
    /dev/dk5
    /dev/dk6

    START layout
    # sectPerSU SUsPerParityUnit SUsPerReconUnit RAID_level_5
    32 1 1 5

    START queue
    fifo 100


    Summary:
    It seems like this is 200 times slower than FreeBSD raidz, is that
    reasonable? or where did i go wrong?!
    Again, only curious. Was thinking about installing OpenBSD, they have a
    more similar raid system compared to netbsd (more or less exactly the
    same :))

    Could it be worth adding a Big disk section to the raidctl manpage:
    https://man.netbsd.org/raidctl.8

    Thanks in advance!
    Br Nicklas



