
Lockups and other trouble using large disks, raid(4), cgd(4), dk(4)



Hi there,

I'm trying to build myself a nice little storage server for home use,
running netbsd-5 (NetBSD 5.0_BETA). It will eventually replace an
existing server with a very similar setup (different hardware, 5x320GB
disks, RAID5 on wd[0-4], cgd on raid0, NetBSD 4.0) which is working
perfectly at the moment.

Hardware consists of a Tyan Toledo i3100 S5207G2N, an Intel T7400 mobile
CPU, 4G ECC RAM and 5x1.5TB ST31500341AS Seagate disks.

The system seems to be rock solid so far. Memtest86+ ran for a while, no
problems.

The task looks simple: RAID together five disks and protect the data
against inspection while the disks are powered off.

After evaluating my options a little, I first came to the conclusion
that I could disklabel the individual disks, but I would have to use GPT
for the resulting raid(4) device, since that comes out at around 5.7TB,
well beyond what a disklabel can describe. I decided to go GPT all the
way instead and do away with disklabels altogether. Booting is another
matter and not part of the picture right now.

So here's what I did (a rough sketch of the commands follows the list):

1. create a GPT on each disk with 1 partition of type raidframe (=dk[0-4])
2. create a RAID5 raid0 using dk[0-4] as components
3. create a GPT on raid0 with 1 partition of type cgd (=dk5)
4. create a cgd device cgd0 on dk5

   Here's the first problem. cgd(4) can't use dk(4) devices as a parent.
   There is PR kern/38735 describing the problem and offering a patch.
   The patch seems to work OK, so I went on.

5. create a GPT on cgd0 with several partitions of type ffs.

   Next problem: gpt(8) refuses to operate on a cgd(4) device. IIRC the
   error was "No such process". I can't check the exact message right
   now; I would have to wire the new system up again, and I'm lacking a
   power supply at the moment.
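
For reference, this is roughly what I ran for the steps above. I'm typing
it from memory, so treat the exact arguments (partition type aliases, key
length, serial number, parameter file paths) as an approximation of my
setup rather than a verbatim transcript:

  # Step 1: GPT with a single raidframe partition on each disk
  # (repeat for wd1..wd4; the wedges then show up as dk0..dk4)
  gpt create wd0
  gpt add -t raid wd0
  dkctl wd0 makewedges          # in case the wedges don't appear on their own

  # Step 2: RAID5 over the five wedges; /etc/raid0.conf looks like this
  # (sectPerSU 32 = the 32-sector stripe size mentioned below):
  #
  #   START array
  #   1 5 0
  #
  #   START disks
  #   /dev/dk0
  #   /dev/dk1
  #   /dev/dk2
  #   /dev/dk3
  #   /dev/dk4
  #
  #   START layout
  #   # sectPerSU SUsPerParityUnit SUsPerReconUnit RAID_level
  #   32 1 1 5
  #
  #   START queue
  #   fifo 100
  raidctl -C /etc/raid0.conf raid0
  raidctl -I 2008112301 raid0   # arbitrary serial number
  raidctl -iv raid0             # initialize parity

  # Step 3: GPT on raid0 with one cgd-typed partition (= dk5)
  gpt create raid0
  gpt add -t cgd raid0

  # Step 4: cgd0 on dk5 (needs the patch from PR kern/38735)
  cgdconfig -g -V none -o /etc/cgd/dk5 aes-cbc 256
  cgdconfig cgd0 /dev/dk5       # parameter file defaults to /etc/cgd/dk5

  # Step 5: GPT on cgd0 -- this is where gpt(8) gives up
  gpt create cgd0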

OK, that went nowhere. I decided to try without cgd. I created an FFS
partition covering the whole raid0 using newfs -O 2 -b 65536 /dev/dk5
after changing the partition type of dk5 from cgd to ffs. The 64K
block size is necessary to match the RAID5's stripe size of 32 sectors
per stripe unit (32 x 512 bytes x 4 data components = 64KB per full
stripe); other values lead to abysmal write performance.
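
The type change itself was done by removing and re-adding the GPT entry,
roughly like this (-i 1 assumes dk5 is the only entry in raid0's GPT):

  gpt remove -i 1 raid0         # drop the cgd-typed entry
  gpt add -t ffs raid0          # same space, now typed ffs; should come back as dk5
  newfs -O 2 -b 65536 /dev/dk5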

Mounting dk5 gives me a nice litt^Wbig partition of over 5TB. Extracting
a .tar containing a NetBSD source tree leads to a not-so-nice lockup
fairly quickly. The system just hangs: the DDB hotkey doesn't work, the
capslock/numlock LEDs don't toggle, etc.[1]

OK, so I thought maybe there are problems with partitions of that size.
However, creating a partition of exactly 1TB results in the same lockup.
This is strange, since I'm using a 1.1TB cgd partition on a RAID5 in my
existing server, and this new partition isn't even using cgd yet.
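
The 1TB test was the same thing with an explicit size, roughly (sector
count assumes 512-byte sectors and 1 TiB):

  gpt remove -i 1 raid0
  gpt add -s 2147483648 -t ffs raid0    # 2^31 sectors * 512 bytes = 1 TiB
  newfs -O 2 -b 65536 /dev/dk5
  mount /dev/dk5 /mnt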

To narrow it down, I tried using one disk in the most ordinary way:
disklabel it, create one 1.3TB partition and start using it. As expected,
there's no problem; I can extract the source tree and start building
kernels and releases.
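
Nothing fancy there, just the classic route, roughly (using wd0 as an
example; the tarball name is a placeholder):

  disklabel -i -I wd0           # interactively add one ~1.3TB ffs partition, say wd0e
  newfs -O 2 /dev/rwd0e
  mount /dev/wd0e /mnt
  tar xf src.tar -C /mnt        # the source tree tarball mentioned above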

Next day, new tests. Now I tried to set up the whole system somewhat
differently: use cgd[0-4] on the individual disks and create a raid0
using cgd[0-4] as components (rough commands follow the list). [2]

1a. use wd[0-4]d directly for cgd[0-4]

    Doesn't work. I don't remember the error. Instead:

1b. manually create dk[0-4], each covering a whole disk, and create
    cgd[0-4] on dk[0-4]. It works. Write performance is ~40MB/s. Looks OK.
2.  create raid0 using cgd[0-4] as components. Works OK. Raw write
    performance is a little over 40MB/s. As expected, there's no penalty
    for doing the en-/decryption on five individual components as opposed
    to once on raid0.
3.  create a GPT on raid0 and 1 partition covering the whole RAID (=dk5).
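
Again roughly, from memory (the wedge name, sector count and serial number
are placeholders, and raid0.conf is the same as in the first attempt except
for the component names):

  # 1b: one wedge per disk covering the whole disk, cgd on top of it
  # (repeat for wd1..wd4 / dk1..dk4 / cgd1..cgd4)
  dkctl wd0 addwedge crypt0 0 2930277168 cgd    # full-disk sector count, check disklabel
  cgdconfig -g -V none -o /etc/cgd/dk0 aes-cbc 256
  cgdconfig cgd0 /dev/dk0       # parameter file defaults to /etc/cgd/dk0

  # 2: RAID5 over the cgd devices; /etc/raid0.conf as before, but with
  #    /dev/cgd0d .. /dev/cgd4d in the START disks section
  raidctl -C /etc/raid0.conf raid0
  raidctl -I 2008112401 raid0
  raidctl -iv raid0

  # 3: GPT on raid0 with one ffs partition covering the whole RAID (= dk5)
  gpt create raid0
  gpt add -t ffs raid0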

Writing to dk5 using dd if=/dev/zero of=/dev/rdk5 bs=65536 instantly
locks up the system. Reducing partition size to 1TB doesn't help.

So I'm asking for help on the matter. There is clearly something wrong
when using this particular mix of dk(4), raid(4) and cgd(4).

Since there's no pressure to get the system in place, I'm willing to
test any patches and help with debugging. After all, I want to be sure I
can trust this setup with my data.

Thanks.

Best regards.


[1] Seagate has acknowledged a problem with a particular firmware on
    these drives:
    http://forums.seagate.com/stx/board/message?board.id=ata_drives&thread.id=2879
    However, I don't think this is related to the lockups I'm experiencing.
[2] I believe using this order of cgd(4) and raid(4) is even better, since
    it leaves nothing but random data on the individual disks. cgd-on-raid
    has at least the raidframe headers on the components, possibly a GPT
    as well. Also, trying to get cleartext from a raid-on-cgd would require
    four times the effort of a cgd-on-raid, correct?


-- 
Of course it runs NetBSD.

