Port-amd64 archive


Re: raidframe on > 2TB disks, revisited



Two, some part of newfs seems to be broken or my understanding of it
is broken:

# newfs -O2 -b 65536 -f 8192 -F -s 7772092288 /dev/rraid0a
/dev/rraid0a: 3794967.0MB (7772092288 sectors) block size 65536,
fragment size 8192
        using 1160 cylinder groups of 3271.56MB, 52345 blks, 103936 inodes.
wtfs: write error for sector 7772092287: Invalid argument

As I understand it, the basic problem is that disklabels can only
represent 2T.  So you can't use disklabels for big disks at all.

Yes - disklabels represent at most 2^32 sectors.
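
For reference, assuming 512-byte sectors, the numbers work out as:

        2^32 sectors * 512 bytes       = 2199023255552 bytes  = 2 TiB
        7772092288 sectors * 512 bytes = 3979311251456 bytes ~= 3.6 TiB

so the tail of that array simply can't be addressed through a disklabel,
which is presumably why the wtfs write near the end of the filesystem
fails.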


1) Create gpt wedges on both drives for a small ffs filesystem, swap,
and a large RAID

Can raid autoconfig from gpt?   I wonder about two raids, one moderate
for root/swap and whatever else you want, and one very large, each in
gpt.  Use disklabel in the small raid and gpt in the large one.

raidframe can autoconfigure on gpt. My raid device uses:

Components:
            /dev/dk2: optimal
            /dev/dk5: optimal
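
For completeness, a minimal sketch of the sort of raidframe config that
autoconfigures from such wedges (component names, stripe size and RAID
level here are just illustrative):

        START array
        # numRow numCol numSpare
        1 2 0

        START disks
        /dev/dk2
        /dev/dk5

        START layout
        # sectPerSU SUsPerParityUnit SUsPerReconUnit RAID_level
        128 1 1 1

        START queue
        fifo 100

followed by something like raidctl -A yes raid0 (or -A root) to mark the
set for autoconfiguration.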


newfs -F -s (the number of sectors in raid0) -O2 /dev/rraid0d

Do you really need to give -s, when I'd expect newfs to figure out the
size of rraid0d?

/dev/raid0 shows 2^32-1 sectors, whereas the raid device itself is
7772092288 sectors in size.  Making a gpt wedge on the large raid device
is where things fail: you can do it, and it shows up as a dk wedge, but
you can't boot off of that.  You can't compile a kernel with root set to
a dk wedge or with a NAME= specified.

So, instead, we use /dev/raid0d and newfs it while specifying the number
of sectors manually.  Since the kernel is loaded from the non-raid
device, there's no case where this doesn't work.
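
Concretely, for this array that means something along the lines of

        newfs -O2 -b 65536 -f 8192 -F -s 7772092288 /dev/rraid0d

with the sector count taken from the real size of the raid set rather
than from what the disklabel on raid0 reports.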


4) simply boot a kernel in GPT which is in RAIDframe which is in GPT.

This may not be that hard either (and I think it's entirely separate
from autoconf of root in gpt).  But it is probably more involved than
the disklabel method, which I think relies on the inside-raid 'a'
partition being the one to use and starting at 0 in the raid virtual
disk.  One would have to skip the raid header and then recursively read
another gpt, and add back in the offset to the start.
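
Roughly, for filesystem-relative sector N inside the inner wedge, the
boot code would have to compute something like

        absolute sector = start of raid partition in the outer gpt
                        + 64 sectors for the RAIDframe component label
                          (RF_PROTECTED_SECTORS, assuming the usual value)
                        + start of the wedge in the inner gpt
                        + N

whereas the disklabel case only needs the first two of those, since the
'a' partition is assumed to start at 0 inside the raid.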

That's the part I wish I understood better.  Obviously, this would fix
many, many scenarios that are sure to come up, since it's just a matter
of time before disks smaller than 2 TB aren't made any more.


Another approach for users, while not really reasonable to recommend
because of these issues, is to have a RAID pair of moderate-sized SSDs
(e.g. 256GB) for root/swap/var/usr and then a pair of 4T for /home.
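
As a rough sketch (labels and sizes made up), each disk of such a pair
could be set up along these lines:

        # on each SSD (wd0, wd1): a small boot partition plus the system raid
        gpt create wd0
        gpt add -s 262144 -t efi -l boot0 wd0   # type depends on BIOS vs UEFI boot
        gpt add -t raid -l sysraid0 wd0

        # on each 4T disk (wd2, wd3): one big raid partition for /home
        gpt create wd2
        gpt add -t raid -l homeraid0 wd2

with the resulting dk wedges fed to raidframe as above.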




Another observation is that it would be really nice if ZFS were up to
date and worked.

Wouldn't ZFS be a bit overkill for mirroring two disks?

How does ZFS support booting from a ZFS pool?

John



