tech-kern archive


[patch] - add support for >2TB raid devices



hi folks.


i recently tried running raidframe on a pair of 3TB disks.  when
attached via USB, they reported 4K sectors and needed significantly
more work to get going, but when attached directly to the sata
ports, they come up supporting 512 byte sector accesses.
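
for anyone wondering why the sector size matters: buffer block numbers
(b_blkno) are counted in DEV_BSIZE units, while raidframe works in the
component's native sectors, so the two need converting whenever the
native sector is not 512 bytes.  a rough userland sketch of that
conversion follows; the sketch_* names and SKETCH_DEV_BSHIFT are made up
for illustration (assumed equal to the kernel's DEV_BSHIFT of 9), and
this is not the code from the patch itself.

```c
#include <assert.h>
#include <stdint.h>

/* assumption: log2(DEV_BSIZE) == 9, i.e. 512 byte buffer blocks */
#define SKETCH_DEV_BSHIFT 9

/*
 * convert a block number in DEV_BSIZE units to a native sector number.
 * note: the left shift can overflow for block numbers near 2^55; a real
 * implementation would want to guard against that.
 */
static uint64_t
sketch_blk_to_sector(uint64_t blkno, unsigned log_bytes_per_sector)
{
	assert(log_bytes_per_sector >= SKETCH_DEV_BSHIFT);
	return (blkno << SKETCH_DEV_BSHIFT) >> log_bytes_per_sector;
}

/* convert a native sector number back to DEV_BSIZE units. */
static uint64_t
sketch_sector_to_blk(uint64_t sector, unsigned log_bytes_per_sector)
{
	assert(log_bytes_per_sector >= SKETCH_DEV_BSHIFT);
	return (sector << log_bytes_per_sector) >> SKETCH_DEV_BSHIFT;
}
```

with 512 byte sectors (log2 = 9) both functions are the identity; with
4K sectors (log2 = 12) block 8 maps to sector 1 and back again.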

with a lot of help from mlelstv@ i've managed to get everything
seemingly working, at least for raid1.

this is what i'm running with now.  summary of changes:

        - add call to disk_blocksize() [from mlelstv]

        - add the sector size offset properly in a couple of places
        (this one isn't necessary for me now that i've put the disks
        internally and they report 512 byte sectors.) [from mlelstv]

        - use getdisksize() instead of home grown and wrong code
        [from mlelstv]

        - bump the component label version, and add two new values
        to store the high part of the partitionSize / numBlocks.
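
to illustrate the last item: a 32-bit sector count tops out at 2^32
sectors, which is 2TB with 512 byte sectors, so the label needs
somewhere to keep the upper bits.  below is a minimal userland sketch
of the split and the version-gated recombine.  struct sketch_label and
the sketch_* functions are hypothetical stand-ins, not the real
RF_ComponentLabel_t or raidframe routines, and the field types are
simplified to uint32_t.

```c
#include <assert.h>
#include <stdint.h>

/* mirrors the bumped RF_COMPONENT_LABEL_VERSION value from the patch */
#define SKETCH_LABEL_VERSION 3

/* hypothetical, cut-down stand-in for the component label */
struct sketch_label {
	int      version;
	uint32_t numBlocks;	/* low 32 bits, the pre-existing field */
	uint32_t numBlocksHi;	/* high 32 bits, carved from the spare area */
};

/* rebuild the 64-bit count; older label versions have no high word. */
static uint64_t
sketch_label_load(const struct sketch_label *l)
{
	uint64_t nblocks = l->numBlocks;

	if (l->version == SKETCH_LABEL_VERSION)
		nblocks |= (uint64_t)l->numBlocksHi << 32;
	return nblocks;
}

/* write a 64-bit count into a current-version label. */
static void
sketch_label_store(struct sketch_label *l, uint64_t nblocks)
{
	l->version = SKETCH_LABEL_VERSION;
	l->numBlocks = (uint32_t)nblocks;
	l->numBlocksHi = (uint32_t)(nblocks >> 32);

	/* sanity check: the split must round-trip */
	assert(sketch_label_load(l) == nblocks);
}
```

the version check on the load side is what keeps older labels readable:
their spare area is not trusted to hold a meaningful high word, so only
the low 32 bits are used for them.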

with this patch, and as long as i create the raid afterwards, i'm
able to use my 3TB raid mirror happily.


i'm going to commit this later this week after i'm done testing.
it has been reviewed by a couple of folks including oster@
(raidframe maintainer), so this is really just a heads up.

i have a slightly different version of this in testing for netbsd-5
as well (netbsd-5 has no getdisksize(9)) and will send pullups for
that when i'm more confident.  you can find that patch at:

        http://www.netbsd.org/~mrg/rf-2tb-netbsd-5.diff

for anyone extremely keen.


.mrg.


Index: raidframevar.h
===================================================================
RCS file: /cvsroot/src/sys/dev/raidframe/raidframevar.h,v
retrieving revision 1.13
diff -p -r1.13 raidframevar.h
*** raidframevar.h      17 Nov 2009 18:54:26 -0000      1.13
--- raidframevar.h      18 Oct 2010 21:51:42 -0000
*************** typedef struct RF_ComponentLabel_s {
*** 471,477 ****
                                 done first, (and would become raid0).
                                 This may be in conflict with last_unit!!?! */
                              /* Not currently used. */
!       int future_use2[44];  /* More future expansion */
  } RF_ComponentLabel_t;
  
  typedef struct RF_SingleComponent_s {
--- 471,479 ----
                                 done first, (and would become raid0).
                                 This may be in conflict with last_unit!!?! */
                              /* Not currently used. */
!       u_int numBlocksHi;    /* The top 32-bits of the numBlocks member. */
!       u_int partitionSizeHi;/* The top 32-bits of the partitionSize member. */
!       int future_use2[42];  /* More future expansion */
  } RF_ComponentLabel_t;
  
  typedef struct RF_SingleComponent_s {
Index: rf_copyback.c
===================================================================
RCS file: /cvsroot/src/sys/dev/raidframe/rf_copyback.c,v
retrieving revision 1.42
diff -p -r1.42 rf_copyback.c
*** rf_copyback.c       17 Nov 2009 18:54:26 -0000      1.42
--- rf_copyback.c       18 Oct 2010 21:51:42 -0000
*************** rf_CopybackReconstructedData(RF_Raid_t *
*** 213,218 ****
--- 213,221 ----
        c_label->row = 0;
        c_label->column = fcol;
        c_label->partitionSize = raidPtr->Disks[fcol].partitionSize;
+       if (c_label->version == RF_COMPONENT_LABEL_VERSION)
+               c_label->partitionSizeHi =
+                   raidPtr->Disks[fcol].partitionSize >> 32;
  
        raidflush_component_label(raidPtr, fcol);
  
Index: rf_disks.c
===================================================================
RCS file: /cvsroot/src/sys/dev/raidframe/rf_disks.c,v
retrieving revision 1.73
diff -p -r1.73 rf_disks.c
*** rf_disks.c  1 Mar 2010 21:10:26 -0000       1.73
--- rf_disks.c  18 Oct 2010 21:51:42 -0000
*************** rf_AutoConfigureDisks(RF_Raid_t *raidPtr
*** 455,460 ****
--- 455,463 ----
                        /* Found it.  Configure it.. */
                        diskPtr->blockSize = ac->clabel->blockSize;
                        diskPtr->numBlocks = ac->clabel->numBlocks;
+                       if (ac->clabel->version == RF_COMPONENT_LABEL_VERSION)
+                               diskPtr->numBlocks |=
+                                   (uint64_t)ac->clabel->numBlocksHi << 32;
                        /* Note: rf_protectedSectors is already
                           factored into numBlocks here */
                        raidPtr->raid_cinfo[c].ci_vp = ac->vp;
Index: rf_netbsdkintf.c
===================================================================
RCS file: /cvsroot/src/sys/dev/raidframe/rf_netbsdkintf.c,v
retrieving revision 1.274
diff -p -r1.274 rf_netbsdkintf.c
*** rf_netbsdkintf.c    8 Aug 2010 18:25:14 -0000       1.274
--- rf_netbsdkintf.c    18 Oct 2010 21:51:42 -0000
*************** raidinit(RF_Raid_t *raidPtr)
*** 1936,1941 ****
--- 1936,1942 ----
  
        disk_init(&rs->sc_dkdev, rs->sc_xname, &rf_dkdriver);
        disk_attach(&rs->sc_dkdev);
+       disk_blocksize(&rs->sc_dkdev, raidPtr->bytesPerSector);
  
        /* XXX There may be a weird interaction here between this, and
         * protectedSectors, as used in RAIDframe.  */
*************** raidstart(RF_Raid_t *raidPtr)
*** 2031,2037 ****
                 * partition.. Need to make it absolute to the underlying
                 * device.. */
  
!               blocknum = bp->b_blkno;
                if (DISKPART(bp->b_dev) != RAW_PART) {
                        pp = &rs->sc_dkdev.dk_label->d_partitions[DISKPART(bp->b_dev)];
                        blocknum += pp->p_offset;
--- 2032,2038 ----
                 * partition.. Need to make it absolute to the underlying
                 * device.. */
  
!               blocknum = bp->b_blkno << DEV_BSHIFT >> raidPtr->logBytesPerSector;
                if (DISKPART(bp->b_dev) != RAW_PART) {
                        pp = &rs->sc_dkdev.dk_label->d_partitions[DISKPART(bp->b_dev)];
                        blocknum += pp->p_offset;
*************** InitBP(struct buf *bp, struct vnode *b_v
*** 2283,2289 ****
        bp->b_error = 0;
        bp->b_dev = dev;
        bp->b_data = bf;
!       bp->b_blkno = startSect;
        bp->b_resid = bp->b_bcount;     /* XXX is this right!??!?!! */
        if (bp->b_bcount == 0) {
                panic("bp->b_bcount is zero in InitBP!!");
--- 2284,2290 ----
        bp->b_error = 0;
        bp->b_dev = dev;
        bp->b_data = bf;
!       bp->b_blkno = startSect << logBytesPerSector >> DEV_BSHIFT;
        bp->b_resid = bp->b_bcount;     /* XXX is this right!??!?!! */
        if (bp->b_bcount == 0) {
                panic("bp->b_bcount is zero in InitBP!!");
*************** rf_reasonable_label(RF_ComponentLabel_t 
*** 3114,3119 ****
--- 3123,3129 ----
  {
  
        if (((clabel->version==RF_COMPONENT_LABEL_VERSION_1) ||
+            (clabel->version==RF_COMPONENT_LABEL_VERSION_2) ||
             (clabel->version==RF_COMPONENT_LABEL_VERSION)) &&
            ((clabel->clean == RF_RAID_CLEAN) ||
             (clabel->clean == RF_RAID_DIRTY)) &&
*************** rf_reasonable_label(RF_ComponentLabel_t 
*** 3136,3141 ****
--- 3146,3157 ----
  void
  rf_print_component_label(RF_ComponentLabel_t *clabel)
  {
+       uint64_t numBlocks = clabel->numBlocks;
+ 
+       if (clabel->version == RF_COMPONENT_LABEL_VERSION) {
+               numBlocks |= (uint64_t)clabel->numBlocksHi << 32;
+       }
+ 
        printf("   Row: %d Column: %d Num Rows: %d Num Columns: %d\n",
               clabel->row, clabel->column,
               clabel->num_rows, clabel->num_columns);
*************** rf_print_component_label(RF_ComponentLab
*** 3146,3154 ****
               clabel->clean ? "Yes" : "No", clabel->status);
        printf("   sectPerSU: %d SUsPerPU: %d SUsPerRU: %d\n",
               clabel->sectPerSU, clabel->SUsPerPU, clabel->SUsPerRU);
!       printf("   RAID Level: %c  blocksize: %d numBlocks: %d\n",
!              (char) clabel->parityConfig, clabel->blockSize,
!              clabel->numBlocks);
        printf("   Autoconfig: %s\n", clabel->autoconfigure ? "Yes" : "No");
        printf("   Contains root partition: %s\n",
               clabel->root_partition ? "Yes" : "No");
--- 3162,3169 ----
               clabel->clean ? "Yes" : "No", clabel->status);
        printf("   sectPerSU: %d SUsPerPU: %d SUsPerRU: %d\n",
               clabel->sectPerSU, clabel->SUsPerPU, clabel->SUsPerRU);
!       printf("   RAID Level: %c  blocksize: %d numBlocks: %"PRIu64"\n",
!              (char) clabel->parityConfig, clabel->blockSize, numBlocks);
        printf("   Autoconfig: %s\n", clabel->autoconfigure ? "Yes" : "No");
        printf("   Contains root partition: %s\n",
               clabel->root_partition ? "Yes" : "No");
*************** rf_does_it_fit(RF_ConfigSet_t *cset, RF_
*** 3269,3274 ****
--- 3284,3291 ----
            (clabel1->maxOutstanding == clabel2->maxOutstanding) &&
            (clabel1->blockSize == clabel2->blockSize) &&
            (clabel1->numBlocks == clabel2->numBlocks) &&
+           (clabel1->version != RF_COMPONENT_LABEL_VERSION ||
+            clabel1->numBlocksHi == clabel2->numBlocksHi) &&
            (clabel1->autoconfigure == clabel2->autoconfigure) &&
            (clabel1->root_partition == clabel2->root_partition) &&
            (clabel1->last_unit == clabel2->last_unit) &&
*************** raid_init_component_label(RF_Raid_t *rai
*** 3533,3538 ****
--- 3550,3556 ----
  
        clabel->blockSize = raidPtr->bytesPerSector;
        clabel->numBlocks = raidPtr->sectorsPerDisk;
+       clabel->numBlocksHi = raidPtr->sectorsPerDisk >> 32;
  
        /* XXX not portable */
        clabel->parityConfig = raidPtr->Layout.map->parityConfig;
*************** rf_buf_queue_check(int raidid)
*** 3691,3713 ****
  int
  rf_getdisksize(struct vnode *vp, struct lwp *l, RF_RaidDisk_t *diskPtr)
  {
!       struct partinfo dpart;
!       struct dkwedge_info dkw;
        int error;
  
!       error = VOP_IOCTL(vp, DIOCGPART, &dpart, FREAD, l->l_cred);
!       if (error == 0) {
!               diskPtr->blockSize = dpart.disklab->d_secsize;
!               diskPtr->numBlocks = dpart.part->p_size - rf_protectedSectors;
!               diskPtr->partitionSize = dpart.part->p_size;
!               return 0;
!       }
! 
!       error = VOP_IOCTL(vp, DIOCGWEDGEINFO, &dkw, FREAD, l->l_cred);
        if (error == 0) {
!               diskPtr->blockSize = 512;       /* XXX */
!               diskPtr->numBlocks = dkw.dkw_size - rf_protectedSectors;
!               diskPtr->partitionSize = dkw.dkw_size;
                return 0;
        }
        return error;
--- 3709,3723 ----
  int
  rf_getdisksize(struct vnode *vp, struct lwp *l, RF_RaidDisk_t *diskPtr)
  {
!       uint64_t numsecs;
!       unsigned secsize;
        int error;
  
!       error = getdisksize(vp, &numsecs, &secsize);
        if (error == 0) {
!               diskPtr->blockSize = secsize;
!               diskPtr->numBlocks = numsecs - rf_protectedSectors;
!               diskPtr->partitionSize = numsecs;
                return 0;
        }
        return error;
Index: rf_raid.h
===================================================================
RCS file: /cvsroot/src/sys/dev/raidframe/rf_raid.h,v
retrieving revision 1.38
diff -p -r1.38 rf_raid.h
*** rf_raid.h   17 Nov 2009 18:54:26 -0000      1.38
--- rf_raid.h   18 Oct 2010 21:51:42 -0000
***************
*** 59,65 ****
  #endif                                /* RF_INCLUDE_PARITYLOGGING > 0 */
  
  #define RF_COMPONENT_LABEL_VERSION_1 1
! #define RF_COMPONENT_LABEL_VERSION 2
  #define RF_RAID_DIRTY 0
  #define RF_RAID_CLEAN 1
  
--- 59,66 ----
  #endif                                /* RF_INCLUDE_PARITYLOGGING > 0 */
  
  #define RF_COMPONENT_LABEL_VERSION_1 1
! #define RF_COMPONENT_LABEL_VERSION_2 2
! #define RF_COMPONENT_LABEL_VERSION 3
  #define RF_RAID_DIRTY 0
  #define RF_RAID_CLEAN 1
  
Index: rf_reconstruct.c
===================================================================
RCS file: /cvsroot/src/sys/dev/raidframe/rf_reconstruct.c,v
retrieving revision 1.108
diff -p -r1.108 rf_reconstruct.c
*** rf_reconstruct.c    17 Nov 2009 18:54:26 -0000      1.108
--- rf_reconstruct.c    18 Oct 2010 21:51:43 -0000
*************** rf_ReconstructFailedDiskBasic(RF_Raid_t 
*** 297,302 ****
--- 297,305 ----
                c_label->clean = RF_RAID_DIRTY;
                c_label->status = rf_ds_optimal;
                c_label->partitionSize = raidPtr->Disks[scol].partitionSize;
+               if (c_label->version == RF_COMPONENT_LABEL_VERSION)
+                       c_label->partitionSizeHi =
+                          raidPtr->Disks[scol].partitionSize >> 32;
  
                /* We've just done a rebuild based on all the other
                   disks, so at this point the parity is known to be

