NetBSD-Bugs archive


Re: kern/40569: Faild RAIDframe parity rewrite prevents system shutdown



The following reply was made to PR kern/40569; it has been noted by GNATS.

From: Greg Oster <oster%cs.usask.ca@localhost>
To: Matthias Scheler <tron%zhadum.org.uk@localhost>
Cc: gnats-bugs%NetBSD.org@localhost
Subject: Re: kern/40569: Faild RAIDframe parity rewrite prevents system 
shutdown 
Date: Wed, 11 Feb 2009 15:18:32 -0600

 Matthias Scheler writes:
 > On Wed, Feb 11, 2009 at 04:20:03PM +0000, Greg Oster wrote:
 > >  Are you planning to do more testing?
 > 
 > Not really, but I could do.
 
 If you could, that would be great... If you can't, no worries -- I 
 can nuke the LBA48 patch for the drives on my test box and attempt to 
 test this myself...
 
 > >  If so, I can get a patch for 
 > >  the RAIDframe issue to you as well...  (I think I have a patch that 
 > >  will work, but I haven't validated it yet..)
 > 
 > I would need that patch first because Manuel's patch is supposed to
 > avoid the RAID rebuild issue in the first place.
 
 See attached.  (I think it's actually the last part of the patch that 
 will make the difference in your case, but the other two changes fix 
 issues too...)
 
 Thanks!
 
 Later...
 
 Greg Oster
 
 
 Attachment: rf_reconstruct.c.diff
 
 Index: rf_reconstruct.c
 ===================================================================
 RCS file: /cvsroot/src/sys/dev/raidframe/rf_reconstruct.c,v
 retrieving revision 1.106
 diff -u -r1.106 rf_reconstruct.c
 --- rf_reconstruct.c   20 Dec 2008 17:04:51 -0000      1.106
 +++ rf_reconstruct.c   11 Feb 2009 19:33:43 -0000
 @@ -676,8 +676,10 @@
                                   done dealing with the reads that are
                                   finished, we don't want to wait for any
                                   writes */
 -                              if (status == RF_RECON_WRITE_ERROR)
 +                              if (status == RF_RECON_WRITE_ERROR) {
                                        write_error = 1;
 +                                      num_writes++;
 +                              }
                                
                        } else if (status == RF_RECON_READ_STOPPED) {
                                /* count this component as being "done" */
 @@ -718,12 +720,13 @@
                        status = ProcessReconEvent(raidPtr, event);
                        
                        if (status == RF_RECON_WRITE_ERROR) {
 +                              num_writes++;
                                recon_error = 1;
                                raidPtr->reconControl->error = 1;
                                 /* an error was encountered at the very end... bail */
                        } else if (status == RF_RECON_WRITE_DONE) {
                                num_writes++;
 -                      }
 +                      } /* else it's something else, and we don't care */
                }
                if (recon_error || 
                    (raidPtr->reconControl->lastPSID == lastPSID)) {
 @@ -1054,6 +1057,12 @@
        case RF_REVENT_WRITE_FAILED:
                retcode = RF_RECON_WRITE_ERROR;
  
 +              /* This is an error, but it was a pending write.
 +                 Account for it. */
 +              RF_LOCK_MUTEX(raidPtr->reconControl->rb_mutex);
 +              raidPtr->reconControl->pending_writes--;
 +              RF_UNLOCK_MUTEX(raidPtr->reconControl->rb_mutex);
 +
                rbuf = (RF_ReconBuffer_t *) event->arg;
  
                /* cleanup the disk queue data */
 
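The last hunk of the patch can be illustrated with a small stand-alone model. This is only a sketch, not RAIDframe code: the names `recon_ctl`, `recon_event`, `issue_write`, `process_event` and `drain_pending` are hypothetical, there is no locking, and it assumes that a successful write already decrements the pending-write counter. It shows why a failed write that is never subtracted from the counter leaves anything waiting for "no pending writes" stuck.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event types, loosely modeled on the RF_REVENT_* cases. */
enum recon_event { EV_WRITE_DONE, EV_WRITE_FAILED };

/* Hypothetical, much reduced stand-in for the reconstruction control block. */
struct recon_ctl {
    int pending_writes; /* writes issued but not yet accounted for */
    int error;          /* set once any write has failed */
};

static void issue_write(struct recon_ctl *ctl)
{
    ctl->pending_writes++;
}

/*
 * Handle one completion event.  With account_failures == 0 a failed write
 * is flagged as an error but stays counted as pending (the behaviour the
 * patch is meant to fix); with account_failures != 0 it is subtracted.
 */
static void process_event(struct recon_ctl *ctl, enum recon_event ev,
                          int account_failures)
{
    switch (ev) {
    case EV_WRITE_DONE:
        assert(ctl->pending_writes > 0);
        ctl->pending_writes--;
        break;
    case EV_WRITE_FAILED:
        ctl->error = 1;
        if (account_failures) {
            /* It failed, but it was still a pending write. */
            assert(ctl->pending_writes > 0);
            ctl->pending_writes--;
        }
        break;
    }
}

/* Issue three writes, complete them (one fails), return what is left. */
int drain_pending(int account_failures)
{
    static const enum recon_event events[] = {
        EV_WRITE_DONE, EV_WRITE_FAILED, EV_WRITE_DONE
    };
    const size_t n = sizeof(events) / sizeof(events[0]);
    struct recon_ctl ctl = { 0, 0 };
    size_t i;

    for (i = 0; i < n; i++)
        issue_write(&ctl);
    for (i = 0; i < n; i++)
        process_event(&ctl, events[i], account_failures);

    return ctl.pending_writes;
}
```

In this model `drain_pending(0)` returns 1, so a caller waiting for the counter to reach zero would never finish, while `drain_pending(1)` returns 0.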
 

