NetBSD-Bugs archive
Re: kern/40569: Faild RAIDframe parity rewrite prevents system shutdown
The following reply was made to PR kern/40569; it has been noted by GNATS.
From: Greg Oster <oster%cs.usask.ca@localhost>
To: Matthias Scheler <tron%zhadum.org.uk@localhost>
Cc: gnats-bugs%NetBSD.org@localhost
Subject: Re: kern/40569: Faild RAIDframe parity rewrite prevents system shutdown
Date: Wed, 11 Feb 2009 15:18:32 -0600
Matthias Scheler writes:
> On Wed, Feb 11, 2009 at 04:20:03PM +0000, Greg Oster wrote:
> > Are you planning to do more testing?
>
> Not really, but I could do.
If you could, that would be great... If you can't, no worries -- I
can nuke the LBA48 patch for the drives on my test box and attempt to
test this myself...
> > If so, I can get a patch for
> > the RAIDframe issue to you as well... (I think I have a patch that
> > will work, but I haven't validated it yet..)
>
> I would need that patch first because Manuel's patch is supposed to
> avoid the RAID rebuild issue in the first place.
See attached. (I think it's actually the last part of the patch that
will make the difference in your case, but the other two changes fix
issues too...)
Thanks!
Later...
Greg Oster
Index: rf_reconstruct.c
===================================================================
RCS file: /cvsroot/src/sys/dev/raidframe/rf_reconstruct.c,v
retrieving revision 1.106
diff -u -r1.106 rf_reconstruct.c
--- rf_reconstruct.c 20 Dec 2008 17:04:51 -0000 1.106
+++ rf_reconstruct.c 11 Feb 2009 19:33:43 -0000
@@ -676,8 +676,10 @@
 					   done dealing with the reads that are
 					   finished, we don't want to wait for any
 					   writes */
-					if (status == RF_RECON_WRITE_ERROR)
+					if (status == RF_RECON_WRITE_ERROR) {
 						write_error = 1;
+						num_writes++;
+					}
 				} else if (status == RF_RECON_READ_STOPPED) {
 					/* count this component as being "done" */
@@ -718,12 +720,13 @@
 			status = ProcessReconEvent(raidPtr, event);
 			if (status == RF_RECON_WRITE_ERROR) {
+				num_writes++;
 				recon_error = 1;
 				raidPtr->reconControl->error = 1;
 				/* an error was encountered at the very end...
 				   bail */
 			} else if (status == RF_RECON_WRITE_DONE) {
 				num_writes++;
-			}
+			} /* else it's something else, and we don't care */
 		}
 		if (recon_error ||
 		    (raidPtr->reconControl->lastPSID == lastPSID)) {
@@ -1054,6 +1057,12 @@
 	case RF_REVENT_WRITE_FAILED:
 		retcode = RF_RECON_WRITE_ERROR;
+		/* This is an error, but it was a pending write.
+		   Account for it. */
+		RF_LOCK_MUTEX(raidPtr->reconControl->rb_mutex);
+		raidPtr->reconControl->pending_writes--;
+		RF_UNLOCK_MUTEX(raidPtr->reconControl->rb_mutex);
+
 		rbuf = (RF_ReconBuffer_t *) event->arg;
 		/* cleanup the disk queue data */
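
The last hunk of the patch concerns pending-write accounting: a failed write was still a pending write, so it has to be subtracted from the counter. A minimal userland sketch of that idea follows. It is not RAIDframe code. The names `recon_ctl`, `issue_write`, `complete_write` and `simulate` are hypothetical, locking is left out because the sketch is single-threaded, and it assumes that successful completions already decrement the counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; not the RAIDframe definitions. */
enum write_event { WRITE_DONE, WRITE_FAILED };

struct recon_ctl {
	int pending_writes;	/* writes issued but not yet completed */
};

/* Issue one write: it stays pending until a completion event arrives. */
static void
issue_write(struct recon_ctl *rc)
{
	rc->pending_writes++;
}

/*
 * Handle a completion event.  With account_failures == false a failed
 * write is never subtracted from pending_writes (assumed to model the
 * behaviour before the patch); with true it is subtracted (after).
 */
static void
complete_write(struct recon_ctl *rc, enum write_event ev, bool account_failures)
{
	if (ev == WRITE_DONE || account_failures) {
		assert(rc->pending_writes > 0);
		rc->pending_writes--;
	}
}

/* Issue two writes, fail one of them, and report what is still pending. */
static int
simulate(bool account_failures)
{
	struct recon_ctl rc = { 0 };

	issue_write(&rc);
	issue_write(&rc);
	complete_write(&rc, WRITE_DONE, account_failures);
	complete_write(&rc, WRITE_FAILED, account_failures);
	return rc.pending_writes;
}
```

In this sketch `simulate(false)` leaves one write counted as pending forever, so any code that waits for the counter to drain would never finish, while `simulate(true)` returns zero.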