NetBSD-Bugs archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: kern/49457: 'atactl sleep' loses drive



The following reply was made to PR kern/49457; it has been noted by GNATS.

From: Manuel Bouyer <bouyer%antioche.eu.org@localhost>
To: gnats-bugs%NetBSD.org@localhost
Cc: kern-bug-people%NetBSD.org@localhost, gnats-admin%NetBSD.org@localhost, netbsd-bugs%NetBSD.org@localhost,
        Hauke Fath <hauke%Espresso.Rhein-Neckar.DE@localhost>
Subject: Re: kern/49457: 'atactl sleep' loses drive
Date: Tue, 2 Apr 2019 17:33:42 +0200

 --Qxx1br4bt0+wmkIi
 Content-Type: text/plain; charset=us-ascii
 Content-Disposition: inline
 
 On Tue, Dec 09, 2014 at 04:40:00PM +0000, Hauke Fath wrote:
 > The following reply was made to PR kern/49457; it has been noted by GNATS.
 > 
 > From: Hauke Fath <hf%spg.tu-darmstadt.de@localhost>
 > To: gnats-bugs%NetBSD.org@localhost
 > Cc: kern-bug-people%NetBSD.org@localhost, gnats-admin%NetBSD.org@localhost,
 >         Hauke Fath <hauke%Espresso.Rhein-Neckar.DE@localhost>
 > Subject: Re: kern/49457: 'atactl sleep' loses drive
 > Date: Tue, 9 Dec 2014 16:30:20 +0100
 > 
 >  On Tue,  9 Dec 2014 10:10:06 +0000 (UTC), Manuel Bouyer wrote:
 >  >  >  wd1: wd_flushcache: status=0x5128<TIMEOU>
 >  >  >  wd1: IDENTIFY failed
 >  >  >  wd1: IDENTIFY failed
 >  >  >  
 >  >  >  I remember the "device timeout" line you quoted from netbsd-6, but have 
 >  >  >  not seen them on netbsd-7.
 >  >  
 >  >  Still not sure why the wd_flushcache() is there.
 >  >  What command above triggers which message?
 >  
 >  The 'atactl sleep' (which takes quite a while to return) triggers the 
 >  TIMEOU; the subsequent 'atactl checkpower' and 'disklabel' commands 
 >  trigger the 'IDENTIFY failed' messages.
 
 The fact that the flushcache command times out is suspect by itself. This
 command is issued before the device goes to sleep.
 
 It seems that to reproduce it, the disk needs to be unmounted when put
 to sleep. In this case, the WDF_LOADED is cleared on device close,
 and any attempt to open it afterward issues a wd_get_params() command.
 But this one doens't reset and retry in case of timeout, and the open fails.
 So there's no way to wake up the disk.
 
 While there I also implemented the DIRTY flag (copied from sd(4))
 so avoid flushing the cache if there has been no write.
 As a side effect, the atactl command on an unmounted disk doesn't trigger
 a cache flush, and so avoids the timeout.
 
 The attached patch fixes it for me on netbsd-7; it should also work for
 netbsd-8. Can you test ?
 
 -- 
 Manuel Bouyer <bouyer%antioche.eu.org@localhost>
      NetBSD: 26 ans d'experience feront toujours la difference
 --
 
 --Qxx1br4bt0+wmkIi
 Content-Type: text/plain; charset=us-ascii
 Content-Disposition: attachment; filename=diff
 
 Index: sys/dev/ata/wd.c
 ===================================================================
 RCS file: /cvsroot/src/sys/dev/ata/wd.c,v
 retrieving revision 1.412.2.2
 diff -u -r1.412.2.2 wd.c
 --- sys/dev/ata/wd.c	5 Jul 2016 19:09:17 -0000	1.412.2.2
 +++ sys/dev/ata/wd.c	2 Apr 2019 15:16:11 -0000
 @@ -752,8 +752,12 @@
  		wd->sc_wdc_bio.flags |= ATA_LBA48;
  	if (wd->sc_flags & WDF_LBA)
  		wd->sc_wdc_bio.flags |= ATA_LBA;
 -	if (bp->b_flags & B_READ)
 +	if (bp->b_flags & B_READ) {
  		wd->sc_wdc_bio.flags |= ATA_READ;
 +	} else {
 +		/* it's a write */
 +		wd->sc_flags |= WDF_DIRTY;
 +	}
  	/* Instrumentation. */
  	disk_busy(&wd->sc_dk);
  	switch (wd->atabus->ata_bio(wd->drvp, &wd->sc_wdc_bio)) {
 @@ -961,7 +965,6 @@
  		}
  	} else {
  		if ((wd->sc_flags & WDF_LOADED) == 0) {
 -
  			/* Load the physical device parameters. */
  			if (wd_get_params(wd, AT_WAIT, &wd->sc_params) != 0) {
  				aprint_error_dev(wd->sc_dev,
 @@ -1014,7 +1017,8 @@
  {
  	struct wd_softc *wd = device_private(self);
  
 -	wd_flushcache(wd, AT_WAIT);
 +	if (wd->sc_flags & WDF_DIRTY)
 +		wd_flushcache(wd, AT_WAIT);
  
  	if (! (wd->sc_flags & WDF_KLABEL))
  		wd->sc_flags &= ~WDF_LOADED;
 @@ -1811,11 +1815,20 @@
  int
  wd_get_params(struct wd_softc *wd, u_int8_t flags, struct ataparams *params)
  {
 +	int retry = 0;
  
 +again:
  	switch (wd->atabus->ata_get_params(wd->drvp, flags, params)) {
  	case CMD_AGAIN:
  		return 1;
  	case CMD_ERR:
 +		if (retry == 0) {
 +			retry++;
 +			(*wd->atabus->ata_reset_drive)(wd->drvp,
 +			    flags | AT_RST_NOCMD, NULL);
 +			goto again;
 +		}
 +
  		if (wd->drvp->drive_type != ATA_DRIVET_OLD)
  			return 1;
  		/*
 @@ -1975,6 +1988,7 @@
  		    sbuf);
  		return EIO;
  	}
 +	wd->sc_flags &= ~WDF_DIRTY;
  	return 0;
  }
  
 Index: sys/dev/ata/wdvar.h
 ===================================================================
 RCS file: /cvsroot/src/sys/dev/ata/wdvar.h,v
 retrieving revision 1.40
 diff -u -r1.40 wdvar.h
 --- sys/dev/ata/wdvar.h	2 Feb 2012 19:43:02 -0000	1.40
 +++ sys/dev/ata/wdvar.h	2 Apr 2019 15:16:11 -0000
 @@ -59,6 +59,7 @@
  #define WDF_LBA		0x040 /* using LBA mode */
  #define WDF_KLABEL	0x080 /* retain label after 'full' close */
  #define WDF_LBA48	0x100 /* using 48-bit LBA mode */
 +#define	WDF_DIRTY	0x200 /* disk cache dirty */
  	u_int64_t sc_capacity; /* full capacity of the device */
  	u_int32_t sc_capacity28; /* capacity accessible with LBA28 commands */
  
 
 --Qxx1br4bt0+wmkIi--
 


Home | Main Index | Thread Index | Old Index