Subject: kern/30194: unrecoverable wd(4) error after suspend/resume
To: None <kern-bug-people@netbsd.org, gnats-admin@netbsd.org,>
From: Lubomir Sedlacik <salo@Xtrmntr.org>
List: netbsd-bugs
Date: 05/11/2005 09:19:00
>Number:         30194
>Category:       kern
>Synopsis:       unrecoverable wd(4) error after suspend/resume, disk hangs
>Confidential:   no
>Severity:       critical
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed May 11 09:19:00 +0000 2005
>Originator:     Lubomir Sedlacik
>Release:        NetBSD 3.99.3, sources as of May 10 2005
>Environment:
System: NetBSD 3.99.3
Architecture: i386
Machine: i386
Model: IBM ThinkPad X40
>Description:
i am unable to resume from suspend with APM in the kernel.  when resuming, the
system comes up but the disk seems to hang immediately (sometimes the disk
activity LED is constantly on), and processes hang in vnlock (sometimes in
getblk).  the kernel prints this error:

 atabus0: resuming...
 atabus1: resuming...
 wd0a: error writing fsbn 15042624 of 15042624-15042655 (wd0 bn 15042687; cn 14923 tn 4 sn 51), retrying
 wd0: (aborted command)

and:

 wm0: device timeout (txfree 4058 txsfree 35 txnext 280

but the network comes up a few seconds later and, e.g., an ssh connection is
resumed.
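
The block numbers in the wd0a error line above can be cross-checked with a
little arithmetic.  The sketch below is only illustrative: the partition
offset of 63 sectors is inferred from the difference between "wd0 bn" and
"fsbn", and the geometry of 16 heads and 63 sectors per track is an
assumption (a common translated geometry) that happens to reproduce the
printed cn/tn/sn values.  The names bn_to_chs and fsbn_to_bn are
hypothetical helpers, not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* Cylinder/track/sector triple as printed in the "cn/tn/sn" fields. */
struct chs {
	uint64_t cn;	/* cylinder */
	uint64_t tn;	/* track (head) within the cylinder */
	uint64_t sn;	/* sector within the track, zero-based */
};

/* Absolute block number = partition-relative block + partition offset. */
static uint64_t
fsbn_to_bn(uint64_t fsbn, uint64_t part_offset)
{
	return fsbn + part_offset;
}

/*
 * Split an absolute block number into cn/tn/sn.
 * heads and spt are ASSUMED geometry values, not taken from the report.
 */
static struct chs
bn_to_chs(uint64_t bn, uint64_t heads, uint64_t spt)
{
	struct chs c;

	assert(heads > 0 && spt > 0);
	c.sn = bn % spt;	/* remainder is the sector in the track */
	bn /= spt;		/* bn now counts whole tracks */
	c.tn = bn % heads;
	c.cn = bn / heads;
	return c;
}
```

With these assumptions, fsbn_to_bn(15042624, 63) gives 15042687, and
bn_to_chs(15042687, 16, 63) gives cn 14923, tn 4, sn 51, matching the log.
The failing range 15042624-15042655 covers 32 sectors, i.e. a single 16 KB
write assuming 512-byte sectors.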

i tested various BIOS settings (e.g., disabling PCI Power Management) and
apm(4) options (e.g., APM_DISABLE_INTERRUPTS=0, APM_NO_IDLE) without any
effect.

my setup is a bit nonstandard, in case it matters: the whole hard drive is a
cgd(4) volume on top of wd0a.  the kernel boots from usb, and while i am in
single-user mode running from md(4), suspend/resume seems to work without any
problems.  up to that point the hard drive is not touched apart from the
kernel probing it at boot time; the problem appears only once the system is
booted into multi-user, which involves disk activity.  unfortunately, i can't
test with a plain file system due to the nature of my installation.

>How-To-Repeat:
try to suspend/resume -current on X40,
see disk hang immediately after resuming

>Fix:
n/a

- missing power hook somewhere in the piixide(4) or wd(4) code?
- could cgd(4) affect the situation?