Re: port-i386/40449: disk errors after ACPI suspend/resume
The following reply was made to PR port-i386/40449; it has been noted by GNATS.
From: David Young <dyoung%pobox.com@localhost>
Cc: port-i386-maintainer%netbsd.org@localhost, gnats-admin%netbsd.org@localhost,
Subject: Re: port-i386/40449: disk errors after ACPI suspend/resume
Date: Wed, 21 Jan 2009 12:50:43 -0600
On Wed, Jan 21, 2009 at 06:00:00PM +0000, apb%cequrux.com@localhost wrote:
> >Number: 40449
> >Category: port-i386
> >Synopsis: disk errors after ACPI suspend/resume
> >Confidential: no
> >Severity: serious
> >Priority: high
> >Responsible: port-i386-maintainer
> >State: open
> >Class: sw-bug
> >Submitter-Id: net
> >Arrival-Date: Wed Jan 21 18:00:00 +0000 2009
> >Originator: Alan Barrett
> >Release: NetBSD 5.99.1
> Not much
> System: NetBSD 5.99.10 i386
> Architecture: i386
> Machine: i386
> If I suspend the system via sysctl -w machdep.sleep_state=3 and
> then resume, a constant stream of disk error messages appears.
> The errors look like this:
> wd0e: error reading fsbn blah blah retrying
> wd0: (aborted command)
> cgd1: error 5
> There are several pairs of wd0e and wd0 messages for each cgd1 message.
> The block numbers in the wd0e messages repeat a few times and then
> change. The errors scroll past rapidly and continuously. The only
> obvious way to recover is to power cycle the machine.
> wd0 is an ordinary laptop SATA disk attached to an Intel
> 82801GBM/GHM controller (configured in the BIOS for compatibility
> mode). Here are some config messages:
> piixide0 at pci0 dev 31 function 2
> piixide0: Intel 82801GBM/GHM Serial ATA Controller (ICH7) (rev. 0x01)
> piixide0: bus-master DMA support present
> piixide0: primary channel wired to compatibility mode
> ioapic0: int14 0x69<vector=0x69,delmode=0x0,dest=0x0> 0x0<target=0x0>
> piixide0: primary channel interrupting at ioapic0 pin 14
> atabus0 at piixide0 channel 0
> wd0 at atabus0 drive 0: <Hitachi HTS542520K9SA00>
> wd0: drive supports 16-sector PIO transfers, LBA48 addressing
> wd0: 186 GB, 387621 cyl, 16 head, 63 sec, 512 bytes/sect x 390721968
> rnd: wd0 attached as an entropy source (collecting and estimating)
> wd0: 32-bit data port
> wd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 6 (Ultra/133)
> wd0(piixide0:0:0): using PIO mode 4, Ultra-DMA mode 6 (Ultra/133) (using
> The disk has an MBR and a NetBSD disklabel. wd0e is one of the disklabel
> cgd1 used wd0e as its backing store.
> suspend, then resume.
You may be able to narrow this down by using drvctl -S/-Q to
suspend/resume wd0 and its parents, beginning with wd0:
    drvctl -S wd0; drvctl -Q wd0
    drvctl -S atabus0; drvctl -Q atabus0
    drvctl -S piixide0; drvctl -Q piixide0
    drvctl -S pci0; drvctl -Q pci0
Let us see if one of those steps will reliably reproduce the problem.
If so, then it may help both to have a look at the affected devices'
PCI configuration before and after suspension/resumption, using
pcictl(8), and to look at the devices' suspend/resume routines.
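The procedure above could be scripted roughly as follows. This is only a
sketch under stated assumptions: the helper names (run,
suspend_resume_chain, dump_piixide_config) and the DRY_RUN variable are
made up for illustration, and the pcictl arguments assume the dump syntax
documented in pcictl(8), with the bus, device and function numbers taken
from the "piixide0 at pci0 dev 31 function 2" line in the report. By
default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch only: suspend and resume wd0 and its parents one device at a time.
# With DRY_RUN=1 (the default here) commands are printed, not executed;
# set DRY_RUN=0 and run as root on the affected machine to execute them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, else execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Suspend, then resume, each named device in the order given.
suspend_resume_chain() {
    for dev in "$@"; do
        run drvctl -S "$dev"
        run drvctl -Q "$dev"
    done
}

# Dump the controller's PCI configuration space.
# Assumption: bus pci0, device 31, function 2, as in the reported dmesg.
dump_piixide_config() {
    run pcictl pci0 dump -d 31 -f 2
}

# Print the full plan: dump, walk the chain child-first, dump again.
dump_piixide_config
suspend_resume_chain wd0 atabus0 piixide0 pci0
dump_piixide_config
```

To narrow the problem down, it is probably better to call
suspend_resume_chain with one device at a time and check the console for
new wd0e/cgd1 errors before moving up to the parent. With DRY_RUN=0, the
two dumps can be redirected to files and compared with diff(1).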
It may be desirable to suspend cgd1 before suspending its backing
store. I don't know whether cgd(4) handles suspend and resume, though.
Not all disk drivers refrain from issuing reads/writes to the hardware
while suspended.
David Young OJC Technologies
dyoung%ojctech.com@localhost Urbana, IL * (217) 278-3933