NetBSD-Bugs archive


Re: port-i386/40449: disk errors after ACPI suspend/resume



The following reply was made to PR port-i386/40449; it has been noted by GNATS.

From: David Young <dyoung%pobox.com@localhost>
To: apb%cequrux.com@localhost
Cc: port-i386-maintainer%netbsd.org@localhost, gnats-admin%netbsd.org@localhost,
        netbsd-bugs%netbsd.org@localhost, gnats-bugs%netbsd.org@localhost
Subject: Re: port-i386/40449: disk errors after ACPI suspend/resume
Date: Wed, 21 Jan 2009 12:50:43 -0600

 On Wed, Jan 21, 2009 at 06:00:00PM +0000, apb%cequrux.com@localhost wrote:
 > >Number:         40449
 > >Category:       port-i386
 > >Synopsis:       disk errors after ACPI suspend/resume
 > >Confidential:   no
 > >Severity:       serious
 > >Priority:       high
 > >Responsible:    port-i386-maintainer
 > >State:          open
 > >Class:          sw-bug
 > >Submitter-Id:   net
 > >Arrival-Date:   Wed Jan 21 18:00:00 +0000 2009
 > >Originator:     Alan Barrett
 > >Release:        NetBSD 5.99.1
 > >Organization:
 > Not much
 > >Environment:
 > System: NetBSD 5.99.10 i386
 > Architecture: i386
 > Machine: i386
 > >Description:
 > 
 > If I suspend the system via sysctl -w machdep.sleep_state=3 and
 > then resume, a constant stream of disk error messages appears.
 > The errors look like this:
 > 
 > wd0e: error reading fsbn blah blah retrying
 > wd0: (aborted command)
 > cgd1: error 5
 > 
 > There are several pairs of wd0e and wd0 messages for each cgd1 message.
 > The block numbers in the wd0e messages repeat a few times and then
 > change.  The errors scroll past rapidly and continuously.  The only
 > obvious way to recover is to power cycle the machine.
 > 
 > wd0 is an ordinary laptop SATA disk attached to an Intel
 > 82801GBM/GHM controller (configured in the BIOS for compatibility
 > mode).  Here are some config messages:
 > 
 >     piixide0 at pci0 dev 31 function 2
 >     piixide0: Intel 82801GBM/GHM Serial ATA Controller (ICH7) (rev. 0x01)
 >     piixide0: bus-master DMA support present
 >     piixide0: primary channel wired to compatibility mode
 >     ioapic0: int14 0x69<vector=0x69,delmode=0x0,dest=0x0> 0x0<target=0x0>
 >     piixide0: primary channel interrupting at ioapic0 pin 14
 >     atabus0 at piixide0 channel 0
 > 
 >     wd0 at atabus0 drive 0: <Hitachi HTS542520K9SA00>
 >     wd0: drive supports 16-sector PIO transfers, LBA48 addressing
 >     wd0: 186 GB, 387621 cyl, 16 head, 63 sec, 512 bytes/sect x 390721968 sectors
 >     rnd: wd0 attached as an entropy source (collecting and estimating)
 >     wd0: 32-bit data port
 >     wd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 6 (Ultra/133)
 >     wd0(piixide0:0:0): using PIO mode 4, Ultra-DMA mode 6 (Ultra/133) (using DMA)
 > 
 > The disk has an MBR and a NetBSD disklabel.  wd0e is one of the disklabel
 > partitions.
 > 
 > cgd1 used wd0e as its backing store.
 > 
 > >How-To-Repeat:
 > suspend, then resume.
 
 You may be able to narrow this down by using drvctl -S/-Q to
 suspend/resume wd0 and its parents, beginning with wd0:
 
         drvctl -S wd0; drvctl -Q wd0
         drvctl -S atabus0; drvctl -Q atabus0
         drvctl -S piixide0; drvctl -Q piixide0
         drvctl -S pci0; drvctl -Q pci0
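 
 If typing those by hand gets tedious, the sequence could be scripted
 roughly as follows.  This is only a sketch: the function names, the
 DRVCTL and SETTLE variables, and the pause between suspend and resume
 are my own assumptions, not anything drvctl(8) requires.  It stops at
 the first device whose suspend or resume fails, so you can see how far
 up the tree it got.

```shell
#!/bin/sh
# Sketch: suspend and resume each device in turn, child first.
# DRVCTL and SETTLE are hypothetical knobs for this sketch only.
DRVCTL=${DRVCTL:-drvctl}
SETTLE=${SETTLE:-2}    # seconds to wait between suspend and resume

# Suspend, wait, then resume a single device.
cycle_device() {
    dev=$1
    echo "== $dev: suspend"
    "$DRVCTL" -S "$dev" || { echo "suspend of $dev failed"; return 1; }
    sleep "$SETTLE"
    echo "== $dev: resume"
    "$DRVCTL" -Q "$dev" || { echo "resume of $dev failed"; return 1; }
}

# Walk the list of devices in the order given, stopping on failure.
cycle_chain() {
    for dev in "$@"; do
        cycle_device "$dev" || return 1
        echo "== $dev: done, check dmesg for wd0 errors"
    done
}

# Example (as root):
#   cycle_chain wd0 atabus0 piixide0 pci0
```

 Run it one device at a time at first, and check the console for wd0
 errors after each step, so that the first failing level is obvious.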
 
 Let us see if one of those steps will reliably reproduce the problem.
 If so, then it may help both to have a look at the affected devices'
 PCI configuration before and after suspension/resumption, using
 pcictl(8), and to look at the devices' suspend/resume routines.
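 
 For the PCI configuration comparison, something along these lines
 might do.  Again, this is a sketch under assumptions: the bus, device
 and function numbers are taken from the "piixide0 at pci0 dev 31
 function 2" line in the report, and the function names, the PCICTL
 variable and the file names are made up for illustration.

```shell
#!/bin/sh
# Sketch: save the controller's PCI configuration space before and
# after a suspend/resume cycle, then compare the two dumps.
# PCICTL is a hypothetical override for this sketch only.
PCICTL=${PCICTL:-pcictl}

# Dump the configuration space of pci0 device 31 function 2 to file $1.
snapshot() {
    "$PCICTL" pci0 dump -d 31 -f 2 > "$1"
}

# Show a unified diff of two dumps and say whether they match.
compare() {
    if diff -u "$1" "$2"; then
        echo "config space unchanged"
    else
        echo "config space differs"
    fi
}

# Example (as root):
#   snapshot /tmp/piixide0.before
#   drvctl -S piixide0; drvctl -Q piixide0
#   snapshot /tmp/piixide0.after
#   compare /tmp/piixide0.before /tmp/piixide0.after
```

 Any register that differs after resume is a candidate for something
 the suspend/resume routines fail to save or restore.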
 
 It may be desirable to suspend cgd1 before suspending its backing
 store, wd0e.  I don't know whether cgd(4) supports suspend and resume,
 though.  Not all disk drivers refrain from issuing reads and writes
 to the hardware while suspended.
 
 Dave
 
 -- 
 David Young             OJC Technologies
 dyoung%ojctech.com@localhost      Urbana, IL * (217) 278-3933
 

