port-mac68k: Re: ppp and disk errors

Subject: Re: ppp and disk errors
To: Black Jaguar <blackjag@aksi.net>
From: David A. Gatwood <dgatwood@mvista.com>
List: port-mac68k
Date: 09/14/1999 13:26:16

On Tue, 14 Sep 1999, Black Jaguar wrote:

> Hi all.  I have recently installed 1.4 on a 540c with an LC040.  I'm
> running the generic SBC kernel.  Aside from the expected segmentation
> faults and other 040 garbage, I have been having two problems.  First, at
> various disk accesses, I get the following error.  It seems that, after the
> data has been accessed once, the fault is corrected.  The error never seems
> to occur in the same place twice.  I am interested in learning what all of
> these acronyms mean, and the nature of this error.
> 
> sd0(sbc0:0:0):  Check Condition on CDB: 0x08 03 20 a0 10 00
>     SENSE KEY:  Recovered Error
>    INFO FIELD:  204972
>  COMMAND INFO:  38667035 (0x24e031b)
>      ASC/ASCQ:  Recovered Data With No Error Correction Applied
>          SKSV:  Actual Retry Count: 1

Now that's a nice looking debug message.  Much better than "Target 0:
sensed a recovered error...".  :-)

Here's a quick explanation, as I understand it.  With hard drives in
general, and particularly with removable drives, things like temperature,
barometric pressure, humidity, etc. can affect the alignment of the heads
on the platters.  If it gets far enough off, reads will occasionally fail
and the hardware will attempt to recover from the error by re-reading the
block.  In this case, it's only having to try again once.  This would be
classified as a soft error.  On a removable disk, these can be very
frequent.  On a fixed disk, these usually indicate that a block on the
disk is going bad and should probably be mapped out of use.
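
As for what the numbers in that message mean, here is a rough sketch in
Python that picks apart the CDB (Command Descriptor Block) bytes from the
log above.  It assumes the standard 6-byte SCSI READ(6) layout: opcode
0x08, a 21-bit block address, a one-byte transfer length, and a control
byte.  The function name decode_read6 is made up for illustration and is
not anything from the NetBSD driver.

```python
def decode_read6(cdb):
    """Decode a 6-byte SCSI READ(6) command descriptor block."""
    if len(cdb) != 6 or cdb[0] != 0x08:
        raise ValueError("not a READ(6) CDB")
    # The block address is 21 bits: the low 5 bits of byte 1, then bytes 2 and 3.
    lba = ((cdb[1] & 0x1F) << 16) | (cdb[2] << 8) | cdb[3]
    # In READ(6), a transfer length of 0 means 256 blocks.
    blocks = cdb[4] or 256
    return {"opcode": cdb[0], "lba": lba, "blocks": blocks, "control": cdb[5]}


# CDB bytes and INFO FIELD value copied from the quoted error message.
cdb = bytes.fromhex("08 03 20 a0 10 00")
info_field = 204972

decoded = decode_read6(cdb)
# Offset of the reported block from the start of the read.
offset = info_field - decoded["lba"]
print(decoded, offset)
```

Decoded this way, the command is a 16-block read starting at block
204960, and the INFO FIELD value 204972 falls 12 blocks into that read,
which fits with the drive retrying one block in the middle of a larger
transfer.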

Most OSes, it seems, ignore recovered errors when the SCSI hardware sends
them, and at least with MacOS, most hardware is configured by the MacOS
drivers not to even _report_ recovered (soft) errors.

I can't speak for NetBSD, but I know that MkLinux (using lots of NetBSD's
drivers) has trouble with Zip drives doing this same sort of thing, but
only when the Iomega driver is used on the MacOS side.  If this is a
removable drive, you might try using Apple's driver on the disk and see if
that helps.

If it's a fixed disk, I'd recommend a thorough bad-block check, and
possibly a low-level format.  This looks like either a single block going
bad or, if it's happening on lots of blocks, possibly a track alignment
problem, but that's just a guess.

Take everything above with a grain of salt, though.  I try to avoid even
thinking about SCSI internals.



> My second problem deals with ppp.  I have tried various configurations, but
> to no avail.  The laptop has an internal modem, but I believe tty00 is the
> external modem, as it was successfully dialing out.  My log is as follows:

<snip>

> Sep  6 17:23:02  pppd[193]: Serial connection established.
> Sep  6 17:23:03  pppd[193]: Using interface ppp0
> Sep  6 17:23:03  pppd[193]: Connect: ppp0 <--> /dev/tty00
> Sep  6 17:23:03  pppd[193]: Modem hangup
> Sep  6 17:23:03  pppd[193]: Connection terminated.
> Sep  6 17:23:06  pppd[193]: Exit.

Try adding either local or clocal to the options.  I can't remember which.
Or maybe -local or -clocal....  Anyway, it has to do with the way the
handshaking lines are wired on Macs and the lack of the line that
indicates a modem hung up.  I forget the details, but I've seen this under
nearly every Mac operating system I've used other than MacOS....
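
For reference, pppd does have an option spelled "local", which tells it
to ignore the modem control lines (carrier detect in particular), and
CLOCAL is the name of the termios flag that does the same job at the tty
level.  Whether that is the right fix here is an assumption.  The sketch
below shows the CLOCAL side in Python; the helper names with_clocal and
set_clocal are hypothetical.

```python
import termios


def with_clocal(attrs):
    """Return a copy of a termios attribute list with CLOCAL set."""
    new = list(attrs)
    # tcgetattr() returns [iflag, oflag, cflag, lflag, ispeed, ospeed, cc];
    # CLOCAL lives in cflag, which is index 2.
    new[2] = new[2] | termios.CLOCAL
    return new


def set_clocal(fd):
    """Tell the tty driver to ignore carrier detect on an open serial fd."""
    attrs = termios.tcgetattr(fd)
    termios.tcsetattr(fd, termios.TCSANOW, with_clocal(attrs))
```

To try it, open the serial device (/dev/tty00 in the log above) and pass
the file descriptor to set_clocal before dialing.  With pppd itself, the
equivalent would be adding "local" to the options file or command line.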


Later,
David