Subject: Very slow disk, part II.
To: None <netbsd-help@netbsd.org>
From: Richard Rauch <rauch@rice.edu>
List: netbsd-help
Date: 11/24/2002 14:26:44
About a year ago (sometime in December of 2001, I think, and continuing
into January at least), I started a thread re. my poorly-performing Maxtor
(IDE) 20GB drive.  I thought that I had said more about the ultimate (or
so I thought) resolution to that problem.  The problem has resurfaced,
with the same sorts of symptoms, so I went looking to see if there were
any open comments to close (in view of new insight, etc.).  I was thinking
that there were some more messages in the threads (e.g., bonnie++ results
from myself and others).  Oh well.


This message is in parts, separated by whitespace.  The part at the end
has the real questions, but since there's some context from ~1 year ago,
I'm including a summary of the gestalt experience.  (^&  Feel free to skip
to the end if you don't think you need context.  (Without the summary, you
might have to go back to December of 2001 and look for my thread on "Very
slow disk" to know where the discussion already ran...)


First, how the situation appeared to resolve last year:

I tried getting a new disk controller on a PCI card (the motherboard
controller was provided by the infamous VIA chipset and was held suspect
by some).  This didn't help, but the card supported a better DMA mode than
the motherboard, so I kept the card anyway.

What I ultimately did to "fix" the problem last year was to get in touch
with Maxtor's warranty people (I found that the drive was still under
warranty) and after some haggling I got them to send out a replacement
drive.  (Incidental: They were not the most fun warranty people to deal
with.  They kept pushing for me to run their BillOS-only diagnostic tool.
I kept explaining that in order to run it, I'd have to pirate a copy of
BillOS, which I was sure they didn't want to endorse/promote.  They'd turn
around and ask me the results of the diagnostic tool.  I have to wonder if
they actually read what I wrote...  Or maybe it was an automated response,
at least in part?)

In any case, the replacement drive went in and worked like a charm.  I was
happy.  Performance of the disk drive was astounding compared to the old
drive that was being replaced.


The problem reappears:

Then I upgraded to NetBSD/i386 1.6.  Under 1.6, as many know, an attempt
is made to make use of more system memory for disk caching.  (I wish that
could be tuned down, BTW.  Oh well.)  The result is that when overnight
scripts run to check for .core files, etc., you wind up with loaded code
pages generally flushed from memory.  Continuing (or restarting) work from
day to day requires reloading applications from scratch.  So, that's when
I noticed (*really* noticed) that my disk was behaving poorly again.  It
also shows up when booting a system (reading the kernel has quite visible
stops and starts---and, once again, takes longer on my 800MHz Athlon tower
than with my 233MHz Pentium (plain Pentium) laptop).

Hm.


Well, we know the solution to *that* problem: Get a new disk.  (Grumble.)

So, I didn't want to deal with Maxtor's warranty people again.  I'd be
stuck with extra time, trouble, and a $15 shipping bill to return the
disk.  And if the disk starts to sour on me again in another 8 to 12
months (by which time I think that the warranty will be expired), then
what do I do?

So I just decided to *buy* a *different* brand of disk.  Specifically, a
Western Digital disk.  (I've got a Western Digital in a 5-year-old Gateway
2000 that has run 24/7 for about 5 years and has never had the problems
that back-to-back Maxtors have given me.)

Due to a recent run on "small" disks in the area, I found that a 40GB disk
was the smallest Western Digital that I could get.  The smallest, at least
at MicroCenter: I wasn't inclined to look around much more and probably
wouldn't have improved much on that without going mail-order anyway.  I
got their last sub-60GB Western Digital drive until more stock comes in.

Anyway, the disk works nicely.  But, then, so did both of the Maxtors,
when they were new.


Benchmarks:

(Sorry, not running anything to let me clip directly.)

As with a year ago, *some* bonnie++ benchmarks are showing quite good
results with the Maxtor.  But others are abysmal.  Seeks are at 5/sec
(yes, five seeks per second) with the Maxtor.  The new WD drive weighs in
at 140 seeks/sec.  On sequential read, the Maxtor is these days getting a
bit under 300K (yes, three hundred kilobytes) per second, while the new WD
is getting about 37MB/second (yes, over 100 times faster).  On the other
hand, block and character sequential writes are fairly comparable (the
WD drive is about 1.5 times as fast).  "Random create: Read" results are
very close to 1000 for both disks (1008 for the WD, 997 for the Maxtor).
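
For anyone who wants the ratios spelled out, here is a small sketch of the
arithmetic behind the figures above.  The numbers are the rounded ones from
this message (not fresh bonnie++ output), and treating 1MB as 1024KB is my
assumption; the variable names are just for illustration.

```python
# Rounded bonnie++ figures as quoted in the message (not re-measured).
maxtor = {"seeks_per_sec": 5, "seq_read_kb_per_sec": 300, "random_create_read": 997}
wd = {
    "seeks_per_sec": 140,
    "seq_read_kb_per_sec": 37 * 1024,  # assumption: 1MB = 1024KB
    "random_create_read": 1008,
}


def ratio(new, old):
    """How many times faster 'new' is than 'old' for a higher-is-better figure."""
    return new / old


seek_ratio = ratio(wd["seeks_per_sec"], maxtor["seeks_per_sec"])
read_ratio = ratio(wd["seq_read_kb_per_sec"], maxtor["seq_read_kb_per_sec"])
create_ratio = ratio(wd["random_create_read"], maxtor["random_create_read"])

print(f"seeks:               WD is {seek_ratio:.0f}x the Maxtor")
print(f"sequential read:     WD is {read_ratio:.0f}x the Maxtor")
print(f"random create/read:  WD is {create_ratio:.2f}x the Maxtor")
```

The point of laying it out this way is that the gap is confined to seeks
and sequential reads; the file-creation figure is essentially a tie.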

This seek-and-sequential-read problem is very similar to what I was able
to observe a year ago.  And, when the new Maxtor was dropped in, it
behaved a *whole* lot better.  (Both subjectively, and on bonnie++
benchmarks, the replacement Maxtor was highly acceptable to me when it was
new.)

I probably didn't notice degradation of performance until recently because
NetBSD 1.5 was a whole lot better (for the old Maxtor, anyway; (^&) about
caching.  On a 128MB RAM system, I hardly ever needed to go substantially
to disk in 1.5 (except for rebuilding the locate database, etc., and when
first starting up applications or rebooting).  Under 1.6, the nightly
scripts do a pretty nasty job of flushing all of the loaded code pages
(and application memory not backed by executable images gets flushed to
swap), so the behavior has deteriorated substantially, for me, in 1.6.


I have some questions, then:

 * Is it likely that my power supply is giving out low voltage and the
   disk drive (after aging a bit) gets finicky about the voltage?  The
   system was originally put together for me by a mail-order company,
   3 years ago.  It's supposed to be a "Deer Computer Co. Ltd." 300 Watt
   power supply.  I wouldn't think that I have that much in there to
   draw power very heavily (e.g., a DVD drive that is 99% of the time
   both empty and idle; an old 4MB PCI video card; etc...).

   Maybe one of these days I'll get stuff cleared off ol' reliable
   (Prometheus, a 5-year-old PII (^&) and drop the Maxtor in there,
   to see if it behaves any better/worse...

 * If it turns out that a weak power supply is to blame, is it possible
   that the old drive was damaged by the power supply, or would it be
   "as good as new" if it were given a new power supply?

 * Has anyone seen this kind of selective behavior before (with a
   Maxtor or anything else)?  Is it indicative of nearing failure,
   or is it just an annoyance that should level off (or perhaps has
   already done so)?  E.g., if I keep the drive in the computer and
   use it for backup, is there reason to believe that that would be
   a good (or bad) idea?

   (A year ago, someone suggested that it might be an indication of an
   approaching media failure---I don't know if he meant literally the
   substrate on the disk, or the disk as a whole.)


  ``I probably don't know what I'm talking about.'' --rauch@math.rice.edu