Current-Users archive


Re: The read and write timeouts story, next stage...



I did some more testing. I installed FreeBSD 8.0-RELEASE on the original
two Seagate drives, using gmirror to get a RAID-1 setup similar to the
one I had with NetBSD and RAIDframe.

I mirrored the other drive and ran "make -j4 universe" on my Core 2
server (Intel ICH8) without problems. With NetBSD I had problems on
that machine during RAIDframe parity checks. Then I moved the drives to
the original Mini-ITX server (Intel ICH7) and ran "make -j8 universe",
which lasted 21+ hours. In parallel I installed bonnie++ and ran it
twice, with 10 loops per run.

I changed the gmirror balance algorithm from "round-robin" to "split"
in the middle of the 21-hour make session. I saw a lot of 90+ % loads
on both drives with 'gstat'. Top showed load averages of 7.x all the
time.

I did not get any errors from any hardware component. This is the
exact same hardware setup on which I can reproduce the problems with
NetBSD when I run "build.sh release".

I have not yet studied the FreeBSD source code, but it seems that
the software makes a difference here. I don't know whether FreeBSD has
some kludges to work around hardware bugs. However, it seems to work
so far from the system admin point of view.

BR,
Teemu

On 20.12.2009 14:19, Teemu Rinta-aho wrote:
Hi all,

I have been reporting here my battle with the read and write timeouts
from SATA drives. Now I managed to try the drives on a 3rd PC, where
absolutely everything is different from the previous 2 PCs on which
I have had the problem. And the problem is still there! (The third PC
is about one month old, with an Asus P7P55D motherboard, Core i7 860
CPU, 750W PSU etc.)

It seems that I can reproduce write timeouts with the Intel Mini-ITX
by building the NetBSD source. On Core 2 I get read timeouts when
I let the raidframe do a parity check on the RAID-1 set. On Core i7
I got a write timeout when I did a rebuild of the other drive in the
RAID-1 set.

Now I am tired of finding new pieces of hardware to try to make the
problem go away, and I was wondering if anyone has a setup similar to
mine and could check whether he or she can reproduce the problem.

My setup is: two 1TB SATA2 drives configured as RAID-1 using
raidframe. It doesn't seem to make a difference whether the driver
is piixide or ahcisata. I have also tried changing BIOS settings etc.
I have all the NetBSD partitions on that RAID device, including
root. Then, by either building a NetBSD release or causing
raidframe to do a parity check (by e.g. pressing the reset button),
I can reproduce the problem.

The only common denominators in my tests are: the power coming from
the wall, the UPS, and the NetBSD code.

I will try -current with the Core i7 machine next.

BR,
Teemu


