Subject: Re: NetBSD/i386 processor recommendation
To: Jonathan Stone <jonathan@dsg.stanford.edu>
From: Gary D. Duzan <gary@wheel.tiac.net>
List: port-i386
Date: 08/07/1997 07:57:00
In Message <199708071046.DAA20572@Pescadero.DSG.Stanford.EDU> ,
   Jonathan Stone <jonathan@DSG.Stanford.EDU> wrote:

=>
=>
=> >> The P6 does unnecessary read-for-ownership cycles even when not in SMP mode.
=> >> It ruins the main memory write performance. However, the reads and in-cache
=> >> writes are so much faster that this only affects things like long bcopys.
=> >
=> >>Shouldn't bcopy and friends use a read after every cache-line full of data
=> >>written (considering alignment and all that fun)? IIRC that would cause a
=> >>burst-read of the cache-line, a fill in the cache in exclusive mode and then
=> >>a burst-write of the whole line back to main memory. Should be faster since
=> >>P5...
=>
=>Uh... I collaborated on Kevin Lai's USENIX paper which points out the
=>advantages of a read-cache-line-before-write strategy on Pentium chips
=>with a no-allocate-on-write-miss cache policy.
=>
=>I hacked up a bcopy() based on Kevin's and then ported the FreeBSD
=>bcopy(), which does marginally (or even, if you prefer) better by
=>using the FPU registers to issue 64-bit copies.
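
   For readers following along, the read-cache-line-before-write idea
quoted above can be sketched in plain C roughly as below. This is only
an illustration under stated assumptions, not Kevin's code, the NetBSD
bcopy(), or the FreeBSD one: the name copy_touch_dst and the 32-byte
CACHE_LINE value are made up for the example, it does not handle
overlapping regions the way a real bcopy() must, and a tuned version
would be written in assembler rather than left to the compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CACHE_LINE 32  /* assumed line size for Pentium-class chips */

/* Hypothetical helper: copy len bytes from src to dst, reading each
 * destination cache line once before storing into it. */
static void copy_touch_dst(void *dst, const void *src, size_t len)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Assumption of this sketch: the two regions do not overlap. */
    assert((uintptr_t)d + len <= (uintptr_t)s ||
           (uintptr_t)s + len <= (uintptr_t)d);

    /* Copy leading bytes until dst sits on a cache-line boundary. */
    size_t head = (CACHE_LINE - ((uintptr_t)d % CACHE_LINE)) % CACHE_LINE;
    if (head > len)
        head = len;
    memcpy(d, s, head);
    d += head; s += head; len -= head;

    while (len >= CACHE_LINE) {
        /* Read one byte of the destination line first. The volatile
         * access forces the load to be emitted, so the line is fetched
         * into the cache before the stores that follow. */
        (void)*(volatile unsigned char *)d;
        memcpy(d, s, CACHE_LINE);
        d += CACHE_LINE; s += CACHE_LINE; len -= CACHE_LINE;
    }

    /* Copy whatever is left over (fewer than CACHE_LINE bytes). */
    memcpy(d, s, len);
}
```

Whether the extra read pays off depends on the write-miss policy of the
particular chip, so any real use would need to be measured per CPU.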

   Note that this will actually slow down the AMD K6. The particular
FP instructions involved are not pipelined, so the copy loop is
stalling the pipeline the whole time. Quake uses that method for
doing memory copies, leading to the widely held belief that the K6
doesn't do FP/3D/games well.
   As for AMD K6 stability, I've been using the K6-200 with an
FIC PA-2011 motherboard (VIA Apollo VP2 chipset, AMI BIOS) and
things have gone pretty well. I've noticed a little strangeness
when doing builds where the compiler dies with some internal error
in the middle, but I'm uncertain as to the cause. A restart of the
build with UPDATE=1 generally works fine.  Otherwise it is fairly
speedy and stable. It's nice to be able to build a kernel from
scratch w/gdb copy in about 13 minutes -- probably faster if I ever
get around to installing a faster disk.

                                      Gary D. Duzan
                         Humble Practitioner of the Computing Arts