Subject: RE: "UNIX on Intel" panel discussion at BayLISA on 17 September
To: Cousins, Kevin <Kevin.Cousins@praxa.com.au>
From: Jan B. Koum <jkb@best.com>
List: netbsd-advocacy
Date: 09/23/1998 18:55:13

On Thu, 24 Sep 1998, Cousins, Kevin wrote:

>
>
>Are there any details on how this panel discussion went?  Jason?
>
>--Kevin.
>

>From rick@hugin.imat.com Wed Sep 23 18:54:46 1998
Date: Sun, 20 Sep 1998 23:44:05 -0700
From: Rick Moen <rick@hugin.imat.com>
To: baylisa@baylisa.org
Subject: BayLISA 17 Sept. 1998 meeting notes

[I took these notes for San Francisco FreeBSD User Group's
mailing list, and felt they might be appreciated here, too.]


BayLISA meeting notes
17 September 1998

The panel was in the form of 10-minute presentations by each
of the panelists, in turn, followed by a collective Q&A session.
Some of them had a tendency to use overhead foils with tiny,
illegible print, the information in which was totally wasted,
under the circumstances.  Suggestion to presenters:  If you use
visual aids, stick to larger point sizes, and give the audience
a URL where they can look up your presentation materials.

To recap, this was to be a panel on how to set up/configure 
high-reliability, high-performance Internet servers on sundry 
Intel Unix/Unix-like OSes.


1.  Paul Vixie (Internet Software Consortium founder & much more,
re: BSDI's BSD OS):  Likes all of {Free|Net|Open}BSD,
but prefers BSD OS because it _doesn't_ change often.  He always
uses the last patch level of the prior major release, to maximise
stability.  For example, BSDI has now come out with 4.0, so he
runs the most recent (final) patches of 3.1.  He has several 
machines running routed, gated, and screend in Palo Alto, and 
some others running the T1 and doing Kerberos authentication.

He _doesn't_, however, run BSD OS for the root nameserver.  This
requires 1GB of RAM, which he figures he could easily accomplish
on NetBSD, but instead runs it on an Alpha running Digital Unix,
because Digital was kind enough to donate these.

He made reference to his fabled page (http://www.vix.com/pc-hw/)
of hardware recommendations for BSD OS, which has been much used
by BSDI and others.


2.  Jason Thorpe (NetBSD kernel developer).  Says you really need
256MB of RAM.  Likes the Mylex/Buslogic MultiMaster BT-958 SCSI
host adapter.  Adaptec 2940U is OK.  Recommends avoiding Adaptec
AIC-7890 chipsets, since the support isn't quite there, yet.  (An
audience member piped in that the "unstable" tree's driver does
this OK.)  Uses Seagate Hawk drives because they don't catch on fire
(no doubt referencing Seagate's "hot offering", the 10,000 RPM 
Barracuda series).  Likes Digital DEFPA FDDI adapters, Bay Networks
Netgear 10/100 ethernet NICs ($30 at Fry's), which are based on
the DEC Tulip chipset.  [RM adds:  Beware!  Very recently, NetGear
has started shipping units with its own chipsets that almost but
not quite emulate DEC's, without changing the model number or S/N 
series.]  Intel EtherExpress 10/100 NICs are OK, too.

Always stripes drives, and uses multiple SCSI host adapters.  Likes
serial consoles -- remote recovery.

NetBSD has a "packages" system similar to Jordan K. Hubbard's "ports" 
system in FreeBSD.  (They couldn't call it "ports" because that term 
already has a defined meaning.)

He listed a large number of daemons & network services for NetBSD
on barely-legible overhead foils, there & gone too quickly to take
notes, and then concluded by discussing NetBSD tuning, which I did
not attempt to transcribe.


3.  Matt Dillon (one of the founders of Best Internet, sometime Linux
guy, member of the FreeBSD core team):  Best has 45 FreeBSD rack-mount
hosts in production service.  They tried, at first, to use a couple 
of SGI hosts, which didn't work out.  Those cost $5 million, and were
replaced by the 45 PCs costing $200,000.  These are twice as efficient
and twice as fast.  Their first Intel production systems, back in the
SGI days, were Pentium 90 motherboards, but they didn't find these
to be robust.  The eventual Pentium Pro-based systems were a major
improvement, with ECC support and other improvements.  Eventually
replaced the SGIs with ASUS Pentium Pro 200, ECC RAM systems that
could hold 256 MB RAM, maximum.  

This system has serial console mode with kernel debugger.  Found he
didn't need multiple SCSI cards per host:  ISPs are seek-limited,
and don't push the 135MB/sec bandwidth of PCI.  Tagged queueing
and disconnect help user-concurrency issues.  (There's no way this
can be done with IDE.)  
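
[Illustration, not from the talk:  a rough back-of-the-envelope
check of the "seek-limited" point.  The access time, request size,
and disk count below are assumed figures chosen only for the
arithmetic; the 135MB/sec PCI figure is the one quoted above.]

```python
# Rough estimate of random-I/O throughput for a seek-limited workload.
# Disk figures are assumptions for illustration, not measurements.

PCI_BANDWIDTH_MB_S = 135.0  # PCI figure quoted in the notes


def random_io_throughput_mb_s(avg_access_ms, request_kb):
    """MB/sec when every request pays one full average access time."""
    requests_per_sec = 1000.0 / avg_access_ms
    return requests_per_sec * request_kb / 1024.0


if __name__ == "__main__":
    # Assumed: 12 ms average access, 8 KB requests, 4 disks on one adapter.
    per_disk = random_io_throughput_mb_s(avg_access_ms=12.0, request_kb=8.0)
    total = 4 * per_disk
    share = 100.0 * total / PCI_BANDWIDTH_MB_S
    print(f"per disk {per_disk:.2f} MB/s, four disks {total:.2f} MB/s")
    print(f"share of PCI bandwidth: {share:.1f}%")
```

Under assumptions like these, small random requests leave the bus
almost idle, which fits the remark that a second SCSI card per host
buys little for this kind of load.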

In FreeBSD 3.0 (beta), the CAM layer fixes a vexing 2940UW hardware-FIFO
bug; uses DMA, avoids FIFO.  3.0 queues more SCBs (SCSI Control Blocks) 
per host adapter: uses 16-20 per host adapter (real-world conditions), 
instead of maximum of 4 in 2.2.x.  Elevator algorithm was helpful for 
a long time (historically), but modern SCSI disks don't necessarily 
store sectors in rotation order, so it's no longer useful.  SCBs 
therefore are of help in that area.
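
[Illustration, not from the talk:  a minimal sketch of the classic
elevator idea, to show what the host used to do on the drive's
behalf.  It is not FreeBSD's implementation; the function names and
the sample request queue are made up for the example.  It assumes
that block numbers reflect physical position, which is exactly the
assumption that no longer holds on modern SCSI disks.]

```python
# One-sweep elevator ordering of pending disk requests (SCAN-style sketch).


def elevator_order(pending, head, ascending=True):
    """Serve requests in the sweep direction first, then the rest on the way back."""
    if ascending:
        first = sorted(b for b in pending if b >= head)
        second = sorted((b for b in pending if b < head), reverse=True)
    else:
        first = sorted((b for b in pending if b <= head), reverse=True)
        second = sorted(b for b in pending if b > head)
    return first + second


def total_head_travel(order, head):
    """Sum of block-number distances the head moves serving requests in this order."""
    travel = 0
    for block in order:
        travel += abs(block - head)
        head = block
    return travel


if __name__ == "__main__":
    queue = [98, 183, 37, 122, 14, 124, 65, 67]  # hypothetical request queue
    start = 53
    print("arrival order travel: ", total_head_travel(queue, start))
    print("elevator order travel:", total_head_travel(elevator_order(queue, start), start))
```

With tagged queueing, the host instead hands the drive several
outstanding commands and lets the drive pick the order, since only
the drive knows where the sectors really are.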

Likes Adaptec, Symbios/NCR, Mylex/Buslogic SCSI hosts, and almost any
10/100 ethernet: DEC, Intel ethernet chipsets.  Uses PCI only (avoid
VLB & ISA like the plague, EISA mostly obsolete), ECC SIMMs.  Uses
_some_ new Pentium IIs, finds CPU performance unimpressive, because
of slow L2 cache, but at least they have the advantage of holding more
RAM and having a greater number of sockets, so that you can have more
RAM on a system without having to use highest-density SIMMs/DIMMs.

Doesn't use NFS, no common file store, no RAID; one 100Base-T network.
Failures are rare, and only the occasional disk failures are serious:
In 1-2 cases, they required restoring from backup.  Ethernet goes to
a Cisco Catalyst switch.  FreeBSD has somewhat higher CPU overhead 
than NetBSD.  Having a 100Base-T backbone helps protect against 
denial-of-service attacks such as smurfing, which occurs a couple 
of times per week.

100Base-T also helps with tape backup.  The tape machine is the
only machine having two SCSI cards, in order that the tape chain be 
separate from the disk one.  This is also one of only two RAID 0
machines.  He's using Diablo (his own feed-only news server --
still in late beta) for netnews feed, on an SMP dual-processor
machine.  Using three 18-GB disk drives in striped configuration,
with soft updates enabled on the filesystem.  This is a test machine
for (among other things) FreeBSD beta, so it's running all possible
experimental FreeBSD code.  System configuration/tuning is similar
to NetBSD, but has a totally different virtual memory system (which
he detailed).
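
[Illustration, not from the talk:  how a striped (RAID 0) set of
three drives might map logical blocks onto disks.  The stripe size
of 64 blocks and the function name are assumptions for the example;
the notes do not say which striping driver or parameters Best uses.]

```python
# Generic RAID 0 address mapping: logical block -> (disk, block on that disk).


def stripe_location(logical_block, n_disks=3, stripe_blocks=64):
    """Return (disk index, block offset on that disk) for a logical block."""
    stripe_number, within_stripe = divmod(logical_block, stripe_blocks)
    disk = stripe_number % n_disks   # consecutive stripes rotate across the disks
    row = stripe_number // n_disks   # how many full rows of stripes precede this one
    return disk, row * stripe_blocks + within_stripe


if __name__ == "__main__":
    for block in (0, 63, 64, 128, 192, 200):
        print(block, "->", stripe_location(block))
```

Consecutive stripes land on different spindles, so large sequential
transfers and independent requests are spread over all three
drives.  There is no redundancy:  losing one drive loses the set.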


4. Jim Dennis (re: Linux -- "Linux Answer Guy" columnist in Linux 
Gazette):  Has used Linux since late 1991.  [RM adds:  Linus Torvalds
put out the first Linux kernels for public ftp in autumn 1991.]  He
used Coherent before that.  [RM adds:  Yet another small Unix-like
OS, a low-cost proprietary offering from the Mark Williams Company,
now defunct.]  Before that, worked at Quarterdeck supporting 
DesqView on DOS.

Linux evolves rapidly, but it's not necessary to evolve with it.
(Anecdotes about machines running older Linux builds, with very
long uptimes.)  Typical uptime on his personal system is 3-4 months,
until he wants to change something fundamental, usually compiling a
new kernel.  Production systems, by contrast, have longer continuous
uptimes (but his point is that it's reliable enough that uptime 
measurements aren't significant).  Makes reference to the High 
Availability HOWTO document at the Linux Documentation Project (LDP):
http://sunsite.unc.edu/pub/Linux/ALPHA/linux-ha/High-Availability-HOWTO.html

Describes "Beowulf" clustering -- and stresses that its methods should
run on any *ix kernel, although it's most often been implemented on
Linux.  Moving to more down-to-earth concerns:  system monitoring,
capacity planning, alerts:  Round-robin DNS and traditional redundancies
via MX records, NIS master/slave setups, etc., will help in those areas.
Some new protocols, such as the Coda distributed filesystem, will help
in the future.  (Hopes that Coda will replace NFS.)
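
[Illustration, not from the talk:  the effect of round-robin DNS,
modelled as a toy in Python.  This is not a DNS server; the class
name is made up, and the addresses come from the 192.0.2.0/24
documentation range.  It only shows how rotating the order of the
address records spreads clients that take the first answer.]

```python
from collections import deque


class RoundRobinRecords:
    """Toy model of a name server rotating its address records per query."""

    def __init__(self, addresses):
        self._addresses = deque(addresses)

    def answer(self):
        """Return the current record order, then rotate for the next query."""
        current = list(self._addresses)
        self._addresses.rotate(-1)  # move the first address to the end
        return current


if __name__ == "__main__":
    www = RoundRobinRecords(["192.0.2.1", "192.0.2.2", "192.0.2.3"])
    for _ in range(4):
        print(www.answer()[0])  # address a "first answer" client would use
```

Note that this spreads load but does not detect a dead host by
itself, which is why it sits alongside monitoring and alerts in
the list above.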

High-availability configurations are somewhat exotic, and Beowulf 
clustering does not benefit most common business applications
and services:  Calculation-intensive applications are rare -- 
astronomy and particle-physics simulations, rather than Web servers.

Linux 2.1.x development kernels have been underway for a very long time:
1.5 years.  (Described kernel-build process.)

Hardware requirements:  Will run on anything:  386 w/16 MB RAM suffices,
e.g., for router.  One fellow built a system with 2.1 kernel on 4 MB RAM.
Linux can also be run from a single floppy without hard drive:  References
Tom Oehser's "Tom's Root/Boot" floppy, and the Linux Router Project.

Any old PC piece o'junk will run Linux; for driver support on new 
(recently introduced) hardware, check the Red Hat or LDP hardware
compatibility lists.  Can get Linux preinstalled from many firms in
the Valley and elsewhere (PromoX, VA Research...).  Likes DEC Tulip
ethernet chipset, especially on NetGear cards.  Likes ASUS SMP
motherboards, Mylex/Buslogic BT-958 SCSI host adapters, watchdog
timer hardware (see below).

Recommends using just one SCSI host adapter, except for tape drives.
Likes new LM78 chipset, which monitors fan speed, voltages, and 
temperatures inside the case.  The WDT500-P and WDT501-P watchdog
timer interface hardware from Industrial Computer Source (San Diego)
is supported in the kernel (which provides /dev/watchdog), and 
can reboot or shut down the machine if it overheats, has voltage
problems, or fails other health checks.
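
[Illustration, not from the talk:  a minimal sketch of a user-space
keepalive loop for /dev/watchdog.  Writing to the device is what
resets the hardware timer; the function name, the health-check
callback, and the interval are assumptions for the example.  What
happens when the device is closed depends on the driver and kernel
configuration, so the sketch leaves the device open when it stops.]

```python
import time


def keepalive_loop(dev, healthy, interval_s=10.0, max_beats=None, sleep=time.sleep):
    """Write a keepalive byte to `dev` for as long as healthy() returns True.

    Returns the number of keepalive writes made.  When a health check fails,
    the loop simply stops writing and leaves `dev` open, so the hardware
    timer is left to expire and reset the machine.
    """
    beats = 0
    while max_beats is None or beats < max_beats:
        if not healthy():
            return beats
        dev.write(b"\0")  # any write resets the watchdog countdown
        dev.flush()
        beats += 1
        sleep(interval_s)
    return beats


# Hypothetical use on a machine with a watchdog driver loaded:
#   dev = open("/dev/watchdog", "wb", buffering=0)
#   keepalive_loop(dev, healthy=my_health_check)   # my_health_check is yours to write
```

The interval must be shorter than the card's timeout, and the health
check should be cheap enough that it cannot itself stall the loop.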

Linux gotchas:  There are about a dozen distributions, at any given
time.  (Names a few.)  They differ in little persnickety details.  
Recommends picking one, installing it once (for each machine-role 
profile, e.g., Web, ftp, router, fileserver, and workstation), using 
the completed installation as a template to crank out others.

PC BIOS and hardware constraints are frustrating.  Multi-OS boot
setups are the source of a high percentage of the questions he's
asked.  Packages aren't as well integrated as in FreeBSD.  Performance
tuning:  the "noatime" mount option on filesystems is claimed to
double disk throughput (it may not quite double it, but it offers a
very noticeable speed-up).  Linux's native ext2 filesystem is very 
fast, even without that.  Linux supports a number of other filesystems,
and experimental ones are being developed for special purposes.
Recommends running ntpdate at startup and xntpd during operation for 
extremely accurate clock synchronisation.

Recommends checking "freshmeat.net" frequently, since a dozen or more
new or updated packages are posted there every day -- regardless of
which OS you use, since most code posted there is portable.  To be 
added to Linux, soon:  journaling filesystem, ACLs (much else that 
I couldn't copy down).


5.  Bob Palowoda (Solaris Performance Expert at Sun Microsystems, 
re: Solaris x86):  As part of his job, tests Xeon, Merced (which 
he can't talk about), Intel BX motherboards on Solaris vs. other 
Unixes.  Says that x86 hardware now rivals the SPARC systems for 
performance.  Classes "small" servers as dual Pentium Pros, "medium" 
as dual PII/400, "high-end" as quad-Xeon systems, e.g., NCR, Siemens, 
Fujitsu.  NCR has fastest spec Web server in the world, surpassing 
even a large IBM box in tests.  Solaris has good, fine-grained 
SMP, but there is a shortage of programmers able to take advantage 
of it.  Compares Solaris against *BSD and Linux on performance 
frequently as part of his job.

[Presents a series of illegible overhead foils with tiny type, 
purporting to show that Solaris's internal memory transfer rates,
local transfer rates, some other measures are faster than the
competition's.]

Disk drive tuning:  Transfer rates on UFS using Seagate Cheetahs,
15 MB/sec.  Using IBMs, 12 MB/sec.  Compares against Linux on 
pthread creation time, shows Linux to be much slower.  Linux and
FreeBSD do not yet have a well-developed SMP system.  Solaris
costs now $20 for personal use, which he personally resents, since
he doesn't think it appropriate to do that with the developers'
work.

Tuning parameters:  Solaris dynamically tunes a lot of them. 
(Details many others.)  Recommends Mylex host adapters w/3-5
channels and on-board cache RAM.  I2O bus is now supported on
Solaris 2.7.  Veritas journaling filesystem has just been ported,
and is very fast, but has been observed to flake out a few times.
(Logging is now supported on UFS.)

Describes Solaris for ISP bundle, with sundry packages including
HighWind news server, SSL, Java-based administration tools, LDAP,
both Andrew2 and WU IMAP servers, GSSAPI authentication, which
can deal both with real Kerberos and Microsoft-gimmicked Kerberos.


Joint Q&A session:

What is Best's failure rate on the x86 FreeBSD boxes?  Only 
occasional drive failures.  Ethernet cards, pre-ECC RAM occasionally.
A few power supplies, one fan, little things.  Everything except
disk drives can be recovered from essentially instantly.  (Modular
rack-mount design.)  Cheap PC parts allow keeping spares of everything
around -- not quite as reliable as SPARC.

Which filesystems have 64-bit support?  Solaris UFS has it, can be
up to 8 terabytes.  Jason reports similar results on NetBSD.  Matt:
fsck can take a long time on UFS, performance w/soft updates is
competitive and fixes the fsck problem.  Jim:  Linux can handle
very large filesystems on 64-bit CPU systems, e.g., DEC Alpha.
On 32-bit systems, maximum per volume is 2 GB.

When will there be serial console for Linux?  Jim:  Already available
as patches for 2 years; built-in code in the 2.1.x development kernels.
(Solaris has it.)  A company in Canada, Canada Connect, is developing
add-on hardware to do _full_ serial-console, including BIOS Setup
access, e.g., Adaptec Ctrl-A.  Product is called "PC Weasel 2000" (?).

On Solaris, any chance of retrofitting support for new hardware into
version 2.5?  No, and 2.5 will soon be totally unsupported.

Each of the panelists represents a different approach to source-code
licencing.  Could each of you describe the advantages of your 
licence arrangements, and explain why you feel it's the right 
path for the future?  [The moderator whapped this person with a 
clue stick, and disallowed the question.]

Question about rack-mounting, which I didn't quite manage to 
transcribe:  Matt stressed that rack-mounting allows reliable,
short ultra-wide SCSI cabling, and good cooling for the disk
drives, both of which he considers key for reliable operation.

Is there an option to do remote machine builds?  Matt:  At Best,
we install from a template machine using an NFS-supporting boot 
floppy.  Jim:  Linux boot floppies can do NFS-client/bootp without 
modification.  (Can also do remote tape/other storage access via
rsh and tar or cpio.)

How do the various OSes handle Y2K?  Sun has a certification 
program.  NetBSD 1.2 (latest) has been somewhat tested, no
reports of new problems since then.  No known problems in 1.3.2
beta.  Linux:  As usual, it depends.  Pieces come from all manner
of origins.  All the GNU utilities have been fixed.  The kernel
doesn't have a problem.  Everything else is app-dependent.
Matt:  (Describes the fact that machines run ultra-accurate NTP,
for some reason.)  Seems to say that there are no known problems.
BSDI is eyeballing problem spots, and is now certifying BSD OS
for insurance purposes.  Real-time clocks in PC hardware are 
often unfixably defective for Y2K purposes.  

(Discussion about UPSes and recovery from extended power failures.)


This was a very long meeting, running to about 10:30 pm.  About
sixteen of us then adjourned to the Peppermill restaurant.