Subject: Re: Why is Samba so much slower on NetBSD than FreeBSD?
To: None <current-users@netbsd.org, rhialto@polder.ubc.kun.nl>
From: Olaf Seibert <rhialto@polder.ubc.kun.nl>
List: current-users
Date: 10/28/1998 14:50:00
In <3630D412.3FD9227E@bk.bosch.de> Guenther Grau <Guenther.Grau@bk.bosch.de> writes:

>unfortunately I forgot who actually wrote this report,
>but I hope he's reading this list. Have you got
>enough disk space to install a minimum FreeBSD installation
>on the slow responding system? If so, I'd suggest you
>try to install FreeBSD on it and see how it performs.
>
>This comparison should be able to tell if NetBSD
>or the hardware is slow with samba. You should 
>try to make the situation as comparable as possible,
>i.e. running the same programs, having the same
>amount of users logged in (or try to simulate
>the situation) while trying to access samba.

I ran a related but different test: I installed NetBSD on the fastest
machine (the one that already had FreeBSD on it)[1] and took some
measurements. The result is that NetBSD 1.3.2 seems to be some 15 to 20%
slower on the same hardware.

Because we needed a FreeBSD machine to do some compilation on, I
installed it (FreeBSD 2.2.7 again) on another slow machine: CPU: AMD
Am5x86 Write-Back (486-class CPU), "pentium-75 rating" with 16M ram. The
performance of this is really horrible: 20 K/s at best (using the same
test file). I fear NetBSD wouldn't do any better here.

This machine has an old 545 M IDE disk, and yet again a 3Com 3c509 card.
I tested the transmit fifo underrun theory by adding a printf in an
appropriate location, and it got triggered exactly once. On the
production NetBSD 1.2.1 machine (polder), I turned on IFF_DEBUG on ep0,
which should already print out TX and RX over- and underruns. So far,
there are no reports.

The 3c509 driver sources are split over several files differently in
NetBSD and FreeBSD, but they show an obvious common heritage. The
function I was most interested in, epstart, differs a bit between the
two in what it does after pushing a packet into the fifo, though.

All my tests have indicated that the speed difference is caused by
differences on the samba server side, where CPU speed counts heavily for
*BSD, and a lot less for WNT.

When I did a ktrace on the samba daemon on the new slow machine, I got
repeated patterns like this:

(kdump -R -m 80)

   204 smbd     0.000269 CALL  select(0xff,0xefbfdd40,0,0,0xefbfdd14)
   204 smbd     1.396783 RET   select 1
   204 smbd     0.000271 CALL  read(0x5,0x7f001,0x4)
   204 smbd     0.000160 GIO   fd 5 read 4 bytes
       "\0\0\0003"
   204 smbd     0.000096 RET   read 4
   204 smbd     0.000099 CALL  read(0x5,0x7f005,0x33)
   204 smbd     0.000117 GIO   fd 5 read 51 bytes
       "\M^?SMB\^Z\0\0\0\0\b\^A\0\0\0\0\0\0\0\0\0\0\0\0\0001\0\M-S\^Wd\0007\^X\
        \bR\0\M-}\M^?\^B\0\M^?\M^?\0\0\M^?\M^?\M^?\M^?\0\0\0\0"
   204 smbd     0.000095 RET   read 51/0x33
   204 smbd     0.000253 CALL  gettimeofday(0xefbfdd28,0)
   204 smbd     0.000131 RET   gettimeofday 0
   204 smbd     0.000218 CALL  lseek(0x7,0,0x2fffd,0,0)
   204 smbd     0.000112 RET   lseek 196605/0x2fffd
   204 smbd     0.000086 CALL  read(0x7,0x90005,0xffff)
   204 smbd     0.007170 GIO   fd 7 read 65535 bytes
       "\0h1\M-s\0\0\M^MM\M-p\M-h\M-c\M-_\^O\0\M^E\M^?u\^E\M-?\M^V\0\0\0j\^Bj\
        \^Aj!\M^KE\M-pj\0\M^KNxj\0Wj\^BP\M-h\^S\M-@\v\0h2\M-s\0\0\M^MM\M-p\M-h\
        \M-4\M-_\^O\0\M^C}\M-l\0\M^KE\M-lu\^E\M-8}\0\0\0j\^Bj\^Aj"
   204 smbd     0.030202 RET   read 65535/0xffff
   204 smbd     0.003981 CALL  write(0x5,0x90001,0x10003)
   204 smbd     0.080134 GIO   fd 5 wrote 65539 bytes
       "\0\0\M^?\M^?\0h1\M-s\0\0\M^MM\M-p\M-h\M-c\M-_\^O\0\M^E\M^?u\^E\M-?\M^V\
        \0\0\0j\^Bj\^Aj!\M^KE\M-pj\0\M^KNxj\0Wj\^BP\M-h\^S\M-@\v\0h2\M-s\0\0\
        \M^MM\M-p\M-h\M-4\M-_\^O\0\M^C}\M-l\0\M^KE\M-lu\^E\M-8}\0\0\0j"
   204 smbd     0.036707 RET   write 65539/0x10003

(this was on the FreeBSD machine, but at this point I'm thinking that
whatever slowness there is, is mostly in common between Net and FreeBSD.)
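
The per-call times in a dump like this are easier to compare when
totalled. Below is a rough Python sketch (not part of Samba or ktrace)
that sums the kdump -R deltas per syscall. It assumes that each -R
timestamp is the time elapsed since the previous record, so everything
between a CALL and its RET is charged to that syscall and the delta on a
CALL line is charged to userland; it also assumes a single traced
process. The names RECORD, SAMPLE and time_per_syscall are made up for
illustration; SAMPLE holds the records quoted above with the GIO payload
lines left out.

```python
import re
from collections import defaultdict

# Matches records such as "   204 smbd     1.396783 RET   select 1".
RECORD = re.compile(r"^\s*(\d+)\s+(\S+)\s+(\d+\.\d+)\s+(CALL|RET|GIO)\s+(\w+)")

# Records from the trace above (GIO payload lines omitted).
SAMPLE = """\
   204 smbd     0.000269 CALL  select(0xff,0xefbfdd40,0,0,0xefbfdd14)
   204 smbd     1.396783 RET   select 1
   204 smbd     0.000271 CALL  read(0x5,0x7f001,0x4)
   204 smbd     0.000160 GIO   fd 5 read 4 bytes
   204 smbd     0.000096 RET   read 4
   204 smbd     0.000099 CALL  read(0x5,0x7f005,0x33)
   204 smbd     0.000117 GIO   fd 5 read 51 bytes
   204 smbd     0.000095 RET   read 51/0x33
   204 smbd     0.000253 CALL  gettimeofday(0xefbfdd28,0)
   204 smbd     0.000131 RET   gettimeofday 0
   204 smbd     0.000218 CALL  lseek(0x7,0,0x2fffd,0,0)
   204 smbd     0.000112 RET   lseek 196605/0x2fffd
   204 smbd     0.000086 CALL  read(0x7,0x90005,0xffff)
   204 smbd     0.007170 GIO   fd 7 read 65535 bytes
   204 smbd     0.030202 RET   read 65535/0xffff
   204 smbd     0.003981 CALL  write(0x5,0x90001,0x10003)
   204 smbd     0.080134 GIO   fd 5 wrote 65539 bytes
   204 smbd     0.036707 RET   write 65539/0x10003
"""


def time_per_syscall(lines):
    """Sum kdump -R deltas per syscall; time before a CALL goes to 'userland'."""
    totals = defaultdict(float)
    current = None  # syscall currently in progress, if any
    for line in lines:
        m = RECORD.match(line)
        if not m:
            continue  # skip GIO payload lines and anything unrecognised
        delta, kind, name = float(m.group(3)), m.group(4), m.group(5)
        if kind == "CALL":
            totals["userland"] += delta  # time spent before entering the call
            current = name
        else:
            # GIO and RET deltas belong to the call that is in progress.
            totals[current or "unknown"] += delta
            if kind == "RET":
                current = None
    return dict(totals)


totals = time_per_syscall(SAMPLE.splitlines())
cycle = sum(totals.values())
for name, secs in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(f"{name:12s} {secs:9.6f} s  {100 * secs / cycle:5.1f}%")
```

On this excerpt select comes out at about 1.40 s of a 1.56 s cycle, and
the reads and the write together at about 0.15 s.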

Note how long select(2) takes! I had already seen that with top. Somehow
smbd manages to wait for eternities on the network, which is NOT
consistent with the theory that all slowness is caused by smbd rather
than smbclient. The only consistent explanation I can see is that select
takes much longer than it needs to. Perhaps this is caused by the
unreasonably large first argument (0xff)? No: I changed it and noticed no
improvement.
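
One way to tell a slow select apart from data that simply arrives late
would be to time select against a peer whose delay is known. The
following is only a minimal Python sketch of that idea, not anything
taken from smbd: a socketpair stands in for the SMB connection, the
helper name measure_select_wait and the 0.2 s delay are hypothetical,
and the four bytes sent just mimic the length header seen in the trace.
If select returns shortly after the peer sends, the time spent in select
reflects the peer, not the call itself.

```python
import select
import socket
import threading
import time


def measure_select_wait(peer_delay):
    """Return how long select() blocked while the peer waited peer_delay seconds."""
    server, client = socket.socketpair()
    try:
        def slow_peer():
            # Stands in for a client that is slow to send its next request.
            time.sleep(peer_delay)
            client.sendall(b"\x00\x00\x00\x33")

        sender = threading.Thread(target=slow_peer)
        start = time.monotonic()
        sender.start()
        # Wait for the server end to become readable, with a 5 s safety timeout.
        readable, _, _ = select.select([server], [], [], 5.0)
        waited = time.monotonic() - start
        sender.join()
        if not readable:
            raise RuntimeError("select timed out before the peer sent anything")
        return waited
    finally:
        server.close()
        client.close()


waited = measure_select_wait(0.2)
print(f"select blocked for {waited:.3f} s with a 0.2 s peer delay")
```

Running the same measurement on both the fast and the slow machine, with
the same delay, would show whether select adds noticeable time of its own.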

I am getting more and more confused about all this.

>  Guenther

[1] The NetBSD install procedure is REALLY a disaster in this case.
There was endless confusion about partition ids, BIOS and fdisk
geometries, and disklabels. Somehow, during the many retried installation
attempts, the NetBSD disklabel must have overwritten the FreeBSD
disklabel. I thought both were supposed to be contained inside their own
fdisk partition, but apparently not. In effect, the FreeBSD installation
on that machine is now toasted, since I didn't have a hardcopy of its
disklabel.

-Olaf.
--
___ Olaf 'Rhialto' Seibert - rhialto@polder.ubc. ---- Unauthorized duplication,
\X/ .kun.nl ---- while sometimes necessary, is never as good as the real thing.