Subject: Re: Has importing the FreeBSD malloc(3) been considered?
To: None <tech-userlevel@netbsd.org>
From: R. C. Dowdeswell <elric@mabelode.imrryr.org>
List: tech-userlevel
Date: 02/05/1999 16:09:00
Well, I sent an email to the list with a few more stats from my
little look at different malloc(3)s, but since it included a large
attachment it was rejected.  The attachment was the program that
I wrote to run the tests, and it is not included this time.
I am still thinking about extending it a little, and will post a
URL at some point.  (Suffice it to say it is the same one that
I posted last time, with a getrusage() at the end to derive more
numbers, a build environment to make things more automatic,
and an awk script to format the results into tables.)

Another note is that I recompiled the entire system to use FreeBSD
malloc and things were progressing without problem for a while, but
there did arise one difficulty:  fsck_ffs was coring consistently
on one of my file systems, and when I recompiled it with NetBSD
malloc it stopped.  So something odd is going on there.

Also, on NetBSD-current/alpha the FreeBSD malloc(3)
beats ours on all of the tests, yet on NetBSD-1.3.1/i386 the
FreeBSD malloc(3) is a bit slower until memory is over-committed.
The same holds on the shark.  (Although the shark basically died when
it ran out of memory, saying that there weren't enough slots in
the process map, or something.  I need to rebuild the world over
there and see if that still happens.  In fact, looking at the numbers
from the shark, they are probably completely invalid.)

But, without much further ado, here is the (slightly rewritten)
previous email.

After learning of the existence of dlmalloc and dmalloc (thanks to
Simon Burge), I decided to revamp the test proggie a little and
throw at least one of them into the fray.  I chose dlmalloc, since
I am testing performance, and not debugging.

I also looked into Simon's suggestion of using getrusage() to print
some more detailed information.  I did notice that all of the
memory-related fields were left zero, however.  I probably missed something
on that one.  Anyway, the test program now reports real time,
user time and system time, along with page reclaims, page faults,
and voluntary context switches.  I think that this makes it a lot
more useful.  (Next I might look into actually trying to simulate
something that is a bit closer to "real world" use of malloc(3).)

I think that perhaps there is a little tweaking that could be done
to dl, since it was clearly behind both FreeBSD's malloc(3) and
GNU's malloc(3) on everything that I measured.  (Although, I'll
admit that I didn't measure memory usage, yet.)  And since it has
a pretty good reputation, this is odd.

One note, though: I saw a web page that compared a bunch of
malloc(3)s, and I noticed that it was completely preoccupied
with user time and system time.  As you will undoubtedly notice
below, even though none of these tests were competing against
any other apps:
 user time + system time < real time
And once paging is involved, user time + system time doesn't even
come close to real time.

If anyone wants to play with the program that I wrote, I'll send
you a copy (I'll be cleaning it up a little, too), and I'd definitely
appreciate hearing any comments that you have or results that you
might want to provide.

Here are the results that I am now seeing (trying more
malloc(3)s, on more machines).  The N/A's are "I
didn't want to wait for a test that would obviously take forever...".
"standard" is a synonym for libc, i.e. no replacement has taken place.
(On the alpha I have completely replaced it, so I used vi to change
the ./standard to ./FreeBSD.)

[ before you interpret the numbers along the top row of each table,
  please refer to the program attached to my previous mail to this
  list as they are not strictly kilobytes or megabytes or pages.
  They are "number of malloc(3)'ed regions of various sizes". ]

On an AlphaStation 200 4/233 w/
 64MB RAM
 2 Fast Narrow Barracudas on the default NCR striped as swap
 in single user mode
 running -current (a few days old)

Real time    :      4096       8192      12288      14336      16384
./dl                3.78      10.84      39.19     200.20     249.51
./FreeBSD           2.63       5.72      19.10      70.10      90.30
./GNU               0.96       3.62      12.25      45.38     174.29
./NetBSD          205.94     ENOMEM        N/A        N/A        N/A

User time    :      4096       8192      12288      14336      16384
./dl                0.26       0.49       1.11       1.54       1.80
./FreeBSD           0.31       0.62       1.22       1.79       1.94
./GNU               0.19       0.42       0.91       1.17       1.76
./NetBSD            0.56     ENOMEM        N/A        N/A        N/A

Sys time     :      4096       8192      12288      14336      16384
./dl                3.54      10.37      23.75      38.22      48.51
./FreeBSD           2.34       5.13      13.29      18.95      28.03
./GNU               0.79       3.22       9.57      15.31      22.65
./NetBSD           19.17     ENOMEM        N/A        N/A        N/A

Page reclaims:      4096       8192      12288      14336      16384
./dl            17905.00   35755.00   58763.00   62980.00   71972.50
./FreeBSD        7431.00   10129.00   19252.50   18711.50   24716.00
./GNU            2128.00    4213.00   11527.50   14751.50    9918.50
./NetBSD        12596.50     ENOMEM        N/A        N/A        N/A

Page faults  :      4096       8192      12288      14336      16384
./dl                0.00       0.00    2851.50   32119.50   37762.50
./FreeBSD           0.00       0.00     923.50   10772.50   12468.50
./GNU               0.00       0.00     300.00    5343.00   26242.00
./NetBSD        24412.00     ENOMEM        N/A        N/A        N/A

Vol ctx sw   :      4096       8192      12288      14336      16384
./dl                0.00       0.00    2831.00   32070.50   37712.00
./FreeBSD           1.00       0.00     919.50   10747.50   12437.00
./GNU               0.00       0.00     280.50    5313.00   26199.00
./NetBSD        24854.00     ENOMEM        N/A        N/A        N/A

On an i386 (a 233MHz K6) w/
  64MB RAM
  1 Ultra Wide Barracuda as swap
  in single user mode
  running NetBSD-1.3.1

Real time    :       512   1024   2048   4096      8192     16384
./FreeBSD           0.31   0.98   3.27  11.80     48.39    383.06
./standard          0.17   0.49   1.58   5.85    529.93       N/A
./GNU               0.17   0.20   0.62   2.08      7.79    496.51

User time    :       512   1024   2048   4096      8192     16384
./FreeBSD           0.04   0.05   0.10   0.24      0.60      1.37
./standard          0.02   0.04   0.03   0.11      0.34       N/A
./GNU               0.06   0.02   0.07   0.17      0.27      0.76

Sys time     :       512   1024   2048   4096      8192     16384
./FreeBSD           0.28   0.93   3.17  11.57     47.79     72.74
./standard          0.15   0.45   1.56   5.75     23.97       N/A
./GNU               0.12   0.19   0.56   1.92      7.53     27.29

Page reclaims:       512   1024   2048   4096      8192     16384
./FreeBSD           0.00   0.00   0.00   0.00      0.00      0.00
./standard          0.00   0.00   0.00   0.00      0.00       N/A
./GNU               0.00   0.00   0.00   0.00      0.00      0.00

Page faults  :       512   1024   2048   4096      8192     16384
./FreeBSD           8.00  11.00  13.00  22.00     52.00  15603.00
./standard          8.00   8.00   8.00   8.00  31618.00       N/A
./GNU               8.00   8.00   8.00   8.00      8.00  41813.00

Vol ctx sw   :       512   1024   2048   4096      8192     16384
./FreeBSD           5.00   6.00   5.00   5.00      5.00 798740.00
./standard          5.00   4.00   5.00   5.00 792082.00       N/A
./GNU               5.00   4.00   5.00   4.00      5.00  90716.00

On a shark (233MHz StrongARM) w/
 32MB RAM
 NFS over 10BaseT as swap (swap might've been broken, though)
 in single user mode
 running NetBSD 1.3H

Real time    :       512       1024       2048       4096
./dl                0.00       1.00       2.00       7.00
./standard          0.25       0.77       3.69     ENOMEM
./GNU               0.25       0.36       0.99       4.24
./FreeBSD           0.41       0.96       2.41       8.12

User time    :       512       1024       2048       4096
./dl                0.06       0.17       0.39       0.62
./standard          0.00       0.08       0.06     ENOMEM
./GNU               0.05       0.05       0.12       0.28
./FreeBSD           0.05       0.17       0.28       0.77

Sys time     :       512       1024       2048       4096
./dl                0.57       1.14       2.47       6.95
./standard          0.28       0.72       3.12     ENOMEM
./GNU               0.23       0.34       0.90       3.98
./FreeBSD           0.38       0.81       2.15       7.37

Page reclaims:       512       1024       2048       4096
./dl             4627.00    9174.00   18281.00   36483.00
./standard       1227.00    2397.00    4915.00     ENOMEM
./GNU            1122.00    1088.00    2119.00    4186.00
./FreeBSD        1845.00    3703.00    7416.00   14850.00

Page faults  :       512       1024       2048       4096
./dl                0.00       0.00       0.00       0.00
./standard          0.00       0.00       0.00     ENOMEM
./GNU               0.00       0.00       0.00       0.00
./FreeBSD           0.00       0.00       0.00       0.00

Vol ctx sw   :       512       1024       2048       4096
./dl               11.00      11.00      11.00      10.00
./standard         11.00      12.00      11.00     ENOMEM
./GNU              13.00      13.00      12.00      15.00
./FreeBSD          11.00      11.00      11.00      10.00


 == Roland Dowdeswell
 == http://www.imrryr.org/~elric/