Re: vm.bufcache and git

To: netbsd-users%netbsd.org@localhost
Subject: Re: vm.bufcache and git
From: Greg Troxel <gdt%ir.bbn.com@localhost>
Date: Wed, 26 May 2010 13:42:13 -0400

Increasing kern.maxvnodes does indeed help; obviously that's about not
having to recycle vnodes on every stat.  Since the basic issue is that
stat latency is greater across 180K files than across a few, normal
microbenchmarks don't help.

I wrote a short program to run stat on many files, and to loop on
subsets.  It outputs data in xplot format, plotting stat latency in us
vs log2 of number of files.  To use it, create FILES containing one
filename per line (I used "find . -type f | egrep -v /.git", and yes I
know I should have quoted the . ...) and then

  statbench < FILES

On a system with big enough bufcache and maxvnodes, I found stat to take
about 4 us up to about 32 files, and then it starts to grow, reaching
9-10 us around 4096-8192 files, and then it remains relatively constant.

Attachment: stat.png
Description: PNG image
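
As a cross-check on the small-file-count end of that curve, here is a
minimal sketch that times repeated stat calls on a single path.  It is
not part of statbench.c: the name stat_latency_us is made up for
illustration, and it swaps gettimeofday for clock_gettime with
CLOCK_MONOTONIC on the assumption that a clock that cannot be stepped
is preferable for short intervals.  Unlike the listing below, it also
refuses to report a number if any stat call fails.

```c
#define _POSIX_C_SOURCE 200809L  /* for clock_gettime under strict -std= modes */

#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>
#include <time.h>

/*
 * Hypothetical helper, not part of statbench.c: average latency in
 * microseconds of `iters` stat() calls on one path, timed with the
 * monotonic clock.  Returns -1.0 if iters is not positive or if any
 * stat() call fails, so a bad path is never silently measured.
 */
double
stat_latency_us(const char *path, int iters)
{
  struct timespec before, after;
  struct stat sb;
  double elapsed_us;
  int i;

  assert(path != NULL);
  if (iters <= 0)
    return -1.0;

  clock_gettime(CLOCK_MONOTONIC, &before);
  for (i = 0; i < iters; i++)
    if (stat(path, &sb) != 0)
      return -1.0;
  clock_gettime(CLOCK_MONOTONIC, &after);

  /* Multiplying by 1e6 keeps the arithmetic in floating point. */
  elapsed_us = (after.tv_sec - before.tv_sec) * 1e6
             + (after.tv_nsec - before.tv_nsec) / 1e3;
  return elapsed_us / iters;
}
```

Calling stat_latency_us(".", 1000000) should give a figure comparable
to the single-file point on the plot; a negative result means the path
could not be stat'ed at all.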

/* $Id: statbench.c,v 1.3 2010/05/26 17:18:00 gdt Exp $ */

#include <assert.h>
#include <math.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/time.h>

#define MAXFILENAMES 200000

const char *filenames[MAXFILENAMES];
int nfilenames = 0;

void
do_run(int nstats)
{
  struct timeval before, after, delta;
  struct stat sb;
  int i, niters, n;
  double us_per_stat;

  assert(nstats <= nfilenames);

  gettimeofday(&before, NULL);
  niters = 1000000 / nstats;
  for (n = 0; n < niters; n++)
    for (i = 0; i < nstats; i++) {
      stat(filenames[i], &sb);
    }
  gettimeofday(&after, NULL);

  timersub(&after, &before, &delta);

  us_per_stat = delta.tv_sec * 1e6 + delta.tv_usec; /* 1e6 forces FP. */
  us_per_stat /= niters * nstats;

  printf("box %f %f\n", log2(nstats), us_per_stat);
  printf("; nstats %d  iterations %d total %d\n",
         nstats, niters, niters * nstats);
}

int
main(void)
{
  char input[1024];
  int nstats;

  /* Read all input, and just assert if too much or not enough. */
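  while (fgets(input, sizeof(input), stdin) != NULL) {
    assert(nfilenames < MAXFILENAMES - 1);
    /* fgets keeps the trailing newline; stat would fail on such a name. */
    input[strcspn(input, "\n")] = '\0';
    filenames[nfilenames++] = strdup(input);
  }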
  assert(nfilenames >= 1);

  printf("double double\n");
  printf("title\nstat latency (us) vs log2 of number of files\n");
  printf("invisible 0 0\n");

  /*
   * nstats equal to nfilenames would be ok, but we use < to avoid
   * running with nfilenames twice in the case of nfilenames being
   * exactly a power of 2.
   */
  for (nstats = 1; nstats < nfilenames; nstats *= 2)
    do_run(nstats);
  do_run(nfilenames);
}
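
To make the run schedule concrete, here is a small sketch of which
subset sizes that final loop hands to do_run, and how many passes each
size gets.  The helpers run_sizes and iterations_for are hypothetical
names introduced only for illustration; they mirror the loop and the
niters computation above and assume nfilenames is small enough that
doubling nstats cannot overflow an int.

```c
#include <assert.h>

/*
 * Hypothetical helper: fill sizes[] with the subset sizes the main
 * loop passes to do_run() for a given nfilenames, in order, and
 * return how many there are.
 */
int
run_sizes(int nfilenames, int *sizes, int maxsizes)
{
  int nstats, count = 0;

  assert(nfilenames >= 1);

  /* Powers of two strictly below nfilenames... */
  for (nstats = 1; nstats < nfilenames; nstats *= 2) {
    assert(count < maxsizes);
    sizes[count++] = nstats;
  }

  /* ...followed by exactly one run over the full list. */
  assert(count < maxsizes);
  sizes[count++] = nfilenames;
  return count;
}

/*
 * Hypothetical helper: passes over the subset for a given size, as in
 * do_run(), so each run issues roughly one million stat calls.
 */
int
iterations_for(int nstats)
{
  return 1000000 / nstats;
}
```

With 180000 files this gives the 18 powers of two from 1 through
131072 plus one run over the whole list, and that last run makes only
5 passes.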
