Subject: Bittorrent package and 'broken libc' claim
To: None <tech-userlevel@netbsd.org>
From: Greg Troxel <gdt@NetBSD.org>
List: tech-userlevel
Date: 07/27/2005 20:29:43
pkgsrc/net/bittorrent 4.0.2, on NetBSD/i386 2.99.15, has abysmal
performance on startup when it computes SHA-1 hashes over "pieces"
that are already in the local filesystem: it takes on the order of
15 minutes to process 400 MB (in my application, basically ISO image
release contents with pkgs).

btdownloadheadless.py has an option:

--enable_bad_libc_workaround <arg>
          enable workaround for a bug in BSD libc that makes file
          reads very slow. (defaults to 0)

Without this option, files are opened like this:

	self.handles[filename] = file(filename, 'rb', 0)

where the Python manual says:

   The optional bufsize argument specifies the file's desired buffer
   size: 0 means unbuffered, 1 means line buffered, any other positive
   value means use a buffer of (approximately) that size. A negative
   bufsize means to use the system default, which is usually line
   buffered for tty devices and fully buffered for other files. If
   omitted, the system default is used.

I believe the application then issues reads on this handle, probably
for more than 1 byte at a time.

Enabling the "workaround" redefines file:

def bad_libc_workaround():
    global file
    def file(name, mode = 'r', buffering = None):
        return open(name, mode)

so that it calls the (equivalent) open function without a buffering
argument, which implies system default buffering.  Without the
workaround, ktrace shows single-character read system calls, just as I
would expect:

 27867 python2.3 CALL  read(6,0xbda52b83,1)
 27867 python2.3 GIO   fd 6 read 1 bytes
       "Q"
 27867 python2.3 RET   read 1

With the "workaround" enabled, performance is reasonable, and ktrace
shows:

  6633 python2.3 CALL  read(0x22,0x8359000,0x2000)
  6633 python2.3 GIO   fd 34 read 4088 bytes
  6633 python2.3 GIO   fd 34 read 4088 bytes
  6633 python2.3 GIO   fd 34 read 16 bytes
  6633 python2.3 RET   read 8192/0x2000

The way the 8192-byte read shows up as GIO records of 4088, 4088 and
16 bytes is mysterious, but it is not a huge performance problem.

It seems to me that our libc is just fine: asking stdio for
unbuffered I/O (which means bufsize 1, it seems) results in exactly
that behavior.
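
To illustrate why a 1-byte request size is so costly, here is a
minimal sketch in modern Python 3 (not the Python 2.3 stdio path
traced above).  CountingRaw and raw_calls_for are hypothetical names
invented for this example; the in-memory stream merely stands in for a
file descriptor, and its call counter stands in for the number of read
system calls.

```python
import io


class CountingRaw(io.RawIOBase):
    """Hypothetical in-memory raw stream that counts low-level read requests."""

    def __init__(self, data):
        super().__init__()
        self._data = data
        self._pos = 0
        self.calls = 0  # stands in for the number of read() system calls

    def readable(self):
        return True

    def readinto(self, buf):
        # Every request that reaches the raw layer is counted once.
        self.calls += 1
        chunk = self._data[self._pos:self._pos + len(buf)]
        buf[:len(chunk)] = chunk
        self._pos += len(chunk)
        return len(chunk)


def raw_calls_for(stream, raw, request_size, total):
    """Read `total` bytes from `stream` in `request_size` requests.

    Returns how many requests reached the underlying `raw` stream.
    """
    remaining = total
    while remaining > 0:
        data = stream.read(min(request_size, remaining))
        if not data:
            break
        remaining -= len(data)
    return raw.calls


payload = bytes(8192)

# 1-byte requests straight to the raw layer: one low-level read per byte.
unbuffered = CountingRaw(payload)
unbuffered_calls = raw_calls_for(unbuffered, unbuffered, 1, len(payload))

# The same 1-byte requests through a 4096-byte buffer.
backing = CountingRaw(payload)
buffered = io.BufferedReader(backing, buffer_size=4096)
buffered_calls = raw_calls_for(buffered, backing, 1, len(payload))

print(unbuffered_calls, buffered_calls)
```

The unbuffered count equals the number of bytes read, while the
buffered count stays at a handful of calls, since each refill pulls in
a whole buffer at once.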

Presumably bittorrent has reasonable performance on Linux, or they
would have changed this odd bit of code to use larger buffer sizes.
Does anyone know whether, on Linux, a stdio stream with a very small
buffer passes a larger stdio read straight through as a read system
call for the requested amount of data?  Further, is such behavior
encouraged or prohibited by any standard?


For now, I'm inclined to patch the bittorrent pkg to enable the
"workaround" (which I would characterize as "let stdio use reasonable
buffer sizes and don't tell it to use a size of 1").
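
As a rough sketch of what "let the runtime pick the buffer size" looks
like for the hashing step, assuming modern Python 3 with hashlib
rather than the package's actual code: hash_pieces and piece_length
are hypothetical names for illustration only, and this is not the
patch itself.

```python
import hashlib


def hash_pieces(path, piece_length):
    """Return the SHA-1 digests of consecutive pieces of a file.

    Each piece is `piece_length` bytes, except possibly the last one.
    """
    digests = []
    # No buffering argument is passed, so the default buffering applies
    # instead of the unbuffered mode requested by file(name, 'rb', 0).
    with open(path, 'rb') as handle:
        while True:
            piece = handle.read(piece_length)
            if not piece:
                break
            digests.append(hashlib.sha1(piece).digest())
    return digests
```

The point of the design is only that the open call does not ask for
unbuffered I/O; the piece size used for hashing is independent of the
buffer size used for reading.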