Subject: Re: problem with tmpfs and linux emulation?
To: None <tls@rek.tjls.com>
From: Steven M. Bellovin <smb@cs.columbia.edu>
List: current-users
Date: 11/12/2005 18:06:59
In message <20051112223541.GA15543@panix.com>, Thor Lancelot Simon writes:
>On Sun, Nov 13, 2005 at 06:17:19AM +0900, YAMAMOTO Takashi wrote:
>> > > Linux libc doesn't use read(2) to read directories, it uses
>> > > getdents64().
>> > 
>> > "It depends".  There were discussions on linux kernel lists as little
>> > as a year ago about how it would be _nice_ to be able to assume that
>> > libc always used getdents64() to avoid supporting lseek() on directories.
>> > 
>> > Thor
>> 
>> are you sure?
>> iirc, linux doesn't support read(2) on directories, even for ext2.
>
>No, it looks like I was mistaken.  I seem to have misunderstood a
>discussion of glibc mixing together calls to getdents and getdents64
>which resulted in calling lseek with invalid offsets that the kernel
>tried to translate in some complicated way.
>
>I've definitely debugged a version of libc that used read on directories
>but it could have been as long as 10 years ago, now that I think of it.
>
OK.  But we're back where I started from: openoffice2 behaves 
differently with tmpfs-resident directories.  We don't know why.  We do 
know, from my test program, that if a process tries to read a directory 
via read(2), its behavior is not the same as on Linux *except* for 
tmpfs....
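
For anyone who wants to repeat the read(2) comparison, a minimal sketch of such a check follows. It is not the test program mentioned above, which is not shown in this thread; try_read_directory is a hypothetical helper named here only for illustration. It opens a directory and calls read(2) on the descriptor, reporting either success or the errno value. On native Linux the read is expected to fail with EISDIR.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the original message): open `path`
 * and attempt a plain read(2) on it.
 *
 * Returns 0 if read(2) succeeded, otherwise the errno value set by
 * open(2) or read(2).  If `nbytes` is non-NULL it receives the raw
 * return value of read(2), or -1 if the open itself failed.
 */
int try_read_directory(const char *path, long *nbytes)
{
    char buf[4096];
    int fd, err;
    ssize_t n;

    if (nbytes != NULL)
        *nbytes = -1;

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return errno;

    n = read(fd, buf, sizeof buf);
    err = (n < 0) ? errno : 0;   /* save errno before close() can change it */
    if (nbytes != NULL)
        *nbytes = (long)n;

    close(fd);
    return err;
}
```

Calling try_read_directory("/mnt", &n) once with /mnt on tmpfs and once with /mnt on mfs, both natively and from a Linux binary under emulation, would show whether the result differs by file system.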

I ran ktrace during a failure scenario, involving /mnt (*not* /tmp) 
mounted as tmpfs.  See http://www.machshav.com/~smb/soffice_tmpfs.txt
I fired up openoffice2, let it quiesce, invoked ktrace on the process 
that seemed to be the one accumulating cpu time during open requests, 
typed ^O, typed /mnt/ into the dialog box, then hit ENTER.
The same sequence, using an mfs-mounted /mnt, is in
http://www.machshav.com/~smb/soffice_mfs.txt.  A quick comparison of 
the two shows this for tmpfs:

  5172 soffice.bin CALL  open(0xbfbf9fc8,0x18800,0xbaaae010)
  5172 soffice.bin NAMI  "/emul/linux/mnt"
  5172 soffice.bin NAMI  "/mnt"
  5172 soffice.bin RET   open 24/0x18
  5172 soffice.bin CALL  fstat64(0x18,0xbfbf9f4c)
  5172 soffice.bin RET   fstat64 0
  5172 soffice.bin CALL  fcntl64(0x18,2,1)
  5172 soffice.bin RET   fcntl64 0
  5172 soffice.bin CALL  getdents64(0x18,0x8188888,0x1000)
  5172 soffice.bin RET   getdents64 76/0x4c
  5172 soffice.bin CALL  llseek(0x18,0,1,0xbfbfb120,0)
  5172 soffice.bin RET   llseek 0
  5172 soffice.bin CALL  getdents64(0x18,0x8188888,0x1000)
  5172 soffice.bin RET   getdents64 52/0x34
  5172 soffice.bin CALL  close(0x18)

but this for mfs:

  6975 soffice.bin CALL  open(0xbfbf9fc8,0x18800,0xbaaae010)
  6975 soffice.bin NAMI  "/emul/linux/mnt"
  6975 soffice.bin NAMI  "/mnt"
  6975 soffice.bin RET   open 26/0x1a
  6975 soffice.bin CALL  fstat64(0x1a,0xbfbf9f4c)
  6975 soffice.bin RET   fstat64 0
  6975 soffice.bin CALL  fcntl64(0x1a,2,1)
  6975 soffice.bin RET   fcntl64 0
  6975 soffice.bin CALL  getdents64(0x1a,0x818ebd0,0x4000)
  6975 soffice.bin RET   getdents64 76/0x4c
  6975 soffice.bin CALL  lstat64(0xb6fe5c18,0xbfbfae70)
  6975 soffice.bin NAMI  "/emul/linux/mnt/foo.txt"
  6975 soffice.bin NAMI  "/mnt/foo.txt"
  6975 soffice.bin RET   lstat64 0
  6975 soffice.bin CALL  getuid
  6975 soffice.bin RET   getuid 54047/0xd31f
  6975 soffice.bin CALL  getgid
  6975 soffice.bin RET   getgid 54047/0xd31f
  6975 soffice.bin CALL  open(0xbaaa309c,0,0xbaaae010)
  6975 soffice.bin NAMI  "/emul/linux/proc/sys/kernel/ngroups_max"
  6975 soffice.bin NAMI  "/proc/sys/kernel/ngroups_max"
  6975 soffice.bin RET   open -1 errno -2 No such file or directory
  6975 soffice.bin CALL  getgroups(0x20,0xbfbfaf6c)
  6975 soffice.bin RET   getgroups 4

In other words, both are doing getdents64 (which also resolves the
earlier issue), but the tmpfs instance is following that with an
llseek, while the mfs instance is doing lstat64.  It seems that the
caller does not like whatever it got back from getdents64 on tmpfs.  I
don't have a copy of Linux readdir(3) around to see what would induce
it to do llseeks -- assuming, of course, that those are coming from the
library and not openoffice2 -- but that's where I'd look for the problem.
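
The system call sequence in the tmpfs trace can be replayed outside openoffice2 with a small Linux program. The sketch below is an assumption-laden reproduction aid, not code from glibc or openoffice2: probe_getdents_sequence and struct dents_probe are hypothetical names, and it invokes getdents64 directly through syscall(2), so it only builds as a Linux binary (run natively, or under /emul/linux as in the trace). It issues getdents64, then lseek(fd, 0, SEEK_CUR), then getdents64 again, using the 0x1000-byte buffer seen in the tmpfs trace.

```c
#include <fcntl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical result record for one run of the sequence. */
struct dents_probe {
    long  first;    /* return value of the first getdents64 */
    off_t offset;   /* return value of lseek(fd, 0, SEEK_CUR) */
    long  second;   /* return value of the second getdents64 */
};

/*
 * Hypothetical helper: replay getdents64 / lseek / getdents64 on the
 * directory `path`.  Returns 0 if the directory could be opened and
 * fills in `out`; returns -1 if open(2) failed.
 */
int probe_getdents_sequence(const char *path, struct dents_probe *out)
{
    char buf[0x1000];   /* same buffer size as in the tmpfs trace */
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;

    out->first  = syscall(SYS_getdents64, fd, buf, sizeof buf);
    out->offset = lseek(fd, 0, SEEK_CUR);   /* query only, no repositioning */
    out->second = syscall(SYS_getdents64, fd, buf, sizeof buf);

    close(fd);
    return 0;
}
```

Printing the three fields for /mnt on tmpfs and again on mfs would show whether the byte counts and the reported offset differ between the two file systems, independent of whatever readdir(3) does with them.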

I'll append this note to my PR.

		--Steven M. Bellovin, http://www.cs.columbia.edu/~smb