NetBSD-Bugs archive


Re: kern/46990: tmpfs_gro_genealogy() panic - rename/rmdir race condition not handled



The following reply was made to PR kern/46990; it has been noted by GNATS.

From: Wolfgang Stukenbrock <wolfgang.stukenbrock%nagler-company.com@localhost>
To: gnats-bugs%NetBSD.org@localhost
Cc: kern-bug-people%NetBSD.org@localhost, gnats-admin%NetBSD.org@localhost, 
netbsd-bugs%NetBSD.org@localhost
Subject: Re: kern/46990: tmpfs_gro_genealogy() panic - rename/rmdir race 
condition not handled
Date: Fri, 21 Sep 2012 14:48:12 +0200

 Hi again,
 
 I've booted my test system again and enabled DIAGNOSTIC mode.
 
 I got the following crash:
 
 tttt# ./dirconc3 /tmp > out-a ; ./cl*
 panic: WARNING: SPL NOT LOWERED ON SYSCALL 66 -9984 EXIT 8b 6
 uvm_fault(0xffff8000885ec020, 0x0, 1) -> e
 pool_get: pnbufpl: curpage NULL, nitems 4
 pool_get: pnbufpl: curpage NULL, nitems 4
 pool_get: pnbufpl: curpage NULL, nitems 4
 pool_get: pnbufpl: curpage NULL, nitems 4
 fatal page fault in supervisor mode
 kernel diagnostic assertion "!tmpfs_rmdired_p(vp)" failed: file 
 "../../../../fs/tmpfs/tmpfs_rename.c", line 528
 trap type 6 code 0 rip ffffffff801ff524 cs 8 rflags 10246 cr2 10 cpl 0 
 rsp ffff80008852ba90
 kernel: page fault trap, code=0
 fatal breakpoint trap in supervisor mode
 Stopped in pid 19033.1 (dirconc3) at    netbsd:tmpfs_dir_detach+0xb4: 
   movq    0x10(%rdx),%r13
 db{6}> bt
 tmpfs_dir_detach() at netbsd:tmpfs_dir_detach+0xb4
 tmpfs_rmdir() at netbsd:tmpfs_rmdir+0x238
 VOP_RMDIR() at netbsd:VOP_RMDIR+0x69
 sys_rmdir() at netbsd:sys_rmdir+0xd6
 syscall() at netbsd:syscall+0xb9
 db{6}>
 
 It looks like there are at least two crashes at once. (First time this 
 has happened here.)
 
 The failed assertion is inside the loop and tells us that we would 
 fetch a NULL parent pointer as dnode on the next line ...
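 
 For reference, the loop looks roughly like this (paraphrased from 
 memory of HEAD's tmpfs_gro_genealogy(), not verbatim - note that 
 tmpfs_rmdired_p() is essentially just a check that tn_parent is NULL):
 
     /* Walk from tdvp up the tn_parent chain, looking for fdvp. */
     for (;;) {
             KASSERT(vp->v_type == VDIR);
             /* Did we reach the root without finding fdvp? */
             if (VP_TO_TMPFS_NODE(vp) == VFS_TO_TMPFS(mp)->tm_root) {
                     vput(vp);
                     *intermediate_node_ret = NULL;
                     return 0;
             }
             KASSERT(!tmpfs_rmdired_p(vp));  /* <-- this is line 528 */
             dnode = VP_TO_TMPFS_NODE(vp)->tn_spec.tn_dir.tn_parent;
             /* Is fdvp an ancestor of tdvp? */
             if (dnode == VP_TO_TMPFS_NODE(fdvp)) {
                     *intermediate_node_ret = vp;
                     VOP_UNLOCK(vp);
                     return 0;
             }
             /* Ascend: take the parent's tn_vlock, then drop vp. */
             mutex_enter(&dnode->tn_vlock);
             vput(vp);
             error = tmpfs_vnode_get(mp, dnode, &vp);
             if (error)
                     return error;
     }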
 
 I've added two printf statements in tmpfs_mem.c for the cases where a 
 dirent or a node is freed while its node/vnode pointer is not NULL. 
 Neither was printed on the console.
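 
 The instrumentation was along these lines (a sketch only - the exact 
 free routines and field names in tmpfs_mem.c are from memory and may 
 not match 5.1 exactly):
 
     /* in the dirent free path: */
     if (de->td_node != NULL && de->td_node != TMPFS_NODE_WHITEOUT)
             printf("tmpfs: freeing dirent %p, node still %p\n",
                 de, de->td_node);
 
     /* in the node free path: */
     if (node->tn_vnode != NULL)
             printf("tmpfs: freeing node %p, vnode still %p\n",
                 node, node->tn_vnode);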
 
 I've tried to get the backtrace of all processes that are marked with 
 a ">" in the ps output - I think these are the ones on CPU:
 (4A7B 4F6B 4D60 4F5C 4F55 4F51 4A1D 4A3B)
 
 
 db{6}>  bt/t 4A7B
 trace: pid 19067 lid 1 at 0xffff8000961cf870
 panic() at netbsd:panic+0xab
 pool_get() at netbsd:pool_get+0x333
 pool_cache_get_slow() at netbsd:pool_cache_get_slow+0x1cc
 pool_cache_get_paddr() at netbsd:pool_cache_get_paddr+0x174
 namei() at netbsd:namei+0xe9d
 sys_rmdir() at netbsd:sys_rmdir+0x51
 syscall() at netbsd:syscall+0xb9
 
 db{6}> bt/t 4F6B
 trace: pid 20331 lid 1 at 0xffff80009614b8b0
 ?() at 0xffff80017761d000
 
 Here the system was gone ... - reset required - sorry.
 
 best regards
 
 W. Stukenbrock
 
 Wolfgang Stukenbrock wrote:
 
 > The following reply was made to PR kern/46990; it has been noted by GNATS.
 > 
 > From: Wolfgang Stukenbrock 
 > <wolfgang.stukenbrock%nagler-company.com@localhost>
 > To: Taylor R Campbell <riastradh%NetBSD.org@localhost>
 > Cc: gnats-bugs%NetBSD.org@localhost
 > Subject: Re: kern/46990: tmpfs_gro_genealogy() panic - rename/rmdir race 
 > condition not handled
 > Date: Fri, 21 Sep 2012 13:10:26 +0200
 > 
 >  Hi, of course you can have them.
 >  
 >  I've already asked whether someone is interested in them - see PR 46955.
 >  
 >  The attached tar file contains all modified files.
 >  It is mainly the version from HEAD, with some "missing" stuff copied 
 >  from kernel_auth and genfs directly into the tmpfs sources.
 >  I don't want to upgrade all the other parts too, because then I would 
 >  have to change everything ...
 >  I don't expect to get the "best integration of the year" award for 
 >  this work. I simply need a stable version for our production systems 
 >  running 5.1.x - and currently they lock up in tmpfs from time to time.
 >  (Remark: as far as I remember, one kauth change is not marked with 
 >  "#if 0 // not in 5.1" in the lookup code - a diff against the version 
 >  from HEAD will show it.)
 >  
 >  Now to the problem itself.
 >  
 >  OK - you are right:
 >  tmpfs_gro_lock_directory() does check the parent link - I had 
 >  overlooked that, sorry.
 >  
 >  But I added a printf() statement in my new error-handling path for 
 >  the case where dnode is NULL, and it fired!
 >  The earlier analysis with DDB also showed that dnode is NULL. The 
 >  (manually) reconstructed call path says that VOP_RENAME calls 
 >  genfs_insane_rename, which calls tmpfs_sane_rename, which calls 
 >  genfs_sane_rename, which calls the function from the ops table at 
 >  offset 0x40 - this is tmpfs_gro_genealogy, not shown in the DDB 
 >  backtrace because mutex_enter is assembler code that does not set up 
 >  a frame pointer - and that crashes inside the call 
 >  mutex_enter(&dnode->tn_vlock).
 >  I cannot say whether it is the first loop pass or the second one - I 
 >  assumed the first one in my report ... My fault.
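 >  
 >  The error path I added looks roughly like this (a sketch, not the 
 >  exact patch - the surrounding lines are the parent fetch in 
 >  tmpfs_gro_genealogy):
 >  
 >      dnode = VP_TO_TMPFS_NODE(vp)->tn_spec.tn_dir.tn_parent;
 >      if (dnode == NULL) {            /* a concurrent rmdir got here */
 >              printf("tmpfs_gro_genealogy: tn_parent is NULL\n");
 >              vput(vp);
 >              return ENOENT;
 >      }
 >      mutex_enter(&dnode->tn_vlock);  /* previously crashed here */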
 >  
 >  If we can exclude the first loop pass, then the "problem" must happen 
 >  while walking up the parent chain.
 >  We vput(vp) after taking tn_vlock on the parent - so from then on we 
 >  hold no lock on any vnode anymore.
 >  The normal way is to take tn_vlock and then search for the vnode by 
 >  calling tmpfs_vnode_get(). There we have two different ways to handle 
 >  the locks (see the sketch below):
 >  - If the vnode is present, we take the v_interlock on it, release 
 >  tn_vlock and call vget().
 >  - If the vnode is missing, we allocate one, call vn_lock on it and 
 >  then release the tn_vlock.
 >  The second path is OK in any case, but I think it is not the one 
 >  taken in our case here.
 >  If the vnode is "present", then another process may have locked that 
 >  vnode and may e.g. be trying to get to "our" vnode. When we release 
 >  it, that process will get it. At this point we "only" hold the 
 >  tn_vlock on the parent.
 >  I'm not sure whether it is possible for other processes to kill the 
 >  old directory vnode and then also kill the one whose tn_vlock we hold.
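 >  
 >  The two paths, roughly (paraphrased from memory of tmpfs_vnode_get(); 
 >  the locking details may differ between 5.1 and HEAD):
 >  
 >      mutex_enter(&node->tn_vlock);
 >      if (node->tn_vnode != NULL) {
 >              /* vnode present: interlock it, drop tn_vlock, vget. */
 >              vp = node->tn_vnode;
 >              mutex_enter(&vp->v_interlock);
 >              mutex_exit(&node->tn_vlock);
 >              error = vget(vp, LK_EXCLUSIVE | LK_INTERLOCK);
 >      } else {
 >              /* no vnode yet: allocate, lock it, then drop tn_vlock. */
 >              error = getnewvnode(VT_TMPFS, mp, tmpfs_vnodeop_p, &vp);
 >              /* ... initialize vp and link it to node ... */
 >              vn_lock(vp, LK_EXCLUSIVE | LK_RETRY);
 >              mutex_exit(&node->tn_vlock);
 >      }
 >  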
 >  The only location where tn_parent gets cleared is tmpfs_dir_detach(), 
 >  and that one is called from tmpfs_gro_rename (nobody else can get 
 >  there - the rename lock in the VFS), tmpfs_gro_remove (not used for 
 >  directories) and tmpfs_rmdir.
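 >  
 >  For a directory, the detach boils down to something like this (again 
 >  a sketch from memory, not the literal code):
 >  
 >      /* in tmpfs_dir_detach(), when the detached entry is a directory: */
 >      if (node->tn_type == VDIR)
 >              node->tn_spec.tn_dir.tn_parent = NULL;  /* orphan it */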
 >  
 >  OK - it is always possible that somewhere else someone is throwing 
 >  bombs. I've "only" tested tmpfs with dirconc, so if there are bombs 
 >  they may simply not have exploded yet. But I don't think so.
 >  
 >  By the way:
 >  What will happen if I have a reference to a directory (e.g. as the 
 >  current directory) and someone deletes it and its parent too?
 >  When a directory is deleted, it is either really deleted 
 >  (tmpfs_free_dirent) or put into WHITEOUT state (tmpfs_dir_attach 
 >  again).
 >  So far I've seen no code that destroys the link between the vnode and 
 >  the node structure, but I assume it is present somewhere.
 >  As a second action, all WHITEOUT children of the just-removed 
 >  directory are simply deleted during the directory delete operation 
 >  (calls to tmpfs_dir_detach and tmpfs_free_dirent without any checks 
 >  for vnode references).
 >  If there are still vnodes associated with the nodes in those dirents, 
 >  is this code safe?
 >  I'm not sure, but I have no time to dig deeper there.
 >  Perhaps a process is suspended for so long that the dirent structure 
 >  gets reused and removed (as WHITEOUT) before it continues ...
 >  (I'm running dirconc ....)
 >  
 >  The system where I tested my patch is spare at the moment, so I can 
 >  do other tests if desired.
 >  I will enable KASSERT() and try to get more information out.
 >  Unfortunately, enabling all KASSERTs in the kernel changes the timing 
 >  behaviour in a way that may keep the problem from showing up again.
 >  
 >  Just as information:
 >  The last problem I fixed before this one was the sync with the name 
 >  cache after a rename occurred.
 >  In HEAD the call to genfs_rename_cache_purge() is commented out in 
 >  tmpfs_gro_rename.
 >  5.1 seems to need it: I ended up with the object in the correct tmpfs 
 >  structure, but a lookup through the name cache did not find it.
 >  Symptom: "ls <dir>" shows the name, but "ls -l <dir>" says "file not 
 >  found".
 >  Either this is wrong in HEAD too, or there are changes in the name 
 >  cache that I've overlooked. Nevertheless, it should always be legal 
 >  to remove some "random" entries from the cache, so adding the purge 
 >  shouldn't do anything wrong - perhaps I lose some performance, but I 
 >  can live with that.
 >  But I don't think this is relevant for this problem.
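 >  
 >  Concretely, the backport just re-enables the purge in 
 >  tmpfs_gro_rename (assuming the call has the same signature as in 
 >  HEAD):
 >  
 >      /* after the rename has been committed: */
 >      genfs_rename_cache_purge(fdvp, fvp, tdvp, tvp);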
 >  
 >  If you prefer that I try it on HEAD instead, let me know the location 
 >  of the source tar files on an FTP server that I should use for 
 >  installation. I can (cross-)compile them on another system and 
 >  install the resulting binaries over the network.
 >  
 >  Best regards
 >  
 >  W. Stukenbrock
 >  
 >  Taylor R Campbell wrote:
 >  
 >  >    Date: Thu, 20 Sep 2012 14:50:01 +0000
 >  >    From: Wolfgang.Stukenbrock%nagler-company.com@localhost
 >  > 
 >  >    The comment in tmpfs_rename.c tmpfs_gro_genealogy() that dnode may
 >  >    not be NULL is simply not true. Need to check if dnode is NULL and
 >  >    return an error (e.g. ENOENT) instead.
 >  > 
 >  > I'm not sure this analysis is correct.  If tmpfs_rmdired_p(vp) is
 >  > false, then VP_TO_TMPFS_NODE(vp)->tn_spec.tn_dir.tn_parent is
 >  > guaranteed to be nonnull, provided that vp is locked.  Can you enable
 >  > DIAGNOSTIC and see what earlier kassert fires?  Also, have you
 >  > reproduced the problem in HEAD too?
 >  > 
 >  > I suspect that this is not a rename locking issue in particular so
 >  > much as a tmpfs locking issue in general -- if I recall correctly, after
 >  > netbsd-5, rmind rototilled the tmpfs locking some time before I
 >  > committed changes to fix tmpfs_rename last year.
 >  > 
 >  > Can we see your patches to back-port tmpfs_rename?
 >  > 
 
 

