Subject: Re: CRITICAL ** Holes in default cron jobs ** CRITICAL
To: Warner Losh , Matt Thomas <>
From: Stefan Grefen <>
List: tech-kern
Date: 01/02/1997 10:55:09
In message <>  Warner Losh wrote:
> In message <> Matt Thomas writes:
> : Actually, a 
> : 
> : int unlink2(const char *name, const struct stat *statbuf);
> : 
> : would solve the problem.  In essence, you stat/fstat the file first (which
> : you are going to do anyway (to make sure it's on the device, old enough,
> : etc.)) and then you pass that stat buf to unlink2.  The kernel can then
> : verify that <name> is the same object as represented by the information
> : in *<statbuf> and then proceed with the deletion.  If the information
> : (dev,inode,generation) doesn't match, unlink2 fails.  The kernel can easily
> : make this an atomic operation.
> You still have the race here.  Between the readdir() and the stat(),
> the file can change out from under you, and then you go ahead and
> delete the wrong thing because the stat info matches :-(.

This race can be detected in user mode by checking the directory's st_mtimespec
and st_ctimespec, and by checking the file's inode number (from readdir())
against the one in the stat buffer. If the inode number matches and the
directory's timespecs didn't change, it's still the same file. If not, rescan
the directory.
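
A minimal sketch of that check in C, for illustration only. The helper names
entry_still_valid() and scan_once() are made up here, and the code uses the
POSIX.1-2008 field names st_mtim/st_ctim rather than the BSD st_mtimespec/
st_ctimespec spelling, so it assumes a system that provides them. It also
assumes that d_ino and st_ino agree for ordinary entries, which may not hold
at mount points, and a directory change within one timestamp tick could go
unnoticed.

```c
#define _POSIX_C_SOURCE 200809L  /* for lstat() and st_mtim/st_ctim */
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Compare two timestamps down to the nanosecond field. */
static int same_ts(const struct timespec *a, const struct timespec *b)
{
    return a->tv_sec == b->tv_sec && a->tv_nsec == b->tv_nsec;
}

/*
 * Hypothetical helper, not an existing API.
 *   dir_before: stat of the directory taken before the scan started
 *   entry_ino:  inode number reported by readdir() for this entry
 *   file_sb:    filled with the lstat() result for path
 * Returns 1 if the entry is still the same file, 0 if the caller
 * should rescan the directory, -1 on error.
 */
int entry_still_valid(const char *dirpath, const struct stat *dir_before,
                      ino_t entry_ino, const char *path, struct stat *file_sb)
{
    struct stat dir_after;

    if (lstat(path, file_sb) == -1)
        return -1;
    if (stat(dirpath, &dir_after) == -1)
        return -1;
    if (file_sb->st_ino != entry_ino)
        return 0;                       /* name now refers to another inode */
    if (!same_ts(&dir_before->st_mtim, &dir_after.st_mtim) ||
        !same_ts(&dir_before->st_ctim, &dir_after.st_ctim))
        return 0;                       /* directory changed during the scan */
    return 1;
}

/*
 * One pass over a directory. Returns the number of validated entries,
 * -1 on error, or -2 if the directory changed and should be rescanned.
 */
int scan_once(const char *dirpath)
{
    struct stat dir_before, sb;
    struct dirent *de;
    char path[4096];
    int n = 0;
    DIR *d;

    if (stat(dirpath, &dir_before) == -1)
        return -1;
    if ((d = opendir(dirpath)) == NULL)
        return -1;
    while ((de = readdir(d)) != NULL) {
        int len, r;

        if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
            continue;
        len = snprintf(path, sizeof(path), "%s/%s", dirpath, de->d_name);
        if (len < 0 || (size_t)len >= sizeof(path)) {
            closedir(d);
            return -1;                  /* path too long for the buffer */
        }
        r = entry_still_valid(dirpath, &dir_before, de->d_ino, path, &sb);
        if (r != 1) {
            closedir(d);
            return r == 0 ? -2 : -1;
        }
        n++;                            /* sb is safe to act on here */
    }
    closedir(d);
    return n;
}
```

A caller would loop on scan_once() while it returns -2.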

A more convenient way to do that would be to return the inode generation
number in the result of readdir().

This kind of race is only dangerous for operations that modify things;
read-only commands like stat can be retried.

I think the unlink2 call solves the problem for the case at hand.
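
To make the intended semantics concrete, here is a user-space stand-in for
the proposed call. It is only an illustration: unlink2_sketch() is a made-up
name, the ESTALE error code is my assumption, and the generation number is
left out because struct stat does not expose it portably. Most importantly,
user space cannot close the window between the check and the unlink(); only
a kernel implementation could make the two steps atomic.

```c
#define _POSIX_C_SOURCE 200809L  /* for lstat() */
#include <errno.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical user-space approximation of unlink2(name, statbuf).
 * Removes name only if it still refers to the object described by
 * statbuf (same device and inode). Returns 0 on success, -1 on error.
 */
int unlink2_sketch(const char *name, const struct stat *statbuf)
{
    struct stat now;

    if (lstat(name, &now) == -1)
        return -1;                      /* errno set by lstat() */
    if (now.st_dev != statbuf->st_dev || now.st_ino != statbuf->st_ino) {
        errno = ESTALE;                 /* assumed error code for a mismatch */
        return -1;
    }
    /* A race window remains here; the kernel version would not have it. */
    return unlink(name);
}
```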

For more general solutions, a good starting point would be to look at the
DMAPI standard proposal (Data Management API, formerly DMIG), because they have
to solve basically the same problems in a nastier environment. 

The goal should be to enhance the definition of an inode (which is included
in the readdir() result) to uniquely define a file (on the disk AND in the
This can easily be done by adding the msec time at the time of creation to the
inode (plus workarounds for reboots, broken clocks etc.) and a machine or
disk ID to avoid problems when moving disks between machines.
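
As a rough picture of such an identifier, assuming a layout of my own choosing
(the struct, its field names and the field widths below are hypothetical, not
taken from any existing filesystem):

```c
#include <stdint.h>

/* Hypothetical unique file identifier: where, which inode, and when created. */
struct unique_ino {
    uint64_t disk_id;       /* machine or disk ID */
    uint64_t ino;           /* ordinary inode number */
    uint64_t created_msec;  /* creation time in milliseconds */
};

/* Two identifiers name the same file only if all three parts match. */
int unique_ino_equal(const struct unique_ino *a, const struct unique_ino *b)
{
    return a->disk_id == b->disk_id &&
           a->ino == b->ino &&
           a->created_msec == b->created_msec;
}
```

A reused inode number then no longer compares equal, because the new file
carries a different creation time.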

Then add operations that can work on these handles (either
by using a NOIO open or by different/additional arguments to existing
A very elegant solution would be a unique-inode-filesystem. 
You could then even use sh, ls and rm to securely remove files
(let's assume -I means display unique-inode):

# ls -lI  /tmp/xx
BD452981764399A645 -rw-r--r--   1 grefen  wheel     1040 Dec 25 23:11 xx
# rm /uinofs/BD452981764399A645

This would remove only the file /tmp/xx that existed at the time of the ls,
regardless of whether the rm is issued immediately or a year after the ls.


> Warner

Stefan Grefen                                Tandem Computers Europe Inc.                       High Performance Research Center
You should never bet against anything in science at odds of more than
about 10^12 to 1.
                -- Ernest Rutherford