Subject: Re: core file name format: diffs
To: None <tech-kern@netbsd.org>
From: Manuel Bouyer <bouyer@antioche.lip6.fr>
List: tech-kern
Date: 09/22/1999 16:11:53
On Tue, Sep 21, 1999 at 12:59:04PM -0400, der Mouse wrote:
> >>> The basename of the per-process corename format have to be either
> >>> 'core' or end with '.core' [...]
> >> I'm not at all sure I like this; I think [sysctl] calls made by root
> >> should be exempt from this,
> > Easy to change
> 
> True enough.  (It also fits in with the "don't protect root from
> itself" philosophy that applies to things like unlink(2) on directories
> and "dd if=/dev/zero of=/dev/kmem".)

securelevel prevent you from this mistacke :)

> 
> >> and personally would argue an option should exist to disable the
> >> check entirely.
> > I'm not sure adding yet another kernel option or sysctl variable for
> > this is the rigth thing to do.
> 
> I'm not sure either.  When I wrote "option", I had no particular
> meaning in mind, merely that some way should be there (some relatively
> convenient way; one can always argue that patching the source code is
> an available way to get this behavior :-).
> 
> > We should really change the semantic of securelevel to a mask-based
> > one.
> 
> Why?  What does a mask-style securelevel buy you over a bunch of
> boolean kern.<foo> sysctls?  Nothing I can see, except that they can be
> stored more compactly in the kernel - and that can be addressed with a
> new sysctl type "boolean", with a sysctl_boolean() implementation that
> bashes on individual bits in a mask.

The display is more compact too.
I've been several times in situation where I'd like to have some of the
protections securelevel=2 provides, but not all.
As we add more and more stuffs conditionned by securelevel I guess this will
becore more and more an issue.

> 
> In fact, you could provide both, kern.security.all for the mask so you
> can load all the bits at once and kern.security.lkm for LKMs,
> kern.security.mount for mounting disks, etc, for twiddling them
> individually.

Good idea !

> 
> I'm not quite sure what this would do to the "raise only" semantics of
> the current securelevel.  Perhaps there'd be a kern.security.insecure
> bit that, if clear, means you can set but not clear the others - and
> cannot itself be set except by init.

Actually I'd call it kern.security.secure and have it semantic reversed
(you have to set all others to get more security, but to clear this one)
but this is what I had in mind.

> 
> >>> +.It Li p	The PID of the process
> >>> +.It Li t	The process's creation date
> 
> >> It would be good to state how %p and %t are rendered into text
> >> (primarily for completeness).
> 
> > What about this ? :
> > .It Li p        The PID of the process (integer)
> > .It Li t        The process's creation date (in seconds)
> 
> I'd make it more like
> 
> .It Li p        The PID of the process (in decimal)
> .It Li t        The process's creation date (a la
> .Xr time 3 ,
> in decimal)

Ok, changed. Thanks.

> 
> > The fifth level name is one of PROC_PID_LIMIT_TYPE_SOFT or
> > PROC_PID_LIMIT_TYPE_HARD, to select respectively the soft or hard limit.
> > Both are of type integer.                           
> 
> Yeah, that fixes that entirely.
> 
> >>> +	if ((indx = findname(string, "5th", bufpp, lp)) == -1)
> >> Shouldn't that be "fifth", for consistency?
> > When I wrote this I didn't know how to spell 'fifth' :)
> 
> Oh. :-)
> 
> > +void
> > +p_sugid(p)
> > +	struct proc *p;
> > +{
> > +	extern char defcorename[];
> > +
> > +	p->p_flag |= P_SUGID;
> > +	/* reset what needs to be reset in plimit */
> > +	if (p->p_limit->pl_corename != defcorename) {
> > +		if (p->p_limit->p_refcnt > 1) {
> > +			p->p_limit->p_refcnt--;
> > +			p->p_limit = limcopy(p->p_limit);
> > +		}
> > +		p->p_limit->pl_corename = defcorename;
> > +	}
> > +}
> 
> I believe this will leak memory under some circumstances.
> 
> (1) Process sets its per-process corename.  Memory is allocated.
> (2) Process execs a set-id executable.  p_sugid() is called.
> (3) p_refcnt is 1 because nobody else shares this limit.
> (4) pl_corename is summarily bashed, dropping the only pointer to the
>      mallocked string.
> (5) On exit, pl_corename equals defcorename, so it's not freed.

Ops, yes, there's a missing else free(p->p_limit->pl_corename) ! Thanks !

> 
> Given how limcopy() works, I think there is also a use-after-free bug:
> 
> (1) Process sets its per-process corename.  Memory is allocated.
> (2) Process forks.  p_refcnt goes to 2.
> (3) One of the two processes sets some other limit.  The limit
>      structure splits into two, each with p_refcnt 1.  Both have
>      identical pl_corename pointers (see limcopy()).
> (4) One process exits.  Its p_refcnt is 1 and its pl_corename is not
>      equal to defcorename, so pl_corename is freed.
> (5a) The other process dumps core.  It follows its pl_corename pointer
>      to find...whatever currently happens to be lying around there.
> (5b) The other process exits.  Its p_refcnt is 1 and its pl_corename is
>      not equal to defcorename, so pl_corename is freed...again.

Arg, you're rigth. I don't know what's the best here:
a) add a refcount for pl_corename
b) always do a malloc() in limcopy() if pl_corename != defcorename

a) will consume 4 extra bytes for each instance of limcopy(), and need an
   extra increment of each limcopy() call
b) may consume more memory and time, but only if the user changed pl_corename

I think, on average, b) will be better.

For now, limcopy() is called each time a user does a setrlimit or a
sysctl -w, even if the data are the same. There's room for improvement
here, which will be efficient in some common case (setting limits in
a .cshrc/.profile).

Thanks for the feedback !

--
Manuel Bouyer, LIP6, Universite Paris VI.           Manuel.Bouyer@lip6.fr
--