Subject: Re: ptrace() vs. SIGKILL?
To: None <tech-security@netbsd.org, tech-kern@netbsd.org>
From: Greg A. Woods <woods@weird.com>
List: tech-kern
Date: 12/08/2002 18:27:15
[ On Sunday, December 8, 2002 at 08:59:34 (+0100), der Mouse wrote: ]
> Subject: Re: ptrace() vs. SIGKILL?
>
> > SunOS 5.6 sun4m
> 
> > p2 attaching to 20058
> > p2 PTRACE_ATTACH failed: No such process
> 
> How odd.  Looks as though their PT_ATTACH (the message needs fixing; I
> guess I fixed the code and forgot to fix the printf) doesn't work, or
> at least doesn't match the NetBSD documented behaviour.

Ah, NO!  :-)

The point is that no other AT&T-UNIX derived variant has PT_ATTACH,
except SunOS-4 (if you consider it such a variant since it is in some
ways) which is where PT_ATTACH originated (from SunOS-4 ptrace(2) BUGS
section):

      The requests PTRACE_TRACEME through PTRACE_SINGLESTEP are
      standard UNIX system ptrace() requests.  The requests
      PTRACE_ATTACH through PTRACE_DUMPCORE and the fifth argu-
      ment, addr2, are unique to SunOS.

I don't know why NetBSD deviated from SunOS and called the flags PT_*,
nor why our documentation is missing a HISTORY section and some mention
of the other items from the SunOS-4 BUGS section.....

SunOS-5 has PT_ATTACH defined in the <sys/ptrace.h> header file (as
PTRACE_ATTACH, of course), but it's not implemented -- it's just there
because they thought they'd define all the old SunOS-4 values for
posterity's sake.

	http://docs.sun.com/db/doc/805-3864/6j3lvpajn?a=view
	http://docs.sun.com/db/doc/805-6331/6j5vgg69p?a=view

As far as I know and can tell, all native SunOS-5 debuggers have always
directly used /proc to attach to and control other processes.

> > Traditionally ptrace() didn't work for non-parent processes, and for
> > processes which were not started with the intention of being
> > processed, which is why there is a PT_TRACE_ME request in the first
> > place.  Now we have the ability to ptrace() unrelated processes using
> > PT_ATTACH, and that seems to be where this bug must have crept in.
> 
> I see no reason to think this has anything to do with PT_ATTACH, either
> in the current implementation or historically.

Well, except for fork-bombs it has everything to do with PT_ATTACH, and
PT_TRACE_ME :-)

That's because without PT*_ATTACH there's no way for an unrelated
process to arbitrarily take control over another process in such a way
that it'll be stopped on SIGKILL so it can be PT*_CONTINUE'd instead of
just exiting immediately like it should.

PT*_ATTACH is a hack which was added to try to alleviate one of the
inherent limitations in the original ptrace() mechanism.  The proof that
it's a hack, BTW, and a rather poor one at that, includes the fact it
works by actually re-parenting the target process (and this re-parenting
is then undone by the PT_DETACH that's hopefully used if you're really
debugging some arbitrary process!).

I don't know if PTRACE_ATTACH was implemented correctly w.r.t. SIGKILL
in SunOS-4 though I doubt it (it may have been fixed in 4.1.1 or later
though -- I suppose I could try to find a system to test).

Without PT_ATTACH, and without being the parent of the process in
question (and without the ability to replace the program or some library
it loads), there's no way to prevent SIGKILL from simply killing an
arbitrary target process.  This limits the applicability of this bug to
strictly co-operating processes with strict parent/child relationships
and even for a fork-bomb this might reduce its utility quite a bit.

That's not to say that a similar SIGKILL bug doesn't lurk in SunOS-5's
/proc -- it may well do.  There were certainly lots of other security
bugs, especially related to debugger support, in the original AT&T /proc
as well as in most re-implementations, such as Linux and IIRC even *BSD.

> I've built a variant that doesn't depend on PT_ATTACH, for you to try
> on a pre-PT_ATTACH system if/when you get a chance to.  Since I see no
> particular value in the version with PT_ATTACH vs this one, I've just
> replaced ftp.netbsd.org:/pub/NetBSD/misc/mouse/sigkill.c with the new
> version.  (I still see the traced-continue-after-kill behaviour.)

If you call ptrace(0) in the child processes then your program will
indeed demonstrate whether an implementation allows a SIGKILL'ed process to be
restarted.
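
For anyone who wants to try this without PT_ATTACH, here is a minimal
sketch of such a test. It is not the sigkill.c mentioned above; the
function name sigkill_stops_tracee() and the TRACE_ME()/TRACE_KILL()
macros are made up for illustration, and the #ifdef only papers over
the PT_* vs. PTRACE_* naming and the differing type of the last
argument. The child calls ptrace(0) (i.e. PT_TRACE_ME), tells the
parent over a pipe that it has done so, and then sleeps; the parent
sends SIGKILL and looks at what wait reports.

```c
#include <sys/types.h>
#include <sys/ptrace.h>
#include <sys/wait.h>
#include <signal.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper macros: request names and the type of the
 * last argument differ between Linux and the BSDs. */
#ifdef __linux__
#define TRACE_ME()	ptrace(PTRACE_TRACEME, 0, NULL, NULL)
#define TRACE_KILL(p)	ptrace(PTRACE_KILL, (p), NULL, NULL)
#else
#define TRACE_ME()	ptrace(PT_TRACE_ME, 0, NULL, 0)
#define TRACE_KILL(p)	ptrace(PT_KILL, (p), NULL, 0)
#endif

/*
 * Returns 1 if a traced child is reported as stopped after SIGKILL
 * (so a tracer could continue it), 0 if SIGKILL terminates it as it
 * should, and -1 if the test itself could not be run.
 */
int
sigkill_stops_tracee(void)
{
	int fds[2], status;
	char c;
	ssize_t n;
	pid_t kid;

	if (pipe(fds) < 0)
		return -1;
	kid = fork();
	if (kid < 0) {
		close(fds[0]);
		close(fds[1]);
		return -1;
	}
	if (kid == 0) {
		/* child: ask to be traced by the parent, then say so */
		close(fds[0]);
		if (TRACE_ME() < 0)
			_exit(1);
		if (write(fds[1], "x", 1) != 1)
			_exit(1);
		for (;;)
			pause();
	}

	/* parent: wait until the child is definitely being traced */
	close(fds[1]);
	n = read(fds[0], &c, 1);
	close(fds[0]);
	if (n != 1) {
		/* child exited before reporting; reap it and give up */
		waitpid(kid, &status, 0);
		return -1;
	}

	if (kill(kid, SIGKILL) < 0 || waitpid(kid, &status, 0) != kid)
		return -1;

	if (WIFSTOPPED(status)) {
		/* stopped instead of killed: clean up via the tracer */
		TRACE_KILL(kid);
		waitpid(kid, &status, 0);
		return 1;
	}
	return (WIFSIGNALED(status) && WTERMSIG(status) == SIGKILL) ? 0 : -1;
}
```

Calling this from a trivial main() and printing the result should be
enough to compare systems; a return of 1 would indicate the
traced-continue-after-kill behaviour is possible, though this sketch
does not go on to actually continue the child.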

> You're missing something: a process can be traced by at most one other
> process.  This prevents the formation of the sort of heavily redundant
> mesh you seem to be thinking of.

Well, I'm not really missing that -- but I did get side-tracked a bit.
What I was really thinking of was not a full mesh per se, but rather
just a many-spined star formation with multiple children linked out on
each spine (one process can watch many children) and if in your dirty
business you can afford to lose the odd child then each of the end
children can have a thread doing a waitpid() on the core process and one
of them can replace it as necessary.

I suppose on an older system one could simply form many stars and
whenever any parent is successfully killed just start a new star from
that child again....  Even just many short strings of processes could
probably perpetuate much longer if they can restart their SIGKILL'ed
children....

>  I see another possibility here which
> I think is serious enough that I won't mention it on a public mailing
> list; I'm going to send it to security-officer instead, because I see
> fairly nasty potential for abuse.

I'd appreciate a CC to <woods@planix.com> if you don't mind....

-- 
								Greg A. Woods

+1 416 218-0098;            <g.a.woods@ieee.org>;           <woods@robohack.ca>
Planix, Inc. <woods@planix.com>; VE3TCP; Secrets of the Weird <woods@weird.com>