NetBSD-Bugs archive

kern/50430: syscall_disestablish() can remove active syscalls

>Number:         50430
>Category:       kern
>Synopsis:       syscall_disestablish() can remove active syscalls
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sun Nov 15 10:30:00 +0000 2015
>Release:        NetBSD 7.99.21
>Organization:
| Paul Goyette     | PGP Key fingerprint:     | E-mail addresses:       |
| (Retired)        | FA29 0E3B 35AF E8AE 6651 | paul at    |
| Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at  |
>Environment:
System: NetBSD 7.99.21 NetBSD 7.99.21 (POKEY 2015-11-14 05:10:46) #0: Sat Nov 14 17:06:13 PHT 2015 amd64
Architecture: x86_64
Machine: amd64
>Description:
syscall_disestablish() checks only the last syscall (stored in struct lwp
member l_sysent) when checking to see if a syscall is still active.  In
most cases this works fine.

But if the syscall is stalled or blocked for some reason (waiting for an
I/O device, for example), and a signal is subsequently delivered to the
lwp, the lwp's signal handler can call another syscall before the first
one returns.  In that case, the new syscall overwrites the l_sysent
member and there is no record of the original syscall, other than on the
process's stack.

If the original syscall is provided by a dynamically-loaded module, and
that module is now requested to be unloaded, syscall_disestablish() will
not be able to return EBUSY, so the module will be unloaded.  When the
original syscall is resumed, it will execute at an address which will no
longer contain the syscall code, likely resulting in some sort of trap.
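The missed-EBUSY sequence can be replayed in a small user-space model.
This is only a sketch of the bookkeeping described above, not the kernel
code: struct model_sysent, struct model_lwp and the model_*() functions
are hypothetical stand-ins, and only the member name l_sysent is taken
from the report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's syscall entry and lwp. */
struct model_sysent {
	const char *name;
};

struct model_lwp {
	const struct model_sysent *l_sysent;	/* most recent syscall only */
};

/* Model of syscall entry: the single slot is simply overwritten. */
static void
model_syscall_enter(struct model_lwp *l, const struct model_sysent *se)
{
	assert(l != NULL && se != NULL);
	l->l_sysent = se;
}

/* Model of the check: compare against the single slot and nothing else. */
static bool
model_syscall_busy(const struct model_lwp *l, const struct model_sysent *se)
{
	return l->l_sysent == se;
}

/*
 * Replay the sequence: a module syscall blocks, a signal handler issues
 * a second syscall, then the check is made on behalf of a module unload.
 * Returns what the check reports for the module syscall.
 */
bool
model_nested_case(void)
{
	static const struct model_sysent mod_call = { "module_syscall" };
	static const struct model_sysent other_call = { "handler_syscall" };
	struct model_lwp l = { NULL };

	model_syscall_enter(&l, &mod_call);	/* blocks, e.g. on I/O */
	model_syscall_enter(&l, &other_call);	/* from the signal handler */

	/* mod_call has not returned, yet the slot no longer names it. */
	return model_syscall_busy(&l, &mod_call);
}
```

In this model model_nested_case() reports "not busy" for the module
syscall even though it has not returned, which is the condition under
which the unload would be allowed to proceed.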

On the other hand, it is also possible for syscall_disestablish() to
return a "false positive" result, indicating that a syscall is still
active when in fact it has been deactivated (consider an application
that calls longjmp(3) to return to a procedure above the one that
called the syscall).  In such a case, the l_sysent member will still
contain an entry for a syscall that will never return.
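The stale-entry case can be sketched the same way.  The following is a
user-space analogy only, assuming a single recorded slot that nothing
clears when a call is abandoned; model_l_sysent, model_handler() and the
other model_* names are hypothetical, and the "signal" is an ordinary
function call rather than a real signal delivery.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>
#include <stddef.h>

struct model_sysent {
	const char *name;
};

/* Hypothetical single-slot record, standing in for l_sysent. */
static const struct model_sysent *model_l_sysent;
static jmp_buf model_escape;

static const struct model_sysent model_mod_call = { "module_syscall" };

/* Stand-in for a signal handler that abandons the interrupted call. */
static void
model_handler(void)
{
	longjmp(model_escape, 1);
}

/* Stand-in for the procedure that issues the syscall. */
static void
model_call_module_syscall(void)
{
	model_l_sysent = &model_mod_call;	/* recorded on entry */
	model_handler();			/* "signal" while blocked */
	/* Not reached: nothing records that the call was abandoned. */
}

/* Returns true if the slot still names the abandoned call. */
bool
model_false_positive(void)
{
	model_l_sysent = NULL;
	if (setjmp(model_escape) == 0)
		model_call_module_syscall();

	/* Control is back above the caller of the syscall. */
	return model_l_sysent == &model_mod_call;
}
```

After the longjmp(3) no call is in progress, but the slot still points
at the module's entry, so a check against it would keep answering
"busy" until some other syscall overwrites the slot.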

>How-To-Repeat:
See above.
>Fix:
Unknown.  Perhaps the l_sysent member should be replaced with a "stack"
of entries and an associated stack-pointer, so that all nested syscalls
can be recorded.  But what would be the "right" size for the stack?  And
would we really want to provide for re-sizing it if needed?

If the stack approach were taken, and if the stack were re-sizable, that
could provide an opportunity for a malicious program to consume large
amounts of kernel memory, possibly resulting in a denial-of-service.  :(
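For discussion, here is a minimal sketch of the fixed-size variant of
that idea, which sidesteps the re-sizing concern by refusing to nest
deeper than a compile-time limit.  Everything in it is an assumption
made for illustration: the depth of 4, the EAGAIN return value, and the
model_* names are not taken from any existing kernel interface.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_SYSENT_DEPTH 4	/* arbitrary; the "right" size is open */

struct model_sysent {
	const char *name;
};

struct model_lwp {
	const struct model_sysent *l_sysent_stack[MODEL_SYSENT_DEPTH];
	size_t l_sysent_sp;	/* number of entries in use */
};

/* Record a syscall on entry; refuse rather than grow when full. */
int
model_sysent_push(struct model_lwp *l, const struct model_sysent *se)
{
	if (l->l_sysent_sp == MODEL_SYSENT_DEPTH)
		return EAGAIN;
	l->l_sysent_stack[l->l_sysent_sp++] = se;
	return 0;
}

/* Drop the innermost record when that syscall returns. */
void
model_sysent_pop(struct model_lwp *l)
{
	assert(l->l_sysent_sp > 0);
	l->l_sysent_sp--;
}

/* Busy if the syscall appears anywhere in the recorded nesting. */
bool
model_sysent_busy(const struct model_lwp *l, const struct model_sysent *se)
{
	for (size_t i = 0; i < l->l_sysent_sp; i++) {
		if (l->l_sysent_stack[i] == se)
			return true;
	}
	return false;
}
```

A fixed depth bounds the per-lwp memory cost, at the price of failing
(or otherwise handling) a syscall issued at the nesting limit.  It does
not by itself address the longjmp(3) case, since an abandoned call
never reaches model_sysent_pop().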