NetBSD-Bugs archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

kern/45736: setting kern.maxvnodes may lock up



>Number:         45736
>Category:       kern
>Synopsis:       setting kern.maxvnodes may lock up
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Fri Dec 23 19:45:00 +0000 2011
>Originator:     Michael van Elst
>Release:        NetBSD 5.99.58
>Organization:
-- 
                                Michael van Elst
Internet: mlelstv%serpens.de@localhost
                                "A potential Snark may lurk in every tree."
>Environment:
        
        
System: NetBSD dummy 5.99.58 NetBSD 5.99.58 (GENERIC) #3: Thu Dec 15 15:23:35 
CET 2011 
mlelstv@henery:/home/netbsd-current/obj.amiga/home/netbsd-current/src/sys/arch/amiga/compile/GENERIC
 amiga
Architecture: m68k
Machine: amiga
>Description:
Running the program /usr/tests/sbin/syctl/t_perm as root locks up the
system. The reason is that the test program tries to set many sysctl
MIB entries to the value 3735928559 (0xDEADBEEF). During a regular test
run, this is executed with an unprivileged account, but when you start
the single test program as root, the sysctl commands are permitted.

The system freezes when executing

sysctl -w kern.maxvnodes=3735928559

A backtrace from DDB shows:

| db> bt
| cpu_Debugger(6000,1020,4,c72e800,188362) + 6
| serintr(1020,4,40af68,ea53d34,416a2) + 4a
| sermint(0) + 12
| vbl_handler(0,0,4,ea53f00,4) + 22
| intrhand(2300) + 162
| lev1intr(?)
| hashinit(deadbeef,0,1,ea53db4,19998) + 20
| union_reinit(40ce74,ea53de4,ea53ed0,1dffe98c,ea53e48) + 18
| vfs_reinit(0,e93c5a0,ea53ed0,1000272,5) + 3e
| 
sysctl_kern_maxvnodes(ea53ed8,0,1dffec30,ea53f00,1dffec34,4,ea53ed0,e93c5a0,20c6
| 180) + aa
| 
sysctl_dispatch(ea53ed0,2,1dffec30,ea53f00,1dffec34,4,ea53ed0,e93c5a0,20c6180,1)
 + 90
| sys___sysctl(e93c5a0,ea53f30,ea53f70,5,0) + b8
| syscall_plain(ca,e93c5a0,ea53fb4,8,1dffec30) + 92
| syscall(ca) + 74
| trap0() + e

which corresponds to the following line in hashinit():

        for (hashsize = 1; hashsize < elements; hashsize <<= 1)
                        continue;

elements is u_int, hashsize is u_long. Both are 32bit values
on this platform. This loop is infinite when elements is a
value between 2^31 and 2^32.

hashinit() is called with the value desired_vnodes, which is a
signed integer. I.e. 0xDEADBEEF is here a negative value.

The negative value can occur because sysctl_kern_maxvnodes doesn't
sanity check the input from sysctl. An unsigned value is passed
down to the kernel and interpreted as signed.

The negative value is compared to the old (also signed) value,
since it is smaller, the function vfs_drainvnodes() is called.
This function would error out with EBUSY when you cannot reduce
the number of allocated vnodes to the desired target, which
would cause sysctl_kern_maxvnodes to restore the old target.

However, vfs_drainvnodes which uses a 'long' instead of 'int'
parameter as the target value compares this to numvnodes,
which is an unsigned int. The signed/unsigned comparision
promotes both sides to unsigned, so that the negative value
isn't interpreted correctly. vfs_drainvnodes returns without
an error.

sysctl_kern_maxvnodes then continues to call vfs_reinit() which
calls union_reinit() which calls hashinit() with the value
that ends in an infinite loop.


There is a second issue in hashinit(), it tries to allocate
hashsize * esize bytes which may overflow for different values
of 'elements'. This just happens to work without extra check
on a current kernel, but not older releases because vnodes
(if positive) are limited to 75% of KVA space.


>How-To-Repeat:
Run /usr/tests/sbin/sysctl/t_perm as root.
>Fix:
Sanity check the maxvnodes value to be >0 and <INT_MAX.

Align signed/unsigned datatypes correctly.

sysctl(1) -> mixed int/u_int
sys___sysctl -> malloc(sizeof(int))
sysctl_kern_maxvnodes -> int
desired_vnodes -> int
vfs_drainvnodes -> long
numvnodes -> u_int
hashinit -> u_int
hashsize -> u_long

Use ffs() to compute the power-of-two ceiling and
sanity check the result. Also test for the hashsize*esize
overflow. Both errors could be treated gracefully like
an out-of-memory condition unless waitok=1, then it
should panic.

>Unformatted:
        
        


Home | Main Index | Thread Index | Old Index