NetBSD-Bugs archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: port-m68k/55990: kernel stack leak in m68k cpu_setmcontext() and reenter_syscall()



The following reply was made to PR port-m68k/55990; it has been noted by GNATS.

From: Izumi Tsutsui <tsutsui%ceres.dti.ne.jp@localhost>
To: gnats-bugs%netbsd.org@localhost
Cc: tsutsui%ceres.dti.ne.jp@localhost
Subject: Re: port-m68k/55990: kernel stack leak in m68k cpu_setmcontext() and
	 reenter_syscall()
Date: Sun, 21 Feb 2021 21:39:12 +0900

 > cpu_setmcontext: stack (&sz) = 0x39c2bec restore frame (format=11, sz=84) ->reenter_syscall
 > cpu_setmcontext: stack (&sz) = 0x39c2b98 restore frame (format=11, sz=84) ->return
 > cpu_setmcontext: stack (&sz) = 0x39c2b98 restore frame (format=11, sz=84) ->reenter_syscall
  :
 > >Fix:
 > It looks m68k/reenter_syscall.s adjusts stack pointer to prepare
 > "moved stack frame by stkadj bytes" but doesn't restore %sp
 > after syscall() is returned?
 > (I'm not sure how reenter_syscall() was designed though)
 
 After misc observations, these stack leaks seem caused by:
 1) heavy setcontext(2) calls from pthread applications (i.e. Xorg server)
 2) heavy address errors (i.e. page faults) on lower RAM (<24MB) environment
 
 Per various outputs on ddb(4) with patched NetBSD/x68k 9.1 GENERIC on XM6i,
 the following sequence are observed in setcontext(2):
 - setcontext(2) system call is invoked via trap0()
 - trap0() -> syscall() -> syscall_plain() -> sys_setcontext()
   -> setucontext() -> cpu_setmcontext() is called
 - in cpu_setmcontext() a frame format type to be restored to the frame
   (mcp->__mc_pad.__mc_frame.__mcf_format) is FMTB (triggered by the
   prior address error?), so it jumps to reenter_syscall() to allocate
   extra stack space for FMTB (84 bytes)
    https://nxr.netbsd.org/xref/src/sys/arch/m68k/m68k/sig_machdep.c?r=1.50#297
 - reenter_syscall() adjust whole stack frame to store FMTB and calls
   syscall() again with updated stack frame
    https://nxr.netbsd.org/xref/src/sys/arch/m68k/m68k/reenter_syscall.s?r=1.6#34
 - cpu_setmcontext() is called again, and it restores FMTB frame to
   the updated stack frame and resets frame->f_stackadj to zero
    https://nxr.netbsd.org/xref/src/sys/arch/m68k/m68k/sig_machdep.c?r=1.50#316
 - cpu_setmcontext() just returns with frame->f_stackadj=0
 - after cpu_setmcontext() (a syscall entry function) returns,
   machine_userret() (in MD trap.c) is called
    https://nxr.netbsd.org/xref/src/sys/arch/x68k/x68k/trap.c?r=1.108#241
    https://nxr.netbsd.org/xref/src/sys/arch/x68k/x68k/trap.c?r=1.108#174
 - machine_userret() -> lwp_userret() -> postsig() -> sendsig_siginfo()
    -> cpu_getmcontext() is called
    https://nxr.netbsd.org/xref/src/sys/sys/userret.h?r=1.28#95
    https://nxr.netbsd.org/xref/src/sys/arch/m68k/m68k/sig_machdep.c?r=1.50#175
 - m68k cpu_getmcontext() adjusts lwp's frame->f_stackadj and frame->f_format 
    https://nxr.netbsd.org/xref/src/sys/arch/m68k/m68k/sig_machdep.c?r=1.50#232
   -> the kernel stack needs to be adjusted before returning userland?
 - syscall() finally returns to reenter_syscall() with the updated
   frame (FMT0), but reenter_syscall() doesn't check frame->f_stackadj
   and then the kernel stack is left as storing FMTB frame, so
   the extra 84 bytes allocated for FMTB is leaked?
 
 I have not tracked which function actually updates SR and PC in the FMT0
 frame on returning syscall(), but the following patch that adjusts stack
 per frame->f_stackadj (as existing faultstkadj() in m68k/trap_subr.s does)
 seems to fix this leak.
 
 With this patch, the Xorg based servers both on sun3 and x68k survive
 over 24 hours without kernel crashes.
 (Note the similar x68k crashes were observed at least back in 2012.)
 
 I wonder if we should also check frame->f_stackadj in trap0()
 but I would like to commit this fix (workaround?) for now.
 
 Any comments (especially from m68k and siginfo gurus; kleink@? thorpej@?)
 are appreciated.
 
 ---
 
 Index: reenter_syscall.s
 ===================================================================
 RCS file: /cvsroot/src/sys/arch/m68k/m68k/reenter_syscall.s,v
 retrieving revision 1.6
 diff -u -p -d -r1.6 reenter_syscall.s
 --- reenter_syscall.s	21 Feb 2021 07:23:41 -0000	1.6
 +++ reenter_syscall.s	21 Feb 2021 11:41:56 -0000
 @@ -51,6 +51,19 @@ ENTRY_NOPROFILE(reenter_syscall)
  #endif
  	moveal	FR_SP(%sp),%a0		| grab and restore
  	movel	%a0,%usp		|   user SP
 +	movw	FR_ADJ(%sp),%d0		| need to adjust stack?
 +	jne	.Ladjstk		| yes, go to it
  	moveml	(%sp)+,#0x7FFF		| restore user registers
  	addql	#8,%sp			| pop SP and stack adjust
  	jra	_ASM_LABEL(rei)		| rte
 +.Ladjstk:
 +	lea	FR_HW(%sp),%a1		| pointer to HW frame
 +	addql	#8,%a1			| source pointer
 +	movl	%a1,%a0			| source
 +	addw	%d0,%a0			|  + hole size = dest pointer
 +	movl	-(%a1),-(%a0)		| copy
 +	movl	-(%a1),-(%a0)		|  8 bytes
 +	movl	%a0,FR_SP(%sp)		| new SSP
 +	moveml	(%sp)+,#0x7FFF		| restore user register
 +	movl	(%sp),%sp		| and do real RTE
 +	jra	_ASM_LABEL(rei)		| rte
 
 ---
 Izumi Tsutsui
 


Home | Main Index | Thread Index | Old Index