NetBSD-Bugs archive
Re: port-evbmips/59236 (Multiple segfaults in erlite3 boot)
On 2025/04/19 4:04, riastradh%NetBSD.org@localhost wrote:
Synopsis: Multiple segfaults in erlite3 boot
State-Changed-From-To: open->feedback
State-Changed-By: riastradh%NetBSD.org@localhost
State-Changed-When: Fri, 18 Apr 2025 19:04:59 +0000
State-Changed-Why:
This is probably the same CN50xx bug that we have been puzzling
over in PR port-mips/59064: jemalloc switch to 5.3 broke userland
<https://gnats.NetBSD.org/59064>.
Can you try the patch at the bottom of this message?
https://mail-index.NetBSD.org/netbsd-bugs/2025/04/14/msg088307.html
Thank you very much for working on this problem!
Unfortunately, even with your patch, erlite3 cannot boot into
multiuser mode with either the n64 or the n32 userland:
https://gist.github.com/rokuyama/7bbe1619e55e8e3aba5bf3b112a23725
On the other hand, MIPSSIM64 kernel on QEMU successfully boots into
multiuser mode.
In the above-mentioned log, debug printf is enabled for trap():
```
diff --git a/sys/arch/mips/mips/trap.c b/sys/arch/mips/mips/trap.c
index 58caf19e2d2..a079dec91dd 100644
--- a/sys/arch/mips/mips/trap.c
+++ b/sys/arch/mips/mips/trap.c
@@ -448,8 +448,8 @@ trap(uint32_t status, uint32_t cause, vaddr_t vaddr, vaddr_t pc,
rv = uvm_fault(map, va, ftype);
pcb->pcb_onfault = onfault;
-#if defined(VMFAULT_TRACE)
- if (!KERNLAND_P(va))
+#if defined(VMFAULT_TRACE) || 1
+ if (!KERNLAND_P(va) && rv != 0)
printf(
"uvm_fault(%p (pmap %p), %#"PRIxVADDR
" (%"PRIxVADDR"), %d) -> %d at pc %#"PRIxVADDR"\n",
```
You can see SEGVs are caused by read access to NULL:
```
[ 13.3599689] uvm_fault(0x980000041f9c0c00 (pmap 0x980000041fce44d0), 0 (0), 1) -> 14 at pc 0xfff83b1db4
[1] Segmentation fault (core dumped) /sbin/ifconfig lo0 inet6 >/dev/null 2>&1
...
[ 19.5399661] uvm_fault(0x980000041f20c800 (pmap 0x980000041fce44d0), 0 (0), 1) -> 14 at pc 0xfff8391db4
[1] Segmentation fault (core dumped) awk "/^sendmail[ \t]/{print\$2}" /etc/mailer.conf
```
As you pointed out earlier, SEGVs can be avoided by replacing
`user_reserved_insn` with `user_gen_exception`, i.e.:
https://gist.github.com/rokuyama/c7a50b8e7a62dc25f3f536f1434eea9b
Grepping through the Linux source, I found that Linux checks for a
TLB entry covering the PC before fetching the instruction:
https://github.com/torvalds/linux/commit/5b10496b6e65#diff-bbe4c1a54ce7bd13e6109d887383993c3b5276a1362f84092e9ef31dc84064d9R390
and our `user_gen_exception` path uses copyin(9), of course.
I know almost nothing about mips, though, and this "double-fault
scenario" may well lead to much more destructive results...
Thanks,
rin
If you open one of the core dumps in gdb (if you are able to do that
from another machine where everything isn't segfaulting all the time,
e.g. if the core dump is written to nfs) and do `x/i $pc' and `bt', I
bet you will find it in malloc_default (via some stack trace through
jemalloc) at this instruction:
00008a58 <malloc_default>:
malloc_default():
/home/riastradh/netbsd/current/src/external/bsd/jemalloc/lib/../dist/src/jemalloc.c:2727
8a58: 27bdff70 addiu sp,sp,-144
8a5c: ffbc0078 sd gp,120(sp)
8a60: 3c1c0000 lui gp,0x0
8a60: R_MIPS_GPREL16 malloc_default
8a60: R_MIPS_SUB *ABS*
8a60: R_MIPS_HI16 *ABS*
8a64: 0399e021 addu gp,gp,t9
8a68: 279c0000 addiu gp,gp,0
8a68: R_MIPS_GPREL16 malloc_default
8a68: R_MIPS_SUB *ABS*
8a68: R_MIPS_LO16 *ABS*
tsd_fetch_impl():
/home/riastradh/netbsd/current/src/external/bsd/jemalloc/lib/../include/jemalloc/internal/tsd.h:270
8a6c: 8f820000 lw v0,0(gp)
8a6c: R_MIPS_TLS_GOTTPREL je_tsd_tls
8a70: 7c03e83b 0x7c03e83b
malloc_default():
/home/riastradh/netbsd/current/src/external/bsd/jemalloc/lib/../dist/src/jemalloc.c:2727
8a74: ffb10040 sd s1,64(sp)
8a78: ffb00038 sd s0,56(sp)
tsd_fetch_impl():
/home/riastradh/netbsd/current/src/external/bsd/jemalloc/lib/../include/jemalloc/internal/tsd.h:270
8a7c: 00433021 addu a2,v0,v1
malloc_default():
/home/riastradh/netbsd/current/src/external/bsd/jemalloc/lib/../dist/src/jemalloc.c:2727
8a80: ffbf0088 sd ra,136(sp)
8a84: ffbe0080 sd s8,128(sp)
8a88: ffb70070 sd s7,112(sp)
8a8c: ffb60068 sd s6,104(sp)
8a90: ffb50060 sd s5,96(sp)
8a94: ffb40058 sd s4,88(sp)
8a98: ffb30050 sd s3,80(sp)
8a9c: ffb20048 sd s2,72(sp)
tsd_fetch_impl():
/home/riastradh/netbsd/current/src/external/bsd/jemalloc/lib/../include/jemalloc/internal/tsd.h:422
=> 8aa0: 90c30258 lbu v1,600(a2)
And I bet you will find that $v0 holds the address malloc_default+0x18,
i.e., the pc of this instruction:
tsd_fetch_impl():
/home/riastradh/netbsd/current/src/external/bsd/jemalloc/lib/../include/jemalloc/internal/tsd.h:270
8a6c: 8f820000 lw v0,0(gp)
8a6c: R_MIPS_TLS_GOTTPREL je_tsd_tls
=> 8a70: 7c03e83b 0x7c03e83b
The instruction 0x7c03e83b is sometimes also written
rdhwr $3,$29
or
rdhwr v1,ulr
but it is architecturally undefined so it traps to the kernel to
emulate, and the kernel is supposed to return the thread's tcb pointer
in v1.
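
To make the encoding concrete, here is a minimal sketch in C that splits
the word 0x7c03e83b into its fields. The struct, macro, and function
names are made up for illustration and are not taken from the NetBSD
sources; the field positions assume the usual SPECIAL3 layout for rdhwr,
and bits 10..6 are ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Bit fields of a MIPS SPECIAL3-format instruction word. */
struct special3_fields {
	unsigned opcode;	/* bits 31..26 */
	unsigned rs;		/* bits 25..21 */
	unsigned rt;		/* bits 20..16: destination GPR for rdhwr */
	unsigned rd;		/* bits 15..11: hardware register number */
	unsigned funct;		/* bits 5..0 */
};

#define OP_SPECIAL3	0x1fu	/* major opcode used by rdhwr */
#define FUNCT_RDHWR	0x3bu	/* function code selecting rdhwr */
#define HWR_ULR		29u	/* hardware register 29 (UserLocal) */

/* Split a 32-bit instruction word into its SPECIAL3 fields. */
static struct special3_fields
decode_special3(uint32_t insn)
{
	struct special3_fields f;

	f.opcode = (insn >> 26) & 0x3f;
	f.rs = (insn >> 21) & 0x1f;
	f.rt = (insn >> 16) & 0x1f;
	f.rd = (insn >> 11) & 0x1f;
	f.funct = insn & 0x3f;
	return f;
}

/* Nonzero if insn is an rdhwr of hardware register 29 (bits 10..6 ignored). */
static int
is_rdhwr_ulr(uint32_t insn)
{
	struct special3_fields f = decode_special3(insn);

	return f.opcode == OP_SPECIAL3 && f.rs == 0 &&
	    f.funct == FUNCT_RDHWR && f.rd == HWR_ULR;
}

/* Format insn as "rdhwr $rt,$rd" into buf; returns snprintf's result. */
static int
format_rdhwr(uint32_t insn, char *buf, size_t len)
{
	struct special3_fields f = decode_special3(insn);

	assert(buf != NULL);
	return snprintf(buf, len, "rdhwr $%u,$%u", f.rt, f.rd);
}
```

Decoding 0x7c03e83b this way gives rt = 3 (v1) and rd = 29, matching the
two spellings above, whereas the preceding lw (0x8f820000) has a
different major opcode and is not matched.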
But as a side effect, the emulation clobbers the register v0 with the
address of the excepting instruction, rather than leaving it as the
value it found at -1234(gp) (or whatever; written as 0(gp) above, but
the linker will replace it by some probably-nonzero number; you can use
`objdump --disassemble=malloc_default libc.so' to find it), which is
decidedly not the instruction address malloc_default+0x18 but rather
some tls offset that is reasonable to add to the tcb pointer.
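
As a rough illustration of why the clobbered v0 matters, the following
C sketch models the `addu a2,v0,v1' and `lbu v1,600(a2)' pair from the
listing. The function names are invented for this example, and treating
pointers as 32 bits wide is an assumption based on the lw/addu in the
disassembly; any concrete register values you feed it are hypothetical,
not taken from a real core dump.

```c
#include <assert.h>
#include <stdint.h>

#define TSD_STATE_OFFSET 600u	/* displacement in "lbu v1,600(a2)" */

/*
 * Model of "addu a2,v0,v1" followed by "lbu v1,600(a2)": returns the
 * address the byte load would use.  Pointers are modeled as 32 bits
 * wide (an assumption), so the sum wraps modulo 2^32 like addu.
 */
static uint32_t
tsd_load_addr(uint32_t v0, uint32_t v1)
{
	uint32_t a2 = v0 + v1;

	return a2 + TSD_STATE_OFFSET;
}

/*
 * How far the load address moves when v0 holds clobbered_v0 (e.g. the
 * pc of the rdhwr) instead of expected_v0 (the tls offset), for a
 * given tcb pointer in v1.  Callers pass two different v0 values.
 */
static uint32_t
load_addr_error(uint32_t expected_v0, uint32_t clobbered_v0, uint32_t tcb)
{
	assert(expected_v0 != clobbered_v0);
	return tsd_load_addr(clobbered_v0, tcb) -
	    tsd_load_addr(expected_v0, tcb);
}
```

The error is simply the difference between the two v0 values, whatever
the tcb pointer is, so the load lands an instruction-address-sized
distance away from the intended tsd byte.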
Now, the emulation routine
https://nxr.netbsd.org/xref/src/sys/arch/mips/mips/mipsX_subr.S?r=1.115#1297
is not _supposed_ to clobber v0 -- it goes out of its way to save v0 on
the kernel stack and restore it before returning from the exception:
1312 /* Need two working registers */
1313 REG_S AT, CALLFRAME_SIZ+TF_REG_AST(k0)
1314 REG_S v0, CALLFRAME_SIZ+TF_REG_V0(k0)
...
1349 REG_L AT, CALLFRAME_SIZ+TF_REG_AST(k0)# restore reg
1350 REG_L v0, CALLFRAME_SIZ+TF_REG_V0(k0) # restore reg
1351 eret
But, in all my trials, it has been consistently corrupted in the same
way. The best theory we have for why it is corrupted is that cn50xx
CPUs -- found in erlite3 (but not er4) -- have some kind of
register-writeback bug (which passes through some register renaming
unchanged) provoked by the particular combination of reading
MIPS_COP_0_EXC_PC and eret, so that after the eret, the exception pc
gets written back to v0 even though we just restored v0 from the kernel
stack.
So, all that said, here is a summary of the science we did on my
erlite3, together with a patch that seems to address the issue and --
under the theory that the corrupted register is the one we move
MIPS_COP_0_EXC_PC into -- will only corrupt the temporary register k0,
which is not accessible to userland and is treated as garbage at every
kernel entry point, so it's safe:
https://mail-index.NetBSD.org/netbsd-bugs/2025/04/14/msg088307.html