tech-kern archive


Re: rump_server, "panic: uvm_km_alloc failed" on module load



I did some more digging into why rump module loading does not work without
statically linked rump libraries.


Thread 1 "" hit Breakpoint 2, hello_modcmd (cmd=MODULE_CMD_INIT, arg=0x0 <__link_set_modules_sym_hello_modinfo>) at /home/nemanja/source/src/sys/modules/examples/hello/hello.c:50
50              switch (cmd) {
(gdb) disassemble
Dump of assembler code for function hello_modcmd:
   0x000000007ffff000 <+0>:     push   %rbp
   0x000000007ffff001 <+1>:     mov    %rsp,%rbp
   0x000000007ffff004 <+4>:     sub    $0x10,%rsp
   0x000000007ffff008 <+8>:     mov    %edi,-0x4(%rbp)
   0x000000007ffff00b <+11>:    mov    %rsi,-0x10(%rbp)
=> 0x000000007ffff00f <+15>:    cmpl   $0x2,-0x4(%rbp)
   0x000000007ffff013 <+19>:    je     0x7ffff056 <hello_modcmd+86>
   0x000000007ffff015 <+21>:    cmpl   $0x2,-0x4(%rbp)
   0x000000007ffff019 <+25>:    ja     0x7ffff04f <hello_modcmd+79>
   0x000000007ffff01b <+27>:    cmpl   $0x0,-0x4(%rbp)
   0x000000007ffff01f <+31>:    je     0x7ffff029 <hello_modcmd+41>
   0x000000007ffff021 <+33>:    cmpl   $0x1,-0x4(%rbp)
   0x000000007ffff025 <+37>:    je     0x7ffff03c <hello_modcmd+60>
   0x000000007ffff027 <+39>:    jmp    0x7ffff04f <hello_modcmd+79>
   0x000000007ffff029 <+41>:    mov    $0xffffffff80000040,%rdi
   0x000000007ffff030 <+48>:    mov    $0x0,%eax
   0x000000007ffff035 <+53>:    call   0xf7deaa32
   0x000000007ffff03a <+58>:    jmp    0x7ffff057 <hello_modcmd+87>
   0x000000007ffff03c <+60>:    mov    $0xffffffff80000058,%rdi
   0x000000007ffff043 <+67>:    mov    $0x0,%eax
   0x000000007ffff048 <+72>:    call   0xf7deaa32
   0x000000007ffff04d <+77>:    jmp    0x7ffff057 <hello_modcmd+87>
   0x000000007ffff04f <+79>:    mov    $0x19,%eax
   0x000000007ffff054 <+84>:    jmp    0x7ffff05c <hello_modcmd+92>
   0x000000007ffff056 <+86>:    nop
   0x000000007ffff057 <+87>:    mov    $0x0,%eax
   0x000000007ffff05c <+92>:    mov    %rbp,%rsp
   0x000000007ffff05f <+95>:    pop    %rbp
   0x000000007ffff060 <+96>:    ret
End of assembler dump.


Dump of assembler code for function printf:
   0x00007f7ff7deaa32 <+0>:     push   %rbp
   .....
   0x00007f7ff7deaa83 <+81>:    ret
End of assembler dump.


Here the module calls printf at 0x00000000-f7deaa32, but the actual address
of printf is 0x00007f7f-f7deaa32. The real address of printf is somewhere
just below VM_MAXUSER_ADDRESS (0x00007f8000000000 - PAGE_SIZE), as is the
rest of rump_server.
While the module was mmap-ed around VM_MAXUSER_ADDRESS, its segments got
relocated to the lowest 2GB by the rump implementation of uvm_km_alloc().
The reason for that is the use of 32-bit instructions inside the amd64
kernel, as is actually documented:

https://nxr.netbsd.org/xref/src/sys/rump/librump/rumpkern/vm.c#845


Now if I change the desired address from 2GB to VM_MAXUSER_ADDRESS, the
function calls get linked correctly:

Dump of assembler code for function hello_modcmd:
   0x00007f7ff769a000 <+0>:     push   %rbp
   0x00007f7ff769a001 <+1>:     mov    %rsp,%rbp
   0x00007f7ff769a004 <+4>:     sub    $0x10,%rsp
   0x00007f7ff769a008 <+8>:     mov    %edi,-0x4(%rbp)
   0x00007f7ff769a00b <+11>:    mov    %rsi,-0x10(%rbp)
=> 0x00007f7ff769a00f <+15>:    cmpl   $0x2,-0x4(%rbp)
   0x00007f7ff769a013 <+19>:    je     0x7f7ff769a056 <hello_modcmd+86>
   0x00007f7ff769a015 <+21>:    cmpl   $0x2,-0x4(%rbp)
   0x00007f7ff769a019 <+25>:    ja     0x7f7ff769a04f <hello_modcmd+79>
   0x00007f7ff769a01b <+27>:    cmpl   $0x0,-0x4(%rbp)
   0x00007f7ff769a01f <+31>:    je     0x7f7ff769a029 <hello_modcmd+41>
   0x00007f7ff769a021 <+33>:    cmpl   $0x1,-0x4(%rbp)
   0x00007f7ff769a025 <+37>:    je     0x7f7ff769a03c <hello_modcmd+60>
   0x00007f7ff769a027 <+39>:    jmp    0x7f7ff769a04f <hello_modcmd+79>
   0x00007f7ff769a029 <+41>:    mov    $0xfffffffff769b040,%rdi
   0x00007f7ff769a030 <+48>:    mov    $0x0,%eax
   0x00007f7ff769a035 <+53>:    call   0x7f7ff7deaa32 <printf>
   0x00007f7ff769a03a <+58>:    jmp    0x7f7ff769a057 <hello_modcmd+87>
   0x00007f7ff769a03c <+60>:    mov    $0xfffffffff769b058,%rdi
   0x00007f7ff769a043 <+67>:    mov    $0x0,%eax
   0x00007f7ff769a048 <+72>:    call   0x7f7ff7deaa32 <printf>
   0x00007f7ff769a04d <+77>:    jmp    0x7f7ff769a057 <hello_modcmd+87>
   0x00007f7ff769a04f <+79>:    mov    $0x19,%eax
   0x00007f7ff769a054 <+84>:    jmp    0x7f7ff769a05c <hello_modcmd+92>
   0x00007f7ff769a056 <+86>:    nop
   0x00007f7ff769a057 <+87>:    mov    $0x0,%eax
   0x00007f7ff769a05c <+92>:    mov    %rbp,%rsp
   0x00007f7ff769a05f <+95>:    pop    %rbp
   0x00007f7ff769a060 <+96>:    ret
End of assembler dump.

But the operands of the remaining 32-bit instructions flip from
0x00000000-xxxxxxxx to 0xffffffff-xxxxxxxx, instead of 0x00007f7f-xxxxxxxx.

In order for rump_server module loading to work as documented,
rump_server itself would need to be loaded in the lowest 2GB on amd64,
as I presume it was around the time that comment was made (2010). I
don't know how to do that.


As for loading modules on arm64, the first error message is from
rump_generic_kobj.c:kobj_reloc(). After I copied the module loading
support from arm32, the situation is similar to amd64: a module
without any calls from _modcmd loads (e.g. des), but calls from the
module into the rump kernel fail with a segmentation fault.



On 8/4/25 19:58, Nemanja Simonovic wrote:
I've managed to make rump_server -m work on amd64. In general
rump_server needs the same statically linked libraries as t_modautoload.

Complete Makefile which produces a working rump_server:

--------------------
#    $NetBSD: Makefile,v 1.18 2024/04/20 13:24:49 rillig Exp $
#

NOFULLRELRO=    yes

.PATH: ${.CURDIR}/../rump_allserver

PROG=        rump_server
SRCS=        rump_allserver.c
NOMAN=        installed by ../rump_allserver
NOLINT=        # LDADD contains -Wl,...

# Needs executable and writable mmap() when loading modules
PAXCTL_FLAGS=    +ma


# We need to be able to find our own statically linked symbols when linking
# in modules
LDFLAGS+=    -Wl,-E


# Note: we link the rump kernel into the application to make this work
# on amd64.

# If librump is not linked in then calls to in kernel functions in rump
# space don't get resolved properly

# All of these have to be static, otherwise linker gets really angry
# starting with can not find libgcc_s.a

LDADD+= \
     -Wl,--whole-archive \
     -Wl,-Bstatic \
     -lrump_g \
     -lrumpvfs_g \
     -lrumpvfs_nofifofs_g \
     -lrumpkern_sysproxy_g \
     -Wl,-Bdynamic -Wl,--no-whole-archive

LDADD+=     -lpthread  -lrumpuser

.if ${RUMP_SANITIZE:Uno} != "no"
LDADD+=    -fsanitize=${RUMP_SANITIZE}
.endif

.include <bsd.prog.mk>
-----------------------


This breaks loading of libraries which are already statically linked:
./rump_server -s -v -lrumpvfs -m/stand/amd64/10.99.15/modules/tmpfs/tmpfs.kmod   unix://sock

Or the example from the man page:
$ rump_server -lrumpvfs -m /modules/tmpfs.kmod unix://sock

This works as expected:
./rump_server -s -v -lrumpnet -m/stand/amd64/10.99.15/modules/tmpfs/tmpfs.kmod   unix://sock







On 8/1/25 14:08, Christoph Badura wrote:

I haven't done this before, so no idea if this helps.

gdb has an add-symbol-file command.  And
$OBJDIR/sys/modules/hello/hello/kmod.debug should have symbols.  Note that
that file probably is not installed to /stand.

add-symbol-file needs the address where the .text section was loaded.  I
don't know how to get that.

However, hello.c has only calls to printf().  Perhaps the references to that function are not correctly resolved?  Or maybe they are not resolved to the
correct function.  This should be calling the kernel printf which might
have been renamed to rumpns_printf?  But maybe it doesn't make a
difference.


Thanks to a tip from Matthew I was able to load debug symbols for modules
in gdb:

Set a breakpoint after the module is loaded but before the call to module
init (kern_module.c:1420), then run:

set $ko = mod->mod_kobj
set $n = mod->mod_info.mi_name
eval "add-symbol-file %s/%s/%s.kmod -s .text 0x%lx -s .data 0x%lx -s .rodata 0x%lx\n", module_base, $n, $n, $ko->ko_text_address, $ko->ko_data_address, $ko->ko_rodata_address


With that it is possible to set breakpoints inside the module. Inspecting
function addresses for t_modautoload and for rump_server without static
libraries, I get these addresses:


t_modautoload
(gdb) print vfs_attach
$3 = {int (struct vfsops *)} 0x9aa57 <vfs_attach>

(gdb) print printf
$4 = {void (const char *, ...)} 0x107f8e <printf>


rump_server
(gdb) print vfs_attach
$3 = {int (struct vfsops *)} 0x7f7ff7d1885a <vfs_attach>

(gdb) print printf
$1 = {void (const char *, ...)} 0x7f7ff7deca42 <printf>


Module is loaded at 0x7f7ff777fd00.

(gdb) info sharedlibrary
From                To                  Syms Read   Shared Object Library
0x00007f7ff7eed000  0x00007f7ff7ef84f7  Yes         /usr/libexec/ld.elf_so
0x00007f7ff7edf150  0x00007f7ff7edf930  Yes         /usr/lib/librumpkern_sysproxy.so.0
0x00007f7ff7d90320  0x00007f7ff7e72d00  Yes         /usr/lib/librump.so.0
0x00007f7ff7d4c040  0x00007f7ff7d4c0fc  Yes         /usr/lib/librumpvfs_nofifofs.so.0
0x00007f7ff7cc6d00  0x00007f7ff7d28edb  Yes         /usr/lib/librumpvfs.so.0
0x00007f7ff7c8b830  0x00007f7ff7c91633  Yes         /usr/lib/librumpuser.so.0
0x00007f7ff7c77c80  0x00007f7ff7c7eb6b  Yes         /usr/lib/libpthread.so.1
0x00007f7ff7872550  0x00007f7ff79eab5d  Yes         /usr/lib/libc.so.12

And address of printf is correct:

(gdb) disassemble 0x7f7ff7deca42
Dump of assembler code for function printf:
    0x00007f7ff7deca42 <+0>:     push   %rbp
    0x00007f7ff7deca43 <+1>:     mov    %rsp,%rbp
    0x00007f7ff7deca46 <+4>:     sub    $0x60,%rsp
    0x00007f7ff7deca4a <+8>:     mov    %rdi,-0x58(%rbp)
    0x00007f7ff7deca4e <+12>:    mov    %rsi,-0x28(%rbp)
    0x00007f7ff7deca52 <+16>:    mov    %rdx,-0x20(%rbp)
    0x00007f7ff7deca56 <+20>:    mov    %rcx,-0x18(%rbp)
    0x00007f7ff7deca5a <+24>:    mov    %r8,-0x10(%rbp)
    0x00007f7ff7deca5e <+28>:    mov    %r9,-0x8(%rbp)
    0x00007f7ff7deca62 <+32>:    movl   $0x8,-0x48(%rbp)
    0x00007f7ff7deca69 <+39>:    lea    0x10(%rbp),%rax
    0x00007f7ff7deca6d <+43>:    mov    %rax,-0x40(%rbp)
    0x00007f7ff7deca71 <+47>:    lea    -0x30(%rbp),%rax
    0x00007f7ff7deca75 <+51>:    mov    %rax,-0x38(%rbp)
    0x00007f7ff7deca79 <+55>:    lea    -0x48(%rbp),%rdx
    0x00007f7ff7deca7d <+59>:    mov    -0x58(%rbp),%rax
    0x00007f7ff7deca81 <+63>:    mov    %rax,%rsi
    0x00007f7ff7deca84 <+66>:    mov    $0x5,%edi
   0x00007f7ff7deca89 <+71>:    call   0x7f7ff7d8d0b0 <rumpns_vprintf_flags@plt>
    0x00007f7ff7deca8e <+76>:    nop
    0x00007f7ff7deca8f <+77>:    mov    %rbp,%rsp
    0x00007f7ff7deca92 <+80>:    pop    %rbp
    0x00007f7ff7deca93 <+81>:    ret
End of assembler dump.


But address of rumpns_vprintf_flags doesn't look right:

(gdb) disassemble 0x7f7ff7d8d0b0
Dump of assembler code for function rumpns_vprintf_flags@plt:
   0x00007f7ff7d8d0b0 <+0>:     jmp    *0x12b792(%rip)        # 0x7f7ff7eb8848 <rumpns_vprintf_flags%got.plt@localhost>
    0x00007f7ff7d8d0b6 <+6>:     push   $0x109
    0x00007f7ff7d8d0bb <+11>:    jmp    0x7f7ff7d8c010
End of assembler dump.


In fact gdb thinks there are quite a few 16-byte functions there:

(gdb) x/64x 0x7f7ff7d8d0b0
0x7f7ff7d8d0b0 <rumpns_vprintf_flags@plt>:      0xff    0x25    0x92    0xb7    0x12    0x00    0x68    0x09
0x7f7ff7d8d0b8 <rumpns_vprintf_flags@plt+8>:    0x01    0x00    0x00    0xe9    0x50    0xef    0xff    0xff
0x7f7ff7d8d0c0 <rumpns_module_init@plt>:        0xff    0x25    0x8a    0xb7    0x12    0x00    0x68    0x0a
0x7f7ff7d8d0c8 <rumpns_module_init@plt+8>:      0x01    0x00    0x00    0xe9    0x40    0xef    0xff    0xff
0x7f7ff7d8d0d0 <rumpns_config_match@plt>:       0xff    0x25    0x82    0xb7    0x12    0x00    0x68    0x0b
0x7f7ff7d8d0d8 <rumpns_config_match@plt+8>:     0x01    0x00    0x00    0xe9    0x30    0xef    0xff    0xff
0x7f7ff7d8d0e0 <rump_thread_init@plt>:  0xff    0x25    0x7a    0xb7    0x12    0x00    0x68    0x0c
0x7f7ff7d8d0e8 <rump_thread_init@plt+8>:        0x01    0x00    0x00    0xe9    0x20    0xef    0xff    0xff




This looks better.  I was wondering why "/module.mod"?  And discovered
this:
https://nxr.netbsd.org/xref/src/usr.bin/rump_allserver/rump_allserver.c#411

I guess that should be "modarray[i]", otherwise with multiple "-m" it loads
the first module multiple times, doesn't it?

Right, it should be "modarray[i]". As it is now, it tries to load the
first module multiple times. Tested with 2 modules, loads ok.

Both modarray[0] and "define ETFSKEY /module.mod" look to me like this
was work in progress.



Seeing now that static linking is needed for rump module loading to
work on amd64, I tried testing rump_server -m on arm64, and it has another
problem:

$ rump_server -s -v -lrumpvfs -lrumpvfs_nofifofs -lrump -m /stand/evbarm/10.99.14/modules/kernfs/kernfs.kmod   unix://sock
[   1.0000050] root file system type: rumpfs
[   1.0000050] kern.module.path=/stand/evbarm/10.99.14/modules
[   1.0200050] warning: kernel ABI not supported on this arch
[   1.0200050] panic: kobj_reloc: not supported on this architecture
[   1.0200050] rump kernel halting...


   Nemanja





