Subject: dlopen() just kills me..
To: None <tech-kern@NetBSD.org>
From: Ian Zagorskih <ianzag@megasignal.com>
List: tech-kern
Date: 02/13/2004 22:23:49
Environment: NetBSD-1.6.1-stable/i386

I have stumbled upon what looks to me like a rather weird "feature" of 
dlopen(). I have a process that needs to dynamically load a shared library 
(plugin). Everything works fine as long as all of the library's symbols are 
resolvable. When the library has unresolved symbols, I see two behaviours:

1. Unresolved data references. dlopen() returns an error, and dlerror() in turn 
returns something like:

"(/usr/local/share/modules/libconsole.so.1.0: Undefined symbol "devlnk" (reloc 
type = 6, symnum = 33))"

This is great and exactly what I expect to see: if dlopen() cannot resolve 
everything in the loaded library, it simply returns an error.

2. Unresolved code references. dlopen() doesn't return and my process simply 
terminates with exit status 1. It also leaves a message in syslog like:

"Feb 13 22:00:34 ianzag /usr/local/share/modules/libconsole.so.1.0: Undefined 
PLT symbol "console_open" (reloc type = 7, symnum = 28)"
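
For context, a minimal sketch of the kind of load sequence in question. This 
is not the actual dispd code: the helper name load_plugin() and the RTLD_LAZY 
flag are assumptions made for illustration only.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stdio.h>

/*
 * Hypothetical helper (not from dispd): try to load a plugin and report
 * the dlerror() text on failure. Returns the handle, or NULL on error.
 */
static void *load_plugin(const char *path)
{
    void *handle;

    assert(path != NULL);

    /* Assumption: lazy binding; the real caller may pass another mode. */
    handle = dlopen(path, RTLD_LAZY);
    if (handle == NULL) {
        /* dlerror() describes the most recent dl*() failure, if any. */
        const char *msg = dlerror();
        fprintf(stderr, "cannot load %s: %s\n",
            path, msg != NULL ? msg : "unknown error");
    }
    return handle;
}
```

In case 1 a caller like this gets NULL back and can log the dlerror() text. In 
case 2 control never comes back from dlopen(), so the NULL check is never 
reached.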

ktrace/kdump gave me:

---cut---
 16098 dispd    CALL  open(0x4805b100,0,0x2b)
 16098 dispd    NAMI  "/usr/local/share/modules/libconsole.so.1.0"
 16098 dispd    RET   open 6
 16098 dispd    CALL  __fstat13(0x6,0xbfbfd404)
 16098 dispd    RET   __fstat13 0
 16098 dispd    CALL  read(0x6,0xbfbfc3d4,0x1000)
 16098 dispd    GIO   fd 6 read 4088 bytes
 16098 dispd    GIO   fd 6 read 8 bytes
 [lot of lib's data snipped]
 16098 dispd    RET   read 4096/0x1000
 16098 dispd    CALL  mmap(0,0x2000,0x5,0x2,0x6,0,0,0)
 16098 dispd    RET   mmap 1209024512/0x48104000
 16098 dispd    CALL  mmap(0x48105000,0x1000,0x3,0x12,0x6,0,0,0)
 16098 dispd    RET   mmap 1209028608/0x48105000
 16098 dispd    CALL  mmap(0x48106000,0,0x3,0x1012,0xffffffff,0,0,0)
 16098 dispd    RET   mmap 1209032704/0x48106000
 16098 dispd    CALL  close(0x6)
 16098 dispd    RET   close 0
 16098 dispd    CALL  write(0x2,0xbfbfd2c8,0x6e)
 16098 dispd    GIO   fd 2 wrote 110 bytes
       "/usr/local/share/modules/libconsole.so.1.0: Undefined PLT symbol 
"console_open" (reloc type = 7, sym num = 28)  "
 16098 dispd    RET   write 110/0x6e
 16098 dispd    CALL  exit(0x1)
---cut---

...so AFAIU this isn't a bug but a feature.

The question is: why? What is so different between unresolved code references 
and unresolved data references? Is it done deliberately or is it "just so"? :)

Anyway, in my case the process has to load different plugins, probably written 
by different people, so this behaviour is an obvious hole: it is enough for a 
bad guy to leave one unresolvable code reference in a plugin and my server 
commits suicide.
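
One possible stopgap, sketched under the assumption that a trial load in a 
throwaway child process is acceptable: fork(), call dlopen() in the child, and 
let the parent look at the child's exit status. The helper name 
probe_plugin(), the RTLD_NOW flag and the exit code 2 are all hypothetical. It 
only catches failures that happen inside dlopen() itself.

```c
#include <assert.h>
#include <dlfcn.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if a child process survived
 * dlopen(path) and got a non-NULL handle, 0 in every other case.
 */
static int probe_plugin(const char *path)
{
    pid_t pid;
    int status;

    assert(path != NULL);

    pid = fork();
    if (pid == -1)
        return 0;               /* cannot probe: treat as not loadable */

    if (pid == 0) {
        /* Child: if the runtime linker exits here, only the child dies. */
        /* Assumption: RTLD_NOW, to request binding of everything up front. */
        void *handle = dlopen(path, RTLD_NOW);
        _exit(handle != NULL ? 0 : 2);
    }

    /* Parent: wait for the child, retrying if interrupted by a signal. */
    while (waitpid(pid, &status, 0) == -1) {
        if (errno != EINTR)
            return 0;
    }
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

The server would call probe_plugin() first and only dlopen() the plugin itself 
when the probe returns 1. Note that the plugin's initialisation code still runs 
in the child, and the plugin file could change between the probe and the real 
load.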

Any ideas?

// wbr