tech-userlevel archive


Auxiliary header and macros for sanitizers in userland programs



We've hit a problem sanitizing part of the NetBSD userland: in a few
narrow cases that sanitizers cannot reason about on their own, we need
helper functions to make sanitization possible.

The current problem is the usage of callback functions defined in
programs and executed from the internals of libc.

This is true for sorting functions where we can specify a comparison
function, e.g. in qsort(3):

     void
     qsort(void *base, size_t nmemb, size_t size,
         int (*compar)(const void *, const void *));

The same scenario applies to heapsort(3) and mergesort(3), and to
functions that use them or take a similar callback:
 - fts_open(3)
 - alphasort(3)
 - scandir(3)
 - tdelete(3) twalk(3) tfind(3) tsearch(3)
 - bsearch(3)

Once a callback function is executed from the internals of libc, a
sanitized program does not know whether the arguments passed to it are
properly initialized.
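
To make the scenario concrete, here is a minimal sketch of the pattern
in question. The names record, record_cmp and sort_records are
hypothetical and not taken from any NetBSD program. The comparison
function lives in the program, but it is only ever called from inside
qsort(3), so with an uninstrumented libc the sanitizer has no record of
how the pointed-to memory was set up.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record type, used only for illustration. */
struct record {
	char name[16];
	int id;
};

/*
 * Comparison callback defined in the program. The program never calls
 * it directly; qsort(3) calls it from inside libc.
 */
static int
record_cmp(const void *a, const void *b)
{
	const struct record *ra = a;
	const struct record *rb = b;

	/* A sanitizer report would point at a read like this one. */
	return strcmp(ra->name, rb->name);
}

/* Sort an array of records by name. */
static void
sort_records(struct record *v, size_t n)
{
	assert(v != NULL || n == 0);
	qsort(v, n, sizeof(*v), record_cmp);
}
```

Calling sort_records() on a fully initialized array is correct C; the
difficulty is only that the sanitizer cannot see that from within the
callback.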

Two examples of the resulting MemorySanitizer reports:

    # modstat
    Uninitialized bytes in int __interceptor_strcmp(const char *, const char *) at offset 0 inside [0x731000000000, 1)
    ==19613==WARNING: MemorySanitizer: use-of-uninitialized-value
        #0 0x21dc89 in modstatcmp /public-dyn/src/sbin/modstat/main.c:246:9
        #1 0x7f7ff692d581 in qsort (/lib/libc.so.12+0x12d581)
        #2 0x21ca82 in main /public-dyn/src/sbin/modstat/main.c:181:2
        #3 0x21b341 in ___start (/sbin//modstat+0x1b341)
    SUMMARY: MemorySanitizer: use-of-uninitialized-value /public-dyn/src/sbin/modstat/main.c:246:9 in modstatcmp
    Exiting

    # ls
    ==11267==WARNING: MemorySanitizer: use-of-uninitialized-value
        #0 0x222750 in mastercmp /public-dyn/src/bin/ls/ls.c:691:17
        #1 0x7f7ff63bc435 in med3 /public-dyn/src/lib/libc/stdlib/qsort.c:94:9
        #2 0x7f7ff63bbe12 in qsort /public-dyn/src/lib/libc/stdlib/qsort.c:128:8
        #3 0x7f7ff630ce6b in fts_sort /public-dyn/src/lib/libc/gen/fts.c:1028:2
        #4 0x7f7ff630e39b in fts_build /public-dyn/src/lib/libc/gen/fts.c:915:10
        #5 0x7f7ff630e8b9 in __fts_children60 /public-dyn/src/lib/libc/gen/fts.c:616:18
        #6 0x256e2d in __interceptor___fts_children60 /public-dyn/llvm/projects/compiler-rt/lib/msan/../sanitizer_common/sanitizer_common_interceptors.inc:7368:12
        #7 0x2221a7 in traverse /public-dyn/src/bin/ls/ls.c:470:10
        #8 0x22149d in ls_main /public-dyn/src/bin/ls/ls.c:405:3
        #9 0x226ab1 in main /public-dyn/src/bin/ls/main.c:48:9
        #10 0x21b531 in ___start (/bin/ls+0x1b531)

    SUMMARY: MemorySanitizer: use-of-uninitialized-value /public-dyn/src/bin/ls/ls.c:691:17 in mastercmp
    Exiting


Possible solutions:
 1. Reimplement libc functions inside sanitizers
 2. Copy part of the libc code into the sanitizers' source code and
build it alongside the sanitizers.
 3. Inject __msan_unpoison()-like functions inside libc, optionally
under MKSANITIZER switch.
 4. Use auxiliary sanitizer functions/macros inside the programs that
need them, enabled only when the program is built with a sanitizer.

I've gone through points 1-4:
 1. Not really doable. It duplicates functionality, adds a maintenance
burden, and the two implementations will drift out of sync. While it
might be theoretically possible for the sorting functions,
reimplementing fts_open(3)-like features is too much work. Upstream
would likely reject it.
 2. Not applicable for upstream. Someone would need to keep both copies
in sync. We would end up with two different implementations of features
like fts_open(3): one built through -fsanitize and one standalone.
 3. Adding any symbols to libc has a cost. A specially prepared libc
would be needed to sanitize some programs, rather than plain libc.
These symbols would also be injected into performance-critical paths,
such as every execution of the callback in a sorting function.
 4. This explicitly restricts the usage of helper functions to the
programs that need them, and only when they are built with a sanitizer.
No libc replacement is needed.

The rest of the world already takes the approach in point 4; it is
common practice in third-party software such as libuv, mozjs, rr, iotjs,
libcrypto++, julia, openssl, firefox, etc.



I've prepared a <sanitizer.h> header intended to abstract the inclusion
of sanitizer-specific headers in userland programs and to export macros
for those programs. If a program is not built with a sanitizer, the
macros evaluate to a dummy line of code.
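
As a rough illustration of the idea (not the contents of the actual
patch), such a macro could look like the sketch below. The macro name
HYP_MSAN_UNPOISON and the helpers int_cmp and sort_ints are hypothetical.
The sketch relies on Clang's __has_feature(memory_sanitizer) check and
on __msan_unpoison() from <sanitizer/msan_interface.h>; without MSan,
the macro collapses to a no-op.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Clang provides __has_feature; fall back to 0 on other compilers. */
#ifndef __has_feature
#define __has_feature(x) 0
#endif

#if __has_feature(memory_sanitizer)
#include <sanitizer/msan_interface.h>
/* Hypothetical macro: tell MSan that n bytes at p are initialized. */
#define HYP_MSAN_UNPOISON(p, n)	__msan_unpoison((p), (n))
#else
/* Not built with MSan: a dummy statement that still uses its arguments. */
#define HYP_MSAN_UNPOISON(p, n)	do { (void)(p); (void)(n); } while (0)
#endif

/* Comparison callback that unpoisons its arguments before reading them. */
static int
int_cmp(const void *a, const void *b)
{
	const int *ia = a, *ib = b;

	HYP_MSAN_UNPOISON(ia, sizeof(*ia));
	HYP_MSAN_UNPOISON(ib, sizeof(*ib));
	return (*ia > *ib) - (*ia < *ib);
}

/* Sort an array of ints in ascending order. */
static void
sort_ints(int *v, size_t n)
{
	assert(v != NULL || n == 0);
	qsort(v, n, sizeof(*v), int_cmp);
}
```

Because the macro is confined to the program's own callback, nothing in
libc changes, and a non-sanitized build compiles the same source
unchanged.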

The proposed patch adds the new header and patches ls(1):

http://netbsd.org/~kamil/patch-00049-ls-msan.txt

With the above diff, ls(1) runs correctly under MemorySanitizer.

The patch includes support for ASan, MSan and TSan. UBSan does not need
a dedicated header. The MSan support is restricted to Clang/LLVM only.
Other sanitizers (ESan, DFSan, Scudo, HWASan, LSan) are skipped for now.
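
For reference, a header of this kind has to work out at compile time
which sanitizer, if any, is active. The sketch below shows one way this
might be done; the HYP_* macros and hyp_active_sanitizer() are
hypothetical names, not those used in the patch. It assumes Clang's
__has_feature() checks plus the __SANITIZE_ADDRESS__ and
__SANITIZE_THREAD__ macros that GCC defines; there is no GCC branch for
MSan, matching the Clang/LLVM-only restriction above.

```c
#include <assert.h>

/* Clang provides __has_feature; fall back to 0 on other compilers. */
#ifndef __has_feature
#define __has_feature(x) 0
#endif

/* ASan: Clang feature check, or the macro GCC defines. */
#if __has_feature(address_sanitizer) || defined(__SANITIZE_ADDRESS__)
#define HYP_ASAN 1
#else
#define HYP_ASAN 0
#endif

/* MSan: Clang only. */
#if __has_feature(memory_sanitizer)
#define HYP_MSAN 1
#else
#define HYP_MSAN 0
#endif

/* TSan: Clang feature check, or the macro GCC defines. */
#if __has_feature(thread_sanitizer) || defined(__SANITIZE_THREAD__)
#define HYP_TSAN 1
#else
#define HYP_TSAN 0
#endif

/* Return a short name for the sanitizer this file was built with. */
static const char *
hyp_active_sanitizer(void)
{
	/* ASan, MSan and TSan cannot be combined in one build. */
	assert(HYP_ASAN + HYP_MSAN + HYP_TSAN <= 1);

	if (HYP_ASAN)
		return "asan";
	if (HYP_MSAN)
		return "msan";
	if (HYP_TSAN)
		return "tsan";
	return "none";
}
```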



