tech-userlevel archive


Re: Checking library symbols



On Sat, Mar 16, 2024 at 10:21:21PM +0000, Taylor R Campbell wrote:
> [bcc tech-toolchain, followups to tech-userlevel to stay in one place]
> 
> Every now and then we have some embarrassing bug with accidental
> removal or exposure of symbols in a shared library.  This has been a
> constant problem with libm (e.g., https://gnats.NetBSD.org/57960,
> https://gnats.NetBSD.org/57881, https://gnats.NetBSD.org/56577) but it
> can also affect other libraries too.

I knocked together a remarkably similar thing for FreeBSD a week or so ago.
See https://reviews.freebsd.org/D44271 and related changes in the
stack.

After seeing your use of nm I took a look, because it looks cleaner than
my objdump-based approach.  Unfortunately, nm(1) implementations seem to
vary widely with regard to symbol versions.  GNU nm supports
--with-symbol-versions.  ELF toolchain nm (currently the FreeBSD default)
does not.  llvm-nm doesn't either, but seems to act as if
--with-symbol-versions were passed.  I'm not sure if that will be an issue
for NetBSD, but it's kind of annoying in general.
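
A minimal sketch of one way to cope with that variation: probe the nm
in use and pass --with-symbol-versions only when it is accepted.  The
function name list_dynsyms is hypothetical.  The sketch assumes the nm
accepts the long options --dynamic, --extern-only and --defined-only,
and prints the symbol name as the last field of each line.

```shell
# Hypothetical helper: print the defined dynamic symbols of a library,
# one per line, sorted in the C locale.
# Usage: list_dynsyms NM LIBRARY
list_dynsyms() {
	nm=$1 lib=$2
	flags="--dynamic --extern-only --defined-only"
	# Probe: add --with-symbol-versions only if this nm accepts it.
	if "$nm" $flags --with-symbol-versions "$lib" >/dev/null 2>&1; then
		flags="$flags --with-symbol-versions"
	fi
	# Keep only the last field (assumed to be the symbol name).
	"$nm" $flags "$lib" | awk '{ print $NF }' | LC_ALL=C sort -u
}
```

It could be called as `list_dynsyms "${NM:-nm}" libfoo.so.1.0`.  If an
nm rejects the flag and also omits version suffixes, the output will
differ from that of an nm that prints them, so an expected-symbols file
would still be tied to one nm flavor.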

> The attached patch creates a build-time mechanism to audit libraries
> for unintentional symbol changes with bsd.lib.mk:
> 
> - To opt in, you write down a sorted list of all the symbols (and, if
>   applicable, version suffixes) expected to be defined by a library
>   LIB=foo in foo.expsym.
> 
> - When the library is built, bsd.lib.mk will use nm(1) to generate a
>   sorted list of symbols actually defined in libfoo.so(.M.N), and diff
>   it against foo.expsym.
> 
> If there are differences, the build will fail and the differences will
> be printed.  Only symbol names and versions are checked, not addresses
> or types.
> 
> This check is orthogonal to writing a version map: it doesn't require
> changing the library itself to opt into symbol versions
> (https://wiki.NetBSD.org/symbol_versions), and it checks for missing
> symbols too which ld doesn't check for even if they're written in the
> version map.

FWIW, ld with --no-undefined-version (now the default in LLVM 16+) does
detect lost symbols if they are in the file, but it doesn't detect
versions exposed by inline assembly.  (At least in FreeBSD we use
inline assembly macros to export non-default (and very rarely default)
symbols.)

> So far this is experimental -- I only tried it with openssl libcrypto
> to verify another experiment with renaming sha2 symbols.  It's limited
> to shared libraries, not static libraries, for now.  In practice we
> might require more mechanism to handle MD variation in the set of
> defined symbols (especially with libm), and we might encounter other
> problems I haven't foreseen if we try to apply this to more libraries,
> but I thought I'd share a first draft.

I ended up doing optional, separate files for each architecture.  It's
not optimal (mostly duplicated content), but it's easy.
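
A minimal sketch of how optional per-architecture files could be
selected.  The function name pick_expsym and the LIB.expsym.ARCH naming
are assumptions for illustration, not necessarily the layout used in
the FreeBSD review.

```shell
# Hypothetical helper: choose the expected-symbols file for a library.
# Prefers DIR/LIB.expsym.ARCH if it exists, else DIR/LIB.expsym.
# Usage: pick_expsym DIR LIB ARCH
pick_expsym() {
	dir=$1 lib=$2 arch=$3
	if [ -f "$dir/$lib.expsym.$arch" ]; then
		printf '%s\n' "$dir/$lib.expsym.$arch"
	elif [ -f "$dir/$lib.expsym" ]; then
		printf '%s\n' "$dir/$lib.expsym"
	else
		return 1	# no file at all: the library has not opted in
	fi
}
```

With this fallback, only architectures whose symbol sets actually
differ need their own file, which limits the duplicated content.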

-- Brooks

> 
> Thoughts?

> From aa30c139d6874d9f7733484474e804622f0b0dce Mon Sep 17 00:00:00 2001
> From: Taylor R Campbell <riastradh%NetBSD.org@localhost>
> Date: Sat, 16 Mar 2024 21:53:41 +0000
> Subject: [PATCH] bsd.lib.mk: Check expected vs actual symbols at build-time.
> 
> If, for LIB=foo, you create a file foo.expsym, bsd.lib.mk will list
> the dynamic symbols and their versions with
> 
> nm --dynamic --extern-only --defined-only --with-symbol-versions
> 
> and compare the names (not addresses or types) to foo.expsym.  If
> there are any differences, they will be printed and the build will
> fail.
> 
> foo.expsym should be sorted with `LANG=C sort -u'.
> 
> This way, you can verify changes don't inadvertently add or remove
> symbols.  If you do want to add (or, if you're bumping the major,
> remove) symbols, you can verify the changes and edit the foo.expsym
> file accordingly.  This will also help to enforce rules about symbol
> changes on pullups in release branches.
> 
> Note that using a version map (-Wl,--version-script=...) doesn't
> catch symbol removal -- ld quietly ignores symbols in the version map
> that aren't actually defined by any object in the library.  So this
> supplements the version map.
> ---
>  share/mk/bsd.lib.mk | 20 ++++++++++++++++++++
>  1 file changed, 20 insertions(+)
> 
> diff --git a/share/mk/bsd.lib.mk b/share/mk/bsd.lib.mk
> index 4db7809dda40..9df4bcc834f3 100644
> --- a/share/mk/bsd.lib.mk
> +++ b/share/mk/bsd.lib.mk
> @@ -648,6 +648,26 @@ ${_LIB.so.full}: ${_MAINLIBDEPS}
>  	${HOST_LN} -sf ${_LIB.so.full} ${_LIB.so}.tmp
>  	${MV} ${_LIB.so}.tmp ${_LIB.so}
>  
> +# If there's a file listing expected symbols, fail if the diff from it
> +# to the actual symbols is nonempty, and show the diff in that case.
> +.if exists(${.CURDIR}/${LIB}.expsym)
> +realall: ${_LIB.so.full}.diffsym
> +${_LIB.so.full}.diffsym: ${LIB}.expsym ${_LIB.so.full}.actsym
> +	${_MKTARGET_CREATE}
> +	if diff -u ${.ALLSRC} >${.TARGET}.tmp; then \
> +		${MV} ${.TARGET}.tmp ${.TARGET}; \
> +	else \
> +		ret=$$?; \
> +		cat ${.TARGET}.tmp; \
> +		exit $$ret; \
> +	fi
> +${_LIB.so.full}.actsym: ${_LIB.so.full}
> +	${NM} --dynamic --extern-only --defined-only --with-symbol-versions \
> +		${_LIB.so.full} \
> +	| cut -d' ' -f3 | sort -u >${.TARGET}.tmp
> +	${MV} ${.TARGET}.tmp ${.TARGET}
> +.endif
> +
>  .if !empty(LOBJS)							# {
>  LLIBS?=		-lc
>  ${_LIB.ln}: ${LOBJS}


