tech-pkg archive


Re: Clash of lib64 symlink and R extensions in wrappers



On Tue, 26 May 2020 12:19:46 -0400,
Greg Troxel <gdt%lexort.com@localhost> wrote:

> I am not sure that Linux really supports it either; when you have a
> nominally x86_64 Linux system, and you build a program, I don't think
> you get both lib and lib64 variants installed.

The default is 64 bit binaries, but you can build 32 bit programs and
install them side-by-side. You can install 32 and 64 bit packages from
the distro and run both. A classic use case was running 32 bit wine for
games on a 64 bit Linux host. Or … I faintly remember something
horrible about the flash player browser plugin … The newer multiarch
approach added more flexibility: not just lib64 and lib, but
lib/<arch>. It also seems designed to ease cross builds by just
installing the library packages for the target architecture on your host.

The rationale for multiarch is here:

	https://wiki.debian.org/Multiarch/TheCaseForMultiarch

This is mostly motivated by binary distributions that build for all
possible architectures. It's sort of neat to be able to keep the libs
for native and non-native architectures in one tree, but of course it's
not the only solution. And pkgsrc is in a situation where it has to
take the lib64 or lib/<arch> behaviour of the base system as given.

It seems that this is ingrained in gcc now. Does it not do that on *BSD?

> It's an interesting question of how to address this.   I don't think an
> approach which requires per-package accomodation of lib64 is reasonable.

Agreed. So the issue right now is only that my addition of the lib64
symlink breaks some builds as the directory (link) is seen by the
configure scripts, but not actually usable as pkgsrc wrappers kill it.
Seems to me like a hack on the wrappers that just also wraps the link
would be a short-term fix.

Hm … or … looking at the search dirs and the multiarch stuff again:

$ LIBRARY_PATH=/foo/lib .prefix/gcc8/bin/gcc --print-search-dirs | grep ^libraries:|cut -f 2 -d = | tr : '\n'
/foo/lib/x86_64-redhat-linux/8.3.0/
/foo/lib/../lib64/
/sw/env/gcc-8.4.0_openmpi-3.1.6/pkgsrc/2020Q1/gcc8/lib/gcc/x86_64-redhat-linux/8.3.0/
/sw/env/gcc-8.4.0_openmpi-3.1.6/pkgsrc/2020Q1/gcc8/lib/gcc/x86_64-redhat-linux/

… the first directory is actually a subdirectory of /foo/lib/. So if I add a link

$pkgsrc_prefix/lib/x86_64-redhat-linux/8.3.0/ → ../../

(or actually 8.4.0 in the end for my own toolchain)

things might work out, too. I'm not sure if such a path will be picked
up by R extensions, or if pkgsrc wrappers will allow it or just ignore
it, but it avoids a toplevel lib64 confusing build scripts.
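
A minimal sketch of creating such an alias link, under some assumptions:
make_lib_alias is a hypothetical helper name, and the subdirectory has to
be supplied by hand. A relative symlink target is resolved from the
directory that contains the link, so the literal target depends on how
deep the link sits below lib: ".." for x86_64-redhat-linux/8.3.0, "." for
a single-level name.

```shell
#!/bin/sh
# make_lib_alias LIBDIR SUBDIR
# Create LIBDIR/SUBDIR as a relative symlink that resolves back to LIBDIR.
# (Hypothetical helper for illustration, not something pkgsrc provides.)
make_lib_alias() {
    libdir=${1%/}
    sub=${2%/}
    parent=$(dirname "$sub")    # "." when SUBDIR has a single component
    if [ "$parent" = "." ]; then
        target=.
    else
        # One ".." per directory level between LIBDIR and the link itself.
        target=$(printf '%s\n' "$parent" | sed 's|[^/][^/]*|..|g')
    fi
    mkdir -p "$libdir/$parent" && ln -s "$target" "$libdir/$sub"
}

# Usage, with $pkgsrc_prefix set to the pkgsrc installation prefix:
#   make_lib_alias "$pkgsrc_prefix/lib" x86_64-redhat-linux/8.3.0
```

Whether the wrappers and the R extensions then accept that path is the
open question from above; the sketch only covers creating the link.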

The /foo/lib/../lib64 name is a common fallback differing Linux distro
builds of gcc agree on. On Ubuntu, I see this:

libraries: =/foo/lib/x86_64-linux-gnu/5/:/foo/lib/x86_64-linux-gnu/:/foo/lib/../lib/

So, just lib/x86_64-linux-gnu → ../ would be enough here. But my own
build of newer gcc on that system shows

libraries: =/foo/lib/x86_64-pc-linux-gnu/8.3.0/:/foo/lib/x86_64-linux-gnu/:/foo/lib/../lib64/:/

That'd still be lib/x86_64-linux-gnu, among the noise of other
directories and even another host string (x86_64-pc-linux-gnu). I hope
it's understandable that I'm a bit annoyed by this. It seems it was
decided that all the multiarch logic is played out in the background
for environment variables that modify the default search path.

The inconvenient thing is that I have to figure out the proper path for
this link, as there's even the compiler version in there. I can try to
hack a link to the output of

  LIBRARY_PATH=/foo/lib gcc -L/foo/lib --print-search-dirs \
  | grep ^libraries: | cut -f 2 -d = | tr ':' '\n' \
  | grep ^/foo/lib|grep -v '\.\.' | head -n 1

This should produce a subdirectory of lib that works to override system
libraries and maybe doesn't confuse builds inside pkgsrc that stumble
over lib64. But I guess the wrapper would still only link that
directory if there are PLIST paths mentioning it, right?
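
The selection step of that pipeline can be tried without a compiler at
hand. The following is only a sketch: first_lib_subdir is a hypothetical
helper name, and it reads already captured --print-search-dirs output from
stdin. It prints the first library directory that starts with the given
lib directory and contains no "/../" component.

```shell
#!/bin/sh
# first_lib_subdir LIBDIR
# Reads `gcc --print-search-dirs` output on stdin and prints the first
# library search directory that starts with LIBDIR/ and has no "/../" part.
# (Hypothetical helper name, not part of gcc or pkgsrc.)
first_lib_subdir() {
    libdir=${1%/}
    grep '^libraries:' | cut -f 2 -d = | tr ':' '\n' \
    | awk -v p="$libdir/" 'index($0, p) == 1 && index($0, "/../") == 0 { print; exit }'
}

# Canned sample, copied from the Ubuntu output quoted earlier:
sample='libraries: =/foo/lib/x86_64-linux-gnu/5/:/foo/lib/x86_64-linux-gnu/:/foo/lib/../lib/'
printf '%s\n' "$sample" | first_lib_subdir /foo/lib
# prints /foo/lib/x86_64-linux-gnu/5/

# With a real compiler it would be used like this:
#   LIBRARY_PATH=/foo/lib gcc --print-search-dirs | first_lib_subdir /foo/lib
```

Matching on the prefix with awk instead of grep avoids treating the
directory name as a regular expression.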

> How many things use this LIBDIR environment variable?

It's not things, it's people;-) The interest in PATH-like variables is
pronounced in the management of multiuser environments. In the HPC
community, we probably constitute the last large-scale use case of
Unix-like servers where multiple users log into shell sessions on the
same bare-metal OS instance and build/run programs as part of the
regular operation and not due to an exploit.

In such a multiuser environment that usually lives for a period around
3 to 7 years with the hardware, the need for a flexible software
environment is long-established. During the period of operation, new
software versions are needed, while people still want to continue to
use the old ones, for example to finish a scientific task with a
controlled unchanged set of programs, or just to compare results with
newer and older software. It is not unheard of to carry over software
environments from a decommissioned system to a new one, too.

This is combined with the tradition of system vendors providing tuned
compilers and certain libraries for their systems. There are also
choices of differing add-on compilers and libraries providing some
improvements or adaptations to certain use cases compared to libraries
possibly present in the base system. Serial and parallel variants. The
traditional HPC user took what software the system offers and built
their own software on top of that, selecting vendors and versions as
blocks for the build environment.

One way is to manually hunt down the paths to desired binaries and
libraries and stuff things into FFLAGS, CFLAGS, LDFLAGS in Makefiles.
Actually, Makefiles often contain(ed) hardcoded values for compiler names
and flags, one reason for certain compilers being needed for certain
software. Another reason is that numerical software is not unexpected
to break down when you change compilers or optimization flags … tuned
to the bugs of the build environment of the authors.

A slightly saner way to get your build environment set up emerged some
… hm, I guess decades … ago with environment modules.

	https://en.wikipedia.org/wiki/Environment_Modules_(software)

There are multiple implementations, but they boil down to a shell
function that reads modulefiles and can add and remove their contents
from your environment. This includes setting and deleting a certain
variable (things like CUDA_HOME), but mostly handles array variables
like PATH. It can append or prepend paths. When you unload a module,
its paths are removed from the environment. You can switch
environments. This is the main difference from one-off environment
scripts that you are supposed to source in your session.
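
To illustrate just the path handling, here is a rough sketch in plain
shell of what prepending on load and removing on unload amount to for a
colon-separated variable. The function names path_prepend and path_remove
are made up for this example; real environment module implementations are
separate programs and handle more cases than this.

```shell
#!/bin/sh
# path_prepend LIST DIR: print LIST with DIR put in front.
path_prepend() {
    printf '%s\n' "$2${1:+:$1}"
}

# path_remove LIST DIR: print LIST without the entries equal to DIR.
path_remove() {
    printf '%s' "$1" | tr ':' '\n' | grep -v -x -F -e "$2" | paste -s -d : -
}

# "Loading" adds the directory, "unloading" takes it out again.
# /sw/demo/bin is a made-up directory for the demonstration.
demo_path=/usr/bin:/bin
demo_path=$(path_prepend "$demo_path" /sw/demo/bin)
echo "$demo_path"    # /sw/demo/bin:/usr/bin:/bin
demo_path=$(path_remove "$demo_path" /sw/demo/bin)
echo "$demo_path"    # /usr/bin:/bin
```

A sourced one-off script usually only does the first half; the removal is
what makes switching between environments practical.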

For example, this is what an environment module for a pkgsrc tree does
here (on top of a required toolchain module):

prepend-path    PATH    /sw/env/gcc-8.3.0_openmpi-3.1.4/pkgsrc/2019Q1/gnu/bin
prepend-path    PATH    /sw/env/gcc-8.3.0_openmpi-3.1.4/pkgsrc/2019Q1/bin
prepend-path    LIBRARY_PATH    /sw/env/gcc-8.3.0_openmpi-3.1.4/pkgsrc/2019Q1/lib
prepend-path    LD_RUN_PATH     /sw/env/gcc-8.3.0_openmpi-3.1.4/pkgsrc/2019Q1/lib
prepend-path    CPATH   /sw/env/gcc-8.3.0_openmpi-3.1.4/pkgsrc/2019Q1/include
prepend-path    C_INCLUDE_PATH  /sw/env/gcc-8.3.0_openmpi-3.1.4/pkgsrc/2019Q1/include
prepend-path    INCLUDE /sw/env/gcc-8.3.0_openmpi-3.1.4/pkgsrc/2019Q1/include
prepend-path    MANPATH /sw/env/gcc-8.3.0_openmpi-3.1.4/pkgsrc/2019Q1/gnu/man
prepend-path    MANPATH /sw/env/gcc-8.3.0_openmpi-3.1.4/pkgsrc/2019Q1/man
prepend-path    XDG_CONFIG_DIRS /sw/env/gcc-8.3.0_openmpi-3.1.4/pkgsrc/2019Q1/etc/xdg
prepend-path    XDG_DATA_DIRS   /sw/env/gcc-8.3.0_openmpi-3.1.4/pkgsrc/2019Q1/share
prepend-path    PERL5LIB        /sw/env/gcc-8.3.0_openmpi-3.1.4/pkgsrc/2019Q1/lib/perl/site_perl
prepend-path    PKG_CONFIG_PATH /sw/env/gcc-8.3.0_openmpi-3.1.4/pkgsrc/2019Q1/lib/pkgconfig
prepend-path    PKG_CONFIG_PATH /sw/env/gcc-8.3.0_openmpi-3.1.4/pkgsrc/2019Q1/share/pkgconfig

There's some redundancy (CPATH, C_INCLUDE_PATH, INCLUDE), because there
never was a true grand design of variable names that work with all
compilers, and this list is not complete for sure, but this is what we
needed so far to use the software from pkgsrc and also build software
that uses dependencies from within the pkgsrc prefix.

Between PATH, the dreaded LD_LIBRARY_PATH, PKG_CONFIG_PATH, MANPATH …
it is rather obvious that you want such variables for locating libraries
and headers, too. I guess you could try to curate LDFLAGS, but that is
rather more complicated as it does not just contain a list of paths
but all sorts of flags.

I'm not aware of much use of environment modules and the less known
PATH-like variables outside of HPC. Multiuser shell servers are not
that common anymore. But until the cloud wholly absorbs HPC, replacing
shell sessions with VMs and containers, I wouldn't call the use
insignificant. It's a niche, of course, but being niche should be
nothing new to pkgsrc;-)

Oh, and regarding containers … of course we got environment modules for
container runtimes that people load before starting their containers
with yet another level of software environment.

> I was not even
> aware of that until your messages; I thought library search paths had to
> be specified with -L.

Live and learn;-) There are kinks like LD_RUN_PATH being killed once
there is just one -Wl,-R / -rpath in the flags, while LIBRARY_PATH is
combined with -L. That is why I provide a script that turns
LD_RUN_PATH's contents into a series of -L and -Wl,-R to use for
compiler invocations where other LDFLAGS could be in the game, too. But
even if you have to resort to explicit flags in the end, it is
beneficial to be able to handle the array of directories with
environment modules and to have the default search path just work out,
not forcing you to specify any LDFLAGS or CFLAGS to build “in this
environment”.
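
A conversion along those lines could look roughly like the following. This
is a sketch for illustration and not the actual script mentioned above;
the function name run_path_flags is made up, and directories containing
whitespace are not handled.

```shell
#!/bin/sh
# run_path_flags LIST
# Print "-L<dir> -Wl,-R<dir>" for every non-empty entry of the
# colon-separated LIST, all on one line.
# (Hypothetical helper, not the script referred to in the text.)
run_path_flags() {
    printf '%s' "$1" | tr ':' '\n' | awk '
        $0 != "" { printf "%s-L%s -Wl,-R%s", sep, $0, $0; sep = " " }
        END      { printf "\n" }'
}

# Example with two made-up directories:
run_path_flags /sw/a/lib:/sw/b/lib
# prints -L/sw/a/lib -Wl,-R/sw/a/lib -L/sw/b/lib -Wl,-R/sw/b/lib

# Possible use in a build, relying on word splitting of the result
# (prog and libfoo are placeholders):
#   gcc -o prog prog.c $(run_path_flags "$LD_RUN_PATH") -lfoo
```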

I'll try the hack with a multiarch subdirectory instead of lib64 now,
but still, I think it would be nice if pkgsrc would handle that
itself, integrating properly with the default search order on the
platform.


Alrighty then,

Thomas

PS: I'm not even attempting to build pkgsrc with the range of compilers
we offer on the system (at least Intel, PGI (NVIDIA)). Those could have
peculiar search behaviour of their own, while at least Intel tries to
mimic gcc closely. You can tell it to behave like differing GCC
versions. It starts out complicated and just gets more complicated.

-- 
Dr. Thomas Orgis
HPC @ Universität Hamburg

