tech-pkg archive

Re: Deciding on which variant(s) of OpenBLAS library to install



On Tue, 20 Feb 2018 10:13:37 -0600,
Jason Bacon <bacon4000%gmail.com@localhost> wrote:

> On third thought, it might be a good idea to install both pthreads and 
> openmp libs, based on what you reported about known issues with one or 
> the other.  This would provide more flexibility for binary dependent 
> packages - they can use whichever one works best.

In our local discussions about this, we are also picking up again
on the idea that BLAS/LAPACK should be runtime-interchangeable. This is
what Debian-based distros do, at least with the serial version (see
[1]). They don't seem to have cblas on the radar, though. Or lapacke. I
would like to have a solution that covers them all. Fedora/Red Hat has
been stuck in the planning phase [2] of a consistent approach for some
years now.

But generally, the notion that one should build packages so that they
link to generic

	libblas.so.x
	liblapack.so.x
	libcblas.so.x
	liblapacke.so.x

libraries is something we should also discuss here, especially since
the choice of serial or parallel BLAS depends on the application at
hand and also on what the user currently wishes to do (imagine an
R/Python user who has different R/numpy scripts that demand
serial/parallel/pthreads/openmp openblas because they are combined with
multiprocess parallelization methods).

When we build all packages against the reference netlib code, a
mechanism to select the actual libblas.so.x and friends at runtime
would enable these use cases without multiplying the needed package
builds. The system administrator would manage symlinks, setting for
example

	$PREFIX/lib/libblas.so.4 → $PREFIX/lib/blas-openblas-serial/libblas.so.4

as a site default after building openblas. This (or a variant of
switching ld.so.conf snippets) is what the alternatives system does in
Linux distros.

For our use case, we would also offer environment modules that prepend

	$PREFIX/lib/blas-openblas-serial

to LD_LIBRARY_PATH (a variable that we normally painstakingly avoid
touching, these special runtime lib modules being the exception) and
hence give the user the run-time choice of which blas lib to present to
the applications.
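
What such a module does to the environment can be sketched in a few
lines of Python. The function name env_with_blas is hypothetical, and
a real environment module would of course be written in the module
system's own language; this only shows the path manipulation.

```python
import os


def env_with_blas(prefix, variant, base_env=None):
    """Return a copy of the environment with the variant's library
    directory first in LD_LIBRARY_PATH.

    Hypothetical helper; the directory layout is the one proposed in
    this mail.
    """
    env = dict(os.environ if base_env is None else base_env)
    libdir = os.path.join(prefix, "lib", variant)
    old = env.get("LD_LIBRARY_PATH", "")
    # Drop empty entries and an earlier copy of the same directory, so
    # that loading the module twice does not grow the variable.
    parts = [p for p in old.split(":") if p and p != libdir]
    env["LD_LIBRARY_PATH"] = ":".join([libdir] + parts)
    return env
```

A wrapper could then hand the result to subprocess.run(..., env=env)
to start an R or Python script with the chosen BLAS, leaving the
caller's own environment untouched.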

This would mean that pkgsrc packages would all depend on the netlib
code only, which provides the C headers and fallback versions of the
libs. No added complexity of build configuration.

Do you think that is feasible and desirable in pkgsrc? We'd only need
to ensure that the version numbers of netlib, openblas, and atlas match
so that the binaries are really interchangeable.

This approach would need some work writing libblas/liblapack/… wrapper
libraries for openblas (atlas) that have the proper SONAME set and
preferably only export the symbols of the common API. Debian has done
this, apparently. This is a neat hack that still leaves people free to
link to the full libopenblas if they want to access the API specific to
it.

>  I don't think it's 
> important to follow the libopenblasp naming convention from other 
> package managers. 

They seem to be oblivious to how to actually use that one, though. It
looks like it is not part of the usual concept of switching blas
implementations. It should be. It is just a variant of libblas.

> The main thing is providing a stable serial lib in 
> libopenblas.so and offering options for parallel libs to anyone willing 
> to deal with the complexities.
> 
> I'd leave out cblas.h and keep that in a separate cblas package.

Well … the cblas wrapper functions are also part of libopenblas. So the
conflict is always there, just ignored. I am also not quite sure if
cblas/lapacke is supposed to be compatible like the base Fortran libs
([3] … unclear).

It would be more manageable to have all of the netlib code in one
package, methinks. Version always in sync. Generally, I see a nightmare
trying to figure out which combinations of lapack A and blas B are
actually compatible (any lapack might work with atlas, but I am not
sure how much benefit one gets from building a lapack together with
atlas, as that already has _some_ of that API implemented). The
consensus, which
looks sensible to me, seems to be that one builds blas/lapack in some
combination and then uses those only together, not mixing with installs
from other sources. This would be reflected in the netlib stuff being
installed in

	$PREFIX/lib/blas-netlib/libblas.so.x
	$PREFIX/lib/blas-netlib/libcblas.so.x
	$PREFIX/lib/blas-netlib/liblapack.so.x
	$PREFIX/lib/blas-netlib/liblapacke.so.x

from one package … well, or at least from separate packages that are
painfully kept in sync in how they handle things.

I imagine the headers also being installed from that reference package
… well, maybe best also as symlinks to begin with.

	$PREFIX/include/cblas.h → $PREFIX/include/blas-netlib/cblas.h
	$PREFIX/include/lapacke.h → $PREFIX/include/blas-netlib/lapacke.h

> look at wip/plink, which uses cblas, blas, and lapack.  We should be 
> able to substitute openblas for blas and leave the rest as-is.

But openblas = blas + cblas + lapack + lapacke …

> I think it's a good idea to install pkgconfig and cmake files if it's 
> straightforward and doesn't cause any problems.

The rub is this, of course: They are strictly only relevant to one of
the openblas builds we install, if we install several. Several library
configurations likely will need differing pkgconfig/cmake files. At
least the library path to use needs to differ … or we have some search
path games just like for runtime-selection of the blas to use.

We might indeed get away with omitting all those extra files in favour
of simply installing the libraries and the specific headers into

	$PREFIX/include/openblas

(to be able to pull in openblas' cblas header after all, to get the full
interface including openblas_set_num_threads()). I am not yet sure how
far the headers will differ with different configurations. I am also
eyeing the possible option of I64 builds. So, we are really talking
about

	$PREFIX/lib/openblas-$variant
	$PREFIX/include/openblas-$variant

matching each other. One could do that with the pkgconfig files, too.
An environment module (or buildlink3.mk) could append the correct paths
to find those, too.

Generally, the approach of having stuff just link to generic
libblas.so.x seems more and more attractive to me. Assuming that Intel
stays compatible with the GNU compilers, I can even switch between
openblas and MKL at runtime to compare things (see [4]).

I think we should start with a concept along those lines and implement
BLAS fun in pkgsrc The Right Way™, before we get into a mess with
packages depending on differing sets of BLAS libraries. I absolutely
want to avoid the situation where I might link something that uses
libsomeblas together with something that uses libotherblas … possibly
with conflicting parallel runtimes. Switching BLAS fully at runtime or
not at all … do I get an Amen? Or at least a good rebuttal? ;-)
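
As a sketch of how one might detect such a mix, the following Python
snippet looks at ldd output and groups the generic BLAS/LAPACK
libraries by the directory they were resolved from. More than one
directory means the process would mix variants. The function name is
hypothetical, and it assumes ldd prints lines of the form
"name => /resolved/path"; libraries that are linked under their own
names (libopenblas and the like) are not caught by it.

```python
import os
import re

# Generic library names from the scheme discussed in this mail.
GENERIC = re.compile(r"^lib(blas|cblas|lapack|lapacke)\.so(\.|$)")


def blas_dirs(ldd_output):
    """Map each directory to the generic BLAS/LAPACK libs resolved from it.

    Hypothetical helper; expects text as printed by ldd.
    """
    dirs = {}
    for line in ldd_output.splitlines():
        if "=>" not in line:
            continue
        resolved = line.split("=>", 1)[1].split()
        if not resolved or not resolved[0].startswith("/"):
            continue  # for example "not found"
        path = resolved[0]
        base = os.path.basename(path)
        if GENERIC.match(base):
            dirs.setdefault(os.path.dirname(path), []).append(base)
    return dirs


# Illustrative input only, not captured from a real system.
SAMPLE = (
    "\tlibblas.so.4 => /usr/pkg/lib/blas-openblas-serial/libblas.so.4 (0x1)\n"
    "\tliblapack.so.4 => /usr/pkg/lib/blas-netlib/liblapack.so.4 (0x2)\n"
    "\tlibfoo.so.1 => not found\n"
    "\tlibc.so.6 => /lib/libc.so.6 (0x3)\n"
)
```

With the made-up SAMPLE above, blas_dirs() returns two directories,
which is exactly the situation I want to rule out.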


Alrighty then,

Thomas

References:

[1] https://wiki.debian.org/DebianScience/LinearAlgebraLibraries
[2] https://fedoraproject.org/wiki/PackagingDrafts:BLAS_LAPACK
[3] http://icl.cs.utk.edu/lapack-forum/viewtopic.php?f=12&t=4330
[4] https://askubuntu.com/questions/891189/octave-4-2-1-and-intel-mkl/913029

PS: One nice thing for pkgsrc about the run-time choice is that the
only combination that we officially support and absolutely have to keep
working in any client application is the one with netlib. All other
choices and their pitfalls are up to the admin and user.

-- 
Dr. Thomas Orgis
Universität Hamburg
RRZ / Basis-Infrastruktur / HPC
Schlüterstr. 70
20146 Hamburg
Tel.: 040/42838 8826
Fax: 040/428 38 6270
