tech-pkg archive

Re: Deciding on which variant(s) of OpenBLAS library to install



On Sun, 11 Mar 2018 20:00:14 -0400,
Greg Troxel <gdt%lexort.com@localhost> wrote:

> (I've been not quite following, so take this with a grain of salt.)

Obviously;-) I'll try to give an executive summary in my reply.

> The normal approach in pkgsrc is to set a system-wide variable to
> specify the implementation, and thus to install only one of the
> alternatives.  We do this for pgsql and kerberos.

That is along the lines of my proposal: have a global BLAS choice (and
LAPACK, perhaps rolled into a single variable) that prompts installation
of one of the libraries and makes packages use it.

> If the ABI of the variants is really the same, then one can flip the
> variable and "make replace OLDNAME=" with the new openblas.  Or just
> pkg_delete -f and install the new one.

The ABI is the same, as it follows an established standard that has
several implementations. There are specialties like using OpenMP or
pthreads under the hood, but the ABI symbols are the same, plus some
non-standard extensions (such as the OpenBLAS function for setting the
number of threads). Basically everything should be able to build with
any variant of BLAS, but unless we have extra magic that hides the
non-standard symbols (like Debian's libblas.so wrappers, AFAIK), things
will need a rebuild to ensure that the new BLAS is correctly used.

> If they are made parallel installable, sort of like guile20 and guile22,
> then there can be a per-package variable to link against various
> versions.

So, a normal thing would be to add BLAS_DEFAULT and possibly
BLAS_ACCEPTED to packages if we determine that some are incompatible
with differing BLAS? I only imagine that being the case when we have
not yet gotten around to patching a build script that has -lblas
hardcoded where we want to insert -lopenblas.
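
To make that concrete, here is a rough sketch of what such a selection
file could look like, modeled on the mk/mpi.buildlink3.mk from the patch
at the end of this message. Nothing of this exists yet: the file name
mk/blas.buildlink3.mk and the variables BLAS_TYPE, BLAS_ACCEPTED and
BLAS_LIBS are names assumed purely for illustration, and I also assume
that wip/OpenBLAS ships a buildlink3.mk.

```make
# Hypothetical mk/blas.buildlink3.mk (sketch only, not existing pkgsrc
# infrastructure). BLAS_TYPE, BLAS_ACCEPTED and BLAS_LIBS are
# illustrative names.

.if !defined(BLAS_BUILDLINK3_MK)
BLAS_BUILDLINK3_MK=	# defined

# Global choice, normally set once in mk.conf.
BLAS_TYPE?=	netlib

# A package can narrow this list before including this file.
BLAS_ACCEPTED?=	netlib openblas

.if empty(BLAS_ACCEPTED:M${BLAS_TYPE})
PKG_FAIL_REASON+=	\
	"${BLAS_TYPE} is not an acceptable BLAS type for ${PKGNAME}."
.elif ${BLAS_TYPE} == "netlib"
_BLAS_PACKAGE=	math/blas
BLAS_LIBS=	-lblas
.elif ${BLAS_TYPE} == "openblas"
_BLAS_PACKAGE=	wip/OpenBLAS	# assumed to provide a buildlink3.mk
BLAS_LIBS=	-lopenblas
.else
PKG_FAIL_REASON+=	"Unknown BLAS_TYPE: ${BLAS_TYPE}"
.endif

# Only pull in a package if one was selected above.
.if defined(_BLAS_PACKAGE)
.include "../../${_BLAS_PACKAGE}/buildlink3.mk"
.endif

.endif	# BLAS_BUILDLINK3_MK
```

A package would then pass ${BLAS_LIBS} to its build instead of a
hardcoded -lblas, and a site would set something like
BLAS_TYPE=openblas in mk.conf.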

I wonder how much logic of the KRB5 stuff applies here. Can packages
override KRB5_DEFAULT by setting KRB5_TYPE? But not in mk.conf
per-package, right?

As the libraries are not conflicting, per the last agreement Jason and
I had, things can be installed in parallel, with differing names of the
libraries, but the same symbols.

> However, if there is a library that links against OpenBLAS,
> and then something else links against that, […]

Yes. This is the mess I want to avoid. I would like all pkgsrc packages
using one BLAS, but still see value in providing the others to users to
use in their own programs. We install pkgsrc as a platform for the
actual end-user applications. Some of them are part of pkgsrc, some are
just the code written by scientists themselves. If pkgsrc uses serial
OpenBLAS, they should be able to knowingly link against the
OpenMP-parallel version and vice versa. They can even do LD_PRELOAD and
run pkgsrc applications with a different BLAS lib from pkgsrc. The user
shall have all power.

> An alternative approach is to bootstrap pkgsrc multiple time, with a

We do that already for differing toolchains. I install
gcc+binutils+MPI, or take the package from Intel, then pkgsrc on top
with those specific tools. Using Intel MKL as external BLAS would also
be part of the plan, possibly, just as I am using external MPI already. I
am using this opportunity to (re-)send two of my small local patches
for this. At least the OpenMPI one has some similarity in spirit to the
current discussion. It's also an industry standard with several
implementations. It even specifies the mpicc/mpif90 wrappers for
linking.

And yes, if one wants a pkgsrc with serial BLAS, one with parallel, one
with unoptimized, separately bootstrapped prefixes are an option. This
is in agreement with the route the EasyBuild framework followed,

	http://easybuild.readthedocs.io/en/latest/Common-toolchains.html#common-toolchains

(this is what emerges in the gap that pkgsrc left by not providing every
scientific application on earth yet;-). Actually, my use of pkgsrc on
top of my custom toolchains goes a long way towards what the EasyBuild
folks also want to achieve. Initially, I just wanted pkgsrc to provide
runtime-switchable trees of the annoying infrastructure stuff with lots
of further dependencies.

> Without understanding the details, it seems that it is basically a bug
> that multiple variants exist.  But I realize coping with bugs is
> sometimes easier than fixing them.

It's not a bug, it's a feature;-) Different BLAS versions exist because
they are optimized to different degrees on differing hardware.
Historically, a vendor of an HPC system delivered a compiler, MPI, and BLAS
that were all tuned to the machine at hand. Scientists came along and
built their code using these provided vendor tools to get optimal
performance.

Machines are a lot less special now, so that you have a vast majority
of HPC installations running more general-purpose BLAS implementations.
There is still special stuff around in the network department, that is
why you will use Cray's MPI implementation when on a Cray system. But
also on a bog-standard cluster, you can buy the Intel compilers with
the MKL, which is usually a bit ahead in performance compared to a GCC
with a free BLAS implementation. The gap has closed considerably with
newer GCCs and with OpenBLAS being tuned for fairly recent Intel CPUs. I
am not sure what the best is on AMD CPUs. Or ARM. Or MIPS (Loongson).
There still is the approach of ATLAS to build an optimized BLAS for any
machine by measuring variants during build, but it is outperformed by
the hand-tuned code of OpenBLAS. Performance is the reason to make
the software (more) messy.

If you don't care about the computing time and just about the resulting
numbers, the plain Fortran BLAS from netlib (math/blas) can be all
you need. But many scientific applications come down, at the core, to
matrix computations that can be sped up by large factors by using
optimized code (like an R benchmark running 7 times as fast with
OpenBLAS as with math/blas … and that is without parallelization).

> I see in wip openblas-devel and OpenBLAS, and they seem to be the same
> version.  Where  are the various versions being considered to be
> switched?

Jason and I were discussing which one of the two to use as a
starting point for math/openblas. wip/openblas-devel is the FreeBSD
approach with differing variants of the OpenBLAS libraries being
installed next to each other (serial, parallel-pthread/openmp).
wip/OpenBLAS is the package I currently use, with a build-time choice
of exactly one variant to install (plus auxiliary files that only make
sense when they match that variant).
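
For illustration, a build-time choice of that kind could be expressed
roughly as below. This is a sketch and not the actual content of
wip/OpenBLAS: OPENBLAS_VARIANT is a variable name made up here, while
USE_THREAD and USE_OPENMP are the switches of the OpenBLAS build system
itself.

```make
# Sketch of a package Makefile fragment; OPENBLAS_VARIANT is a
# hypothetical user-settable variable, not an existing pkgsrc one.

# Read mk.conf before testing user-settable variables.
.include "../../mk/bsd.prefs.mk"

OPENBLAS_VARIANT?=	serial	# one of: serial, pthread, openmp

.if ${OPENBLAS_VARIANT} == "serial"
MAKE_FLAGS+=	USE_THREAD=0
.elif ${OPENBLAS_VARIANT} == "pthread"
MAKE_FLAGS+=	USE_THREAD=1 USE_OPENMP=0
.elif ${OPENBLAS_VARIANT} == "openmp"
MAKE_FLAGS+=	USE_THREAD=1 USE_OPENMP=1
.else
PKG_FAIL_REASON+=	"Unknown OPENBLAS_VARIANT: ${OPENBLAS_VARIANT}"
.endif
```

The parallel-install approach would instead run the build once per
variant and give each resulting library a distinct name.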

The other BLAS implementations are math/blas (with math/lapack) and
wip/atlas.


Alrighty then,

Thomas

-- 
Dr. Thomas Orgis
Universität Hamburg
RRZ / Basis-Infrastruktur / HPC
Schlüterstr. 70
20146 Hamburg
Tel.: 040/42838 8826
Fax: 040/428 38 6270
--- pkgsrc-2016Q4/mk/mpi.buildlink3.mk.orig	2017-02-07 15:03:27.980829557 +0100
+++ pkgsrc-2016Q4/mk/mpi.buildlink3.mk	2017-02-07 15:08:08.816288251 +0100
@@ -9,7 +9,7 @@
 # MPI_TYPE
 #	This value represents the type of MPI we wish to use on the system.
 #
-#	Possible: mpich, openmpi
+#	Possible: mpich, openmpi, native
 #	Default: mpich
 
 .if !defined(MPI_BUILDLINK3_MK)
@@ -21,10 +21,13 @@
 .if exists($(LOCALBASE)/bin/mpicc)
 _MPI_PACKAGE!=	${PKG_INFO} -Q PKGPATH -F ${LOCALBASE}/bin/mpicc
 MPI_TYPE?=	${_MPI_PACKAGE:T}
-.else
+.endif
 
 MPI_TYPE?=	mpich	# default to MPICH due to backward compatibility
-.if $(MPI_TYPE) == "mpich"
+
+.if $(MPI_TYPE) == "native"
+# nothing
+.elif $(MPI_TYPE) == "mpich"
 _MPI_PACKAGE=	parallel/mpi-ch
 .elif $(MPI_TYPE) == "openmpi"
 _MPI_PACKAGE=	parallel/openmpi
@@ -32,8 +35,9 @@
 PKG_FAIL_REASON+=	\
 	"${MPI_TYPE} is not an acceptable MPI type for ${PKGNAME}."
 .endif
-.endif
 
+.if defined(_MPI_PACKAGE)
 .include "../../$(_MPI_PACKAGE)/buildlink3.mk"
+.endif
 
 .endif	# MPI_BUILDLINK3_MK
--- pkgsrc/mk/wrapper/cmd-sink-icc-cc.orig	2015-09-11 10:01:55.878771711 +0200
+++ pkgsrc/mk/wrapper/cmd-sink-icc-cc	2015-09-11 10:03:29.557766286 +0200
@@ -39,7 +39,10 @@
 # icc provided libraries. use the static linking method so binary
 # packages can be used on systems that do not have these libraries
 # available.
-arg=-static-libcxa
+# ThOr: Nope, we take care of library paths here, thank you.
+# Also, this seems to break as early as building perl, which doesn't
+# use any libcxa, but still gets non-PIC binaries where they don't fit.
+arg=
 $debug_log $wrapperlog "    (cmd-sink-icc-cc) pop:  $arg"
 . $buildcmd
 



