tech-pkg archive

Re: Deciding on which variant(s) of OpenBLAS library to install



On Tue, 27 Feb 2018 14:48:00 -0600,
Jason Bacon <outpaddling%yahoo.com@localhost> wrote:

> Currently, there aren't many finished packages linking against blas, and 
> most of them do not install libraries, so this situation is going to be 
> rare in the short term:

Well, I see most of them as libraries;-)

> netbsddev# pkg-grep blas/build|grep -v wip
> biology/mpqc/Makefile:.include "../../math/blas/buildlink3.mk"
> biology/plink/Makefile:.include "../../math/blas/buildlink3.mk"
> lang/lush/Makefile:.include "../../math/blas/buildlink3.mk"
> math/R-RandomFields/Makefile:.include "../../math/blas/buildlink3.mk"
> math/R-RcppEigen/Makefile:.include "../../math/blas/buildlink3.mk"
> math/R-gstat/Makefile:.include "../../math/blas/buildlink3.mk"
> math/R-quantreg/Makefile:.include "../../math/blas/buildlink3.mk"
> math/R-wle/Makefile:.include "../../math/blas/buildlink3.mk"
> math/R/Makefile:.include "../../math/blas/buildlink3.mk"

R is a set of libraries with an interpreter on top. Users build R
extensions/modules that link against R libs. It is really mandatory
that _at_least_ all R-related packages use the same BLAS! An individual
choice per package would be an immediate mess.

> math/harminv/Makefile:.include "../../math/blas/buildlink3.mk"
> math/ipopt/Makefile:.include "../../math/blas/buildlink3.mk"
> math/itpp/Makefile:.include "../../math/blas/buildlink3.mk"
> math/lapack/Makefile:.include "../../math/blas/buildlink3.mk"

Well, lapack is obviously a library;-) If you have something that uses
lapack and also calls BLAS functions directly, it had better link to
the same BLAS as lapack.
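
To make the lapack point concrete, here is a minimal sketch of the
check I have in mind. It is not pkgsrc code: the function name, the
dependency table and the per-package BLAS table are all made up for
illustration. It walks a package's dependencies and collects every
BLAS implementation that would end up in the final link.

```python
def blas_in_closure(package, depends, links_blas):
    """Collect every BLAS reachable from `package` through its dependencies.

    depends:    hypothetical map, package name -> list of direct dependencies
    links_blas: hypothetical map, package name -> BLAS it links directly
    """
    seen = set()
    found = set()
    stack = [package]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        if current in links_blas:
            found.add(links_blas[current])
        stack.extend(depends.get(current, ()))
    return found


if __name__ == "__main__":
    # Made-up example: an application built against openblas that also
    # uses a lapack built against netlib blas.
    depends = {"myapp": ["lapack"], "lapack": []}
    links_blas = {"myapp": "openblas", "lapack": "netlib"}
    print(blas_in_closure("myapp", depends, links_blas))
```

More than one entry in the result means two libraries with the same
symbols meet in one program.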

> math/octave/Makefile:.include "../../math/blas/buildlink3.mk"

Octave seems to be more monolithic than R, but I guess the argument
about people building extensions (just like Matlab) still applies. If
you add a binding to a C/Fortran library that in turn uses BLAS behind
the scenes, you get a mix-up with the BLAS that Octave directly links
to.

> math/py-numpy/Makefile:.include "../../math/blas/buildlink3.mk"
> math/py-scikit-learn/Makefile:.include "../../math/blas/buildlink3.mk"

Yeah, consider py-numpy and py-scikit-learn _not_ using the same BLAS.
I am not sure how loading of these modules (eggs?) in one Python
process works … do they get separate address spaces? dlopen and manual
symbol mapping? I guess they could also override each other's BLAS,
depending on which one gets loaded first.
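
As far as I know, CPython loads extension modules as shared objects
into the one interpreter process, so there are no separate address
spaces to rely on. A rough way to see which BLAS libraries actually got
mapped is sketched below. It assumes a Linux-style /proc/self/maps
(other systems lay this out differently), and the name pattern and
function names are my own guesses, not anything numpy or pkgsrc
provides.

```python
import os
import re

# Hypothetical name pattern; adjust it to the library names on your system.
BLAS_PATTERN = re.compile(r"lib(open)?blas|libatlas|liblapack", re.IGNORECASE)


def blas_libraries(maps_text):
    """Return sorted paths of mapped files that look like BLAS/LAPACK libs."""
    found = set()
    for line in maps_text.splitlines():
        fields = line.split(None, 5)
        # A Linux-style maps line carries the file path as its sixth field;
        # anonymous mappings have only five fields and are skipped.
        if len(fields) == 6:
            path = fields[5].strip()
            if BLAS_PATTERN.search(os.path.basename(path)):
                found.add(path)
    return sorted(found)


def blas_libraries_in_this_process(maps_file="/proc/self/maps"):
    """Inspect the running process; returns [] where the maps file is absent."""
    if not os.path.exists(maps_file):
        return []
    with open(maps_file) as handle:
        return blas_libraries(handle.read())


if __name__ == "__main__":
    # Import the modules of interest first, then look at what got loaded.
    print(blas_libraries_in_this_process())
```

If this lists two different BLAS libraries after importing both
modules, that is exactly the mix-up I am worried about.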

> I think it would be sufficient to flag the potential for this situation 
> somehow, 

In your list, I see lots of potential in nearly every entry:-/ The big
change we are discussing is not exchanging the BLAS implementation, but
ending the situation where everyone uses the same one.

> On the whole, I still feel like the best approach for the short-term is 
> for ${PREFIX}/lib/lib{blas,lapack} to be the netlib implementation (the 
> only one that's well tested under pkgsrc right now, as atlas, cblas, and 
> openblas exist only in wip)

Note: cblas is _not_ a BLAS implementation. It is a C wrapper library
over a BLAS implementation. OpenBLAS also provides cblas unless you
tell it not to. This is another level of decision: whether to couple
cblas/lapack/lapacke to the BLAS library or decouple them from it. This
is easier when we decide on one global choice of linear algebra
toolchain for dependent packages. We can then use one combination and
stick to it. If every package is free to decide, the mess is inevitable.

>, and every other implementation to be 
> installed under a different name or different path 

I agree with that. We can do that, make them all installable next to
each other. That would probably mean NO_CBLAS=1 for openblas, maybe
something similar with atlas to avoid conflict with the cblas package.

But I really want to convince you that we should go from

> biology/mpqc/Makefile:.include "../../math/blas/buildlink3.mk"
> biology/plink/Makefile:.include "../../math/blas/buildlink3.mk"

to

biology/mpqc/Makefile:.include "../../mk/blas.buildlink3.mk"
biology/plink/Makefile:.include "../../mk/blas.buildlink3.mk"

just like mpi.buildlink3.mk. In fact … there are parallels to MPI.
Would you also rather decide per package which MPI to use? If you
are really sure that a certain package only uses BLAS itself, links to
nothing else that uses BLAS, and offers no libraries to link against,
you can set BLAS.package=anotherone to override a global
BLAS=openblas-serial, right? I just advise against doing this at the
packager level, as it makes the job harder for an admin who wants to
ensure a consistent BLAS.
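
The selection rule I am proposing is small enough to write down. This
is only a sketch of the logic, not of the actual make fragment: the
function names and the override table are invented for illustration,
and "openblas-serial" is just the example value from above.

```python
# Hypothetical site-wide default, mirroring BLAS=openblas-serial above.
SITE_BLAS = "openblas-serial"


def choose_blas(package, overrides=None, site_default=SITE_BLAS):
    """Return the BLAS a package links: its override if set, else the default."""
    overrides = overrides or {}
    return overrides.get(package, site_default)


def blas_in_use(packages, overrides=None, site_default=SITE_BLAS):
    """Return the set of distinct BLAS implementations across all packages."""
    return {choose_blas(pkg, overrides, site_default) for pkg in packages}


if __name__ == "__main__":
    packages = ["biology/mpqc", "biology/plink"]
    # No overrides: everything follows the site default.
    print(blas_in_use(packages))
    # One per-package override: the admin now has two BLAS to keep track of.
    print(blas_in_use(packages, {"biology/plink": "netlib"}))
```

With no overrides the admin has exactly one BLAS to reason about; every
override adds another one to the set.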

> As we (all maintainers of dependent packages) gain more experience with 
> the various blas implementations and work out the bugs in the blas 
> packages,

Well, let's keep netlib blas as the default for the foreseeable future.
No additional testing. Just moving the choice into blas.buildlink3.mk
and lapack.buildlink3.mk. Those who add openblas/atlas to these files
will test them, as will users who dare. At some point one can flip the
switch on the default implementation if there is enough confidence on a
certain platform.

I really, really just want to avoid us entering a dependency hell where
packages mix different libraries that export the same symbols. I want
the different BLAS implementations available for users in their own
programs, but I will document in the environment module help that they
need to assume any package from pkgsrc is linked against the
site-default BLAS, with potential surprises if they choose a different
one in their programs that use libs from pkgsrc.


Alrighty then,

Thomas

-- 
Dr. Thomas Orgis
Universität Hamburg
RRZ / Basisinfrastruktur / HPC
Schlüterstr. 70
20146 Hamburg
Tel.: 040/42838 8826
Fax: 040/428 38 6270
