Re: [PATCH] math/fftw and math/fftwf OpenMP and MPI options



On Wed, 15 Jul 2020 16:24:45 -0500,
Jason Bacon <outpaddling%yahoo.com@localhost> wrote:

> Different MPI implementations have different feature sets and somewhat 
> different user interfaces

You mean the runtime side, i.e. the parameters and environment
variables for mpirun, right? The library API (and ABI) is a rather
fixed standard affair, AFAIK.

> like you developed for BLAS is worth attempting.  MPI is orders of 
> magnitude more complex.  Some software may work with one and not others, 
> and some users have personal preferences, or benefit from features that 
> are not universally supported (SLURM integration, CPU affinity, etc).

Especially the latter are reasons for us to install MPI as part of the
platform, along with a batch system (though we did decide not to couple
it with Slurm via old-style PMI … we will use a central libpmix for
future builds, though). If your goal is to have this all wrapped up
inside pkgsrc, for having a cluster with a BSD base image and just
pkgsrc on top for all other userspace software, batch service and all,
then sure, you'd want to wrangle that complexity inside pkgsrc.
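
To spell out what I imagine such wrangling to look like: a central file
in the style of the BLAS framework that MPI-using packages include, and
that picks the implementation from a user-set variable. A rough sketch
only (the file name and PKGSRC_MPI_TYPE are invented here; just
parallel/openmpi and parallel/mpich are existing packages):

	# hypothetical mk/mpi.buildlink3.mk, modelled on the BLAS framework;
	# the user picks the implementation in mk.conf: openmpi or mpich
	PKGSRC_MPI_TYPE?=	openmpi

	.if ${PKGSRC_MPI_TYPE} == "openmpi"
	.  include "../../parallel/openmpi/buildlink3.mk"
	.elif ${PKGSRC_MPI_TYPE} == "mpich"
	.  include "../../parallel/mpich/buildlink3.mk"
	.else
	PKG_FAIL_REASON+=	"Unknown PKGSRC_MPI_TYPE: ${PKGSRC_MPI_TYPE}"
	.endif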

I just wanted to warn you that I won't really appreciate your work, as
it does not fit our software concept ;-)

> 4. Installing a whole separate pkgsrc tree for each MPI implementation 
> would be a redundant time expenditure that I don't think is justified by 
> the modest effort needed to separate implementations within the same tree.

This is understandable given your view of pkgsrc as the whole stack.
For us, it is just a collection of libs and programs as part of our
stack, not the whole stack. And since we only install software relevant
for the cluster, many of our packages will be influenced by the
toolchain choice. There is generic stuff like GTK, Qt and some X11 libs
that will be built redundantly. But if that work really bothers us, we
can submit the builds as batch jobs; they do not take that long
compared to the manual labour needed to patch/fix things anyway.

Regarding wasted disk space: yes, I do limit the number of versions and
differing stacks, also to avoid having too much stuff lying around. But
this is mainly out of principle. Having a 500G or 5T NFS share for
cluster software stacks is a matter of choice, not a technical limit.

Regarding the complexity you mention … you really need to check which
use case actually needs multiple variants of MPI and derived software.
And concerning redundancy … take boost-libs, for example. I add an MPI
option with my (still growing … hm, I thought stuff got merged …) patch
set

	http://src.rrz.uni-hamburg.de/extra/pkgsrc-patches/patches/

and with that, the parallel (Boost.MPI) parts of Boost get built. Would
you create separate installs of boost-libs, and of everything that
depends on it, for each MPI implementation? Oh, and of course we
already install Boost Python bindings for various versions of Python.
Those contain bindings for Boost.MPI, too.
Would you try to separate out the MPI part and fix up builds to use the
MPI stuff from a different directory than the main bulk of boost-libs?
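
To make that concrete, the option hook is roughly of this shape (a
sketch only, not the verbatim patch; see the URL above for the real
thing):

	# added to boost's existing options.mk
	PKG_SUPPORTED_OPTIONS+=	mpi

	.if !empty(PKG_OPTIONS:Mmpi)
	.  include "../../parallel/openmpi/buildlink3.mk"
	# plus a "using mpi ;" entry for user-config.jam so that b2
	# actually builds Boost.MPI
	.endif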

I see a nightmare matrix of prefixes. This is exactly the problem we
solve on HPC systems by installing differing complete environments in
parallel and managing them via environment modules.

Especially given the complexities of MPI runtimes, doesn't it make
sense to settle on one variant that is used by pkgsrc packages? Is
there any open source package that wants MPI and does _not_ work with
OpenMPI? There is scientific software with specific requirements (we
keep the PGI compiler around for building Gaussian, for example, which
is proprietary but comes in source form), but does any of that apply to
pkgsrc packages?

If users care for one implementation or the other, especially where
performance-relevant tuning is a better or worse match for the platform
at hand, wouldn't they just select one MPI lib for pkgsrc and be happy?

You have to consider whether the goal really is to support building all
packages with all MPI choices. The choice of MPI might just as well be
determined by the desired application set.

Do you have a definite example of package A really needing MPI X to
build/run and package B really needing MPI Y? And if so, does the same
hold for middleware packages as opposed to leaf packages? I expect the
discussion to be limited to software with licenses that allow
integration into pkgsrc.

There can be differences in how much of the MPI standard is
implemented, where one implementation might lag behind the other. But
in the long run, stable features should be widely supported, at least
enough to build the application (performance can vary, of course).

> I would probably just set default prefixes for each implementation and 
> allow users who only use one implementation and want their MPI libs in 
> ${PREFIX}/lib to override with an mk variable, e.g. 
> PKGSRC_OPENMPI3_PREFIX. 

Would that also move the boost-libs MPI support to ${PREFIX}/lib, along
with the MPI Python packages, to wherever they should reside? Parallel
NetCDF?
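
Just to make sure I read the proposal correctly, the mechanism would be
something like this, with all names beyond PKGSRC_OPENMPI3_PREFIX
invented by me for illustration:

	# mk.conf of a user who only uses OpenMPI and wants its libs
	# directly under the main prefix (usage of the variable is my guess)
	PKGSRC_OPENMPI3_PREFIX=	${LOCALBASE}

	# package side, purely illustrative: default to a per-implementation
	# subdirectory unless the user overrides it
	OPENMPI3_PREFIX=	${PKGSRC_OPENMPI3_PREFIX:U${PREFIX}/openmpi3}

Every MPI-consuming package would then have to pick up that prefix as
well, which is the part I wonder about.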

> There's no reason we have to impose one 
> iron-clad standard on everyone and waste time arguing about what it 
> should look like.

I just want to spare you/us unnecessary (IMHO) hassle. I imagine a lot
of work with little testing in the field. Speaking of which … I need to
get around to testing the lapacke build …


Alrighty then,

Thomas

-- 
Dr. Thomas Orgis
HPC @ Universität Hamburg

