
Re: Allow 'native' option to mk/mpi.buildlink3.mk to use the system's MPI implementation



On Mon, 10 May 2021 07:53:18 -0400,
Greg Troxel <gdt%lexort.com@localhost> wrote:

> I also don't follow the no "builtin" logic, if you end up using something
> from base that could have been from pkgsrc.

This is the case for our own builds of openmpi, but not for builds of
MPI libraries shipped with commercial compilers (Intel, PGI). They may
be based on mpich or openmpi, but from our perspective it is ‘the MPI
that came with that compiler’. It might be a specific build shipped
together with the hardware, built to support a certain interconnect.

> Rather than rejecting
> builtin, it seems that the problem is perhaps instead that the way
> variables are set by bl3 and then used by depending packages is not rich
> enough to capture the different ways one is supposed to build.

That is the beauty of it … or lack of beauty, however you put it. MPI
is an old standard, like BLAS, that has differing implementations of
the same API (versioned, so differences arise over time). But it is
more elaborate in that it is not simply a single library that you can
swap out, but possibly a set of them, including components for
specific hardware support. Because of that, the standard way to build
MPI applications is via wrapper compilers. These can be asked which
libraries they would use, and some elaborate builds try to incorporate
that, but normally you are supposed to let them do the setup.
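
For illustration, the wrappers can be asked what they would pass to
the underlying compiler; the exact spelling differs per implementation
(OpenMPI's mpicc understands --showme, MPICH's and Intel MPI's
understand -show):

$ mpicc --showme   (OpenMPI)
$ mpicc -show      (MPICH, Intel MPI)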

Examples of differing MPI and toolchain choices on our system:

PGI compiler with shipped MPI …

$ module switch env env/pgi-19.7_openmpi-3.1.3
$ mpicc -O -o hello.pgi hello.c

… Intel MPI …

$ module switch env env/cuda-9.0.176_intel-17.0.5_impi
$ mpicc -O -o hello.intel hello.c 

… GCC, OpenMPI …

$ module switch env env/2020Q3-gcc-openmpi
$ mpicc -O -o hello.ompi hello.c 


I hope you notice a pattern here ;-) There are specific wrappers for
some implementations and sometimes custom environment variables to
choose the actual compiler, but the idea is that this just wraps $CC
with some custom linker flags. This is a long-established convention.
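
For example, OpenMPI's wrapper honours OMPI_CC to select the
underlying compiler (MPICH and Intel MPI have similar variables,
MPICH_CC and I_MPI_CC), so picking a specific compiler can look like
this:

$ OMPI_CC=clang mpicc -O -o hello.ompi hello.c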

It is also customary for the MPI implementation to ship an ‘mpirun’
command for execution (there used to be agnostic implementations that
tried to figure out the right thing, but nowadays you use the mpirun
that matches the mpicc you used).

$ mpirun hello.ompi 
hello world from processor node002, rank 3 out of 16
hello world from processor node002, rank 7 out of 16
hello world from processor node002, rank 9 out of 16
hello world from processor node002, rank 11 out of 16
hello world from processor node002, rank 13 out of 16
hello world from processor node002, rank 4 out of 16
hello world from processor node002, rank 5 out of 16
hello world from processor node002, rank 0 out of 16
hello world from processor node002, rank 1 out of 16
hello world from processor node002, rank 12 out of 16
hello world from processor node002, rank 15 out of 16
hello world from processor node002, rank 2 out of 16
hello world from processor node002, rank 6 out of 16
hello world from processor node002, rank 8 out of 16
hello world from processor node002, rank 10 out of 16
hello world from processor node002, rank 14 out of 16


So … everything is handled via the mpi* wrapper commands in PATH and
the API in the header. The names of these commands are common across
implementations. Applications in general do not care which MPI they
get. In the past, a ‘native’ MPI was the only one sensibly available,
and that is still the case for things like Cray systems with their
Aries interconnect.
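
A typical application build therefore just points its build system at
the wrapper; a generic autoconf-style invocation (not tied to any
particular package) would be:

$ CC=mpicc ./configure
$ make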

> Is your external MPI something that could be packaged, and then have a
> builtin for it?

Not in general. For a plain desktop/server system, you'll just use
mpich or openmpi from pkgsrc. On a dedicated HPC cluster with a
high-speed interconnect, you want MPI built for that setup in one
place and have everyone use it. We're already rather generic with
Infiniband, so we can do our own build of openmpi, but there's yet
another angle with CUDA, which can also be combined with MPI …
ideally, GPUs talk to each other via RDMA without involving the CPU
much, mediated by MPI.
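
From the application side that just means handing device pointers to
the usual MPI calls; a minimal sketch, assuming a CUDA-aware MPI build
and at least two ranks (the buffer size here is arbitrary):

#include <mpi.h>
#include <cuda_runtime.h>

int main(int argc, char** argv) {
  MPI_Init(NULL, NULL);

  int world_rank;
  MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

  // The buffer lives on the GPU; a CUDA-aware MPI accepts the device
  // pointer directly and may move the data via GPUDirect RDMA.
  double *d_buf;
  cudaMalloc((void**)&d_buf, 1024 * sizeof(double));

  if (world_rank == 0)
    MPI_Send(d_buf, 1024, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
  else if (world_rank == 1)
    MPI_Recv(d_buf, 1024, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD,
             MPI_STATUS_IGNORE);

  cudaFree(d_buf);
  MPI_Finalize();
}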


Alrighty then,

Thomas

-- 
Dr. Thomas Orgis
HPC @ Universität Hamburg
// Author: Wes Kendall
// Copyright 2011 www.mpitutorial.com
// This code is provided freely with the tutorials on mpitutorial.com. Feel
// free to modify it for your own use. Any distribution of the code must
// either provide a link to www.mpitutorial.com or keep this header intact.
//
// An intro MPI hello world program that uses MPI_Init, MPI_Comm_size,
// MPI_Comm_rank, MPI_Finalize, and MPI_Get_processor_name.
//
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char** argv) {
  // Initialize the MPI environment. The two arguments to MPI Init are not
  // currently used by MPI implementations, but are there in case future
  // implementations might need the arguments.
  MPI_Init(NULL, NULL);

  // Get the number of processes
  int world_size;
  MPI_Comm_size(MPI_COMM_WORLD, &world_size);

  // Get the rank of the process
  int world_rank;
  MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

  // Get the name of the processor
  char processor_name[MPI_MAX_PROCESSOR_NAME];
  int name_len;
  MPI_Get_processor_name(processor_name, &name_len);

  // Print off a hello world message
  printf("hello world from processor %s, rank %d out of %d\n",
         processor_name, world_rank, world_size);

  // Finalize the MPI environment. No more MPI calls can be made after this
  MPI_Finalize();
}

