NetBSD-Bugs archive


Re: toolchain/47503: Request automated addition of gfortran to base compiler set



On 02/04/2013 15:20, Mark Davies wrote:
The following reply was made to PR toolchain/47503; it has been noted by GNATS.

From: Mark Davies<mark%ecs.vuw.ac.nz@localhost>
To: gnats-bugs%netbsd.org@localhost
Cc:
Subject: Re: toolchain/47503: Request automated addition of gfortran to base 
compiler set
Date: Tue, 5 Feb 2013 10:15:29 +1300

  On Saturday 26 January 2013 04:05:03 you wrote:
  >   I would agree that it *should* be fixed, but it's an awfully squirrelly
  >   problem and I think it would be much easier and much safer to build
  >   gfortran as part of the base.
  >
  >   I've been wrangling with these issues for over a year and a complete
  >   base compiler suite seems like the only clean, safe solution.

  By default pkgsrc uses g95 as its fortran in the situation that gcc is the
  base compiler but no fortran is provided.  Does this not work for you?
  g95 builds and works fine for me on ArchLinux as well as NetBSD.

Hello Mark,

Thanks for following up.

Let me fill you in on where I've been and where I hope to go with this. Sorry for the novel, but I want to make sure my motivations are clear.

I support high performance computing at UW - Milwaukee, including a fairly large RHEL cluster, as well as some open source workstations and Macs.

HPC clusters are dominated by CentOS and RHEL, due to the need to run commercial software like Abaqus and ANSYS. These distributions have very poor support for running open source applications, though. The RPMs in the Yum repository tend to be outdated due to Red Hat's need to maintain binary compatibility for older closed-source applications. The Yum repo is rather small anyway, and doesn't contain much to support scientific computing.

At most HPC sites, they do cave-man installations of most of the open source software and use "modules" (http://modules.sourceforge.net/) to control PATH and other environment variables in order to enable individual applications.

I see a huge potential for pkgsrc to improve this situation. It has the potential to:

a) Save countless man-hours that are currently being wasted on manual installations.
b) Enable the use of software that might not otherwise be explored because it's too difficult for a physicist, biologist, or engineer to install.
c) Allow researchers to easily have the same software installed on their Mac, open source desktop, and the cluster. This is going to be a key selling point for pkgsrc in research computing.

However, there are some basic foundational issues like Fortran support that need to be solidified across platforms first.

G95 is based on a very old (actually experimental) version of the gfortran compiler. Fortran support in GCC didn't really mature until around 4.4. Earlier versions are generally untrusted in the HPC community, and a lot of code is still being built with commercial compilers. Some recent versions of software won't even compile under G95.

Also, for HPC, a good optimizer is indispensable, since it can save thousands of CPU-hours on a busy cluster. GCC made huge improvements to the optimizer between 4.2 and 4.5, and most HPC sites are now running 4.4 or later.

If we're going to use a package to support Fortran, it would have to be a recent GCC package, not G95.

We could get by with a GCC package on NetBSD, but since GCC 4.5 is now being used as the base compiler, and pkgsrc already contains the logic to use a base gfortran if it's present, I think it would make sense to have gfortran in the base (at least as an option). This would ensure object code compatibility with code compiled by the base gcc and g++ as well. I think it would take fewer man-hours in the long run to put it in the base, and the end result would be cleaner and safer. I've used GCC ports on FreeBSD for a long time and they mostly work fine, but I have run into a few issues due to mixing gcc versions.

RHEL and CentOS already include gfortran in the base, so a GCC package is unnecessary here.

OS X presents a different problem. There, the base compiler is either GCC 4.2 or earlier, or clang. Even if we had working GCC packages on OS X, using them would guarantee that we're mixing tool chains.

At best, on Snow Leopard and earlier, we'd be building most of the packages with an old GCC, which doesn't optimize nearly as well as later GCC releases.

Hence, a gfortran package that works on OS X isn't of much interest, and we don't need one at all on CentOS and RHEL.

I'm developing a solution for OS X that installs the GCC 4.6 collection outside of pkgsrc, from which pkgsrc can be bootstrapped. That way, everything in pkgsrc (C, C++, and Fortran) will be built with a modern compiler from the same build, and will benefit from a good optimizer, which is what the HPC community requires.

We're pretty close to having a solid foundation for scientific packages on NetBSD, CentOS 6, and OS X. I have a lot of packages in wip already, but we need to clean up a few more things before we start committing most of them.

Once we get a few HPC packages working well (including some with OpenMPI), I plan to start promoting pkgsrc to the research computing community via our website and presentations at HPC conferences.

From there, I think things could really take off. We'd be able to recruit more Linux and Mac users to become packagers, as well as promote NetBSD as a research computing platform.

Thanks again.

--
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Jason W. Bacon
jwbacon%tds.net@localhost
http://personalpages.tds.net/~jwbacon
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~



