Subject: make(1) and .WAIT and .ORDER
To: None <tech-userlevel@netbsd.org>
From: David Laight <david@l8s.co.uk>
List: tech-userlevel
Date: 11/13/2006 21:54:52
make(1) was recently 'updated' to implement a recursive check in the .WAIT
code so that none of the children of the RHS are build before the LHS in
a parallel make (ie -jn)
So given:

    a: b .WAIT c
    c: c1

then 'c1' isn't built until 'b' has been built (previously 'c1' would
get built and only the build of 'c' would be deferred until 'b' was built).

A lot of the NetBSD build makefiles assumed the new behaviour...

The problem is that this was implemented by scanning the tree to add
normal dependencies for all the child nodes (eg c1) against all the nodes
to the left of the .WAIT.  For certain makefiles (eg src/tools/Makefile)
this generates a considerable number of dependency checks - over 1000000.

The code also leaves .ORDER as only applying to the commands of the file
itself, and not its children.
In non-trivial makefiles (eg where targets exist as macros that group
other targets together) and with deep dependency relations both .WAIT
and .ORDER are most useful if they stop the children (aka dependencies)
being built as well as the file itself.

I've modified make and completly reworked the way that jobs are scheduled
so that both .WAIT and .ORDER recursively apply, but without requiring
more than a simple 'visit each node once' scan of the tree to set it up.

An additional benefit is that 'make -j1' now builds things in (much the)
same order as a non-parallel make.

The new code process the main target left to right, replacing nodes with
unmade children with the child list, so given (and assuming the files
all need building, and there are commands...):

    a: b d
    b: c1 c2

to build 'a', we first put 'a' onto the list to build.
Then we remove the head 'a', find it has unmade children, to stuff them into
the list before the current head - giving 'b' 'd'.
Remove 'b', add its children giving 'c1' 'c2 'd'
Remove 'c1', build it, decrement count of unmade children on every parent,
none go to zero.
remove 'c2', build, unmade child count of 'b' becomes 0, add 'b' to list.
....

For a parallel make job start and finish is asynchronous, but the mechanism
above works fine.

Implementing .WAIT is a SMOP, a .WAIT node is added into the list, and
dependencies added to everything to its left.  An addition nothing on its RHS
is added to the work list until the .WAIT node itself is build (when all its
parents get rescheduled in order to follow the child list past the (now made)
.WAIT node.  so:

    a: b c .WAIT_1 d .WAIT_2 e

effectively has hidden dependencies:
    .WAIT_1: b c
    .WAIT_2: .WAIT_1 d

Thus .WAIT terms only affect the dependecy list on which they appear.
For instance:
    a: a1 a2
    a1: b .WAIT c
    a2: c .WAIT b
won't end up in a deadlock - indeed 'b' and 'c' are likely to be built at
the same time.  (Indeed it is impossible to generate a deadlock with .WAIT)

.ORDER processing is a little trickier.  '.ORDER a b' has to cause 'a' to
be build before 'b' wherever they appear in the tree. It also only applies
if both 'a' and 'b' are to be built (so is different from 'b: a').
This requires an extra set of dependecies between the nodes.

I've implemented .ORDER by checking the order predecessors at the time
a node is removed from the work list, if any are unmade (and need to be made)
then the node isn't build, and its children aren't added.  When one of the
predecessor nodes gets build, the dropped (now sucessor) node is rechecked.

This has the side effect that:
    a: b
    .ORDER b a
is a deadlock - which might not, at first glance, be invalid syntax.

Non-recursive .ORDER could be implemented, but is really less useful.

The modified make seems to build NetBSD (after a minor change to the
perversities in src/tools/makefile - which have never done what the
comments suggest they do!).

Any comments/thought before I commit the changes ?

	David

-- 
David Laight: david@l8s.co.uk