Re: Delay slots

To: coypu%SDF.ORG@localhost
Subject: Re: Delay slots
From: "Maciej W. Rozycki" <macro%linux-mips.org@localhost>
Date: Wed, 22 Jun 2016 01:42:58 +0100 (BST)

On Tue, 21 Jun 2016, coypu%SDF.ORG@localhost wrote:

> > As for "gas" and reordering, I have always viewed assembler reordering 
> > as a very large design error.  Assemblers should assemble; if I meant 
> > something different from what I wrote I would have written the other 
> > thing instead.  Reordering by programs belongs in compilers, not 
> > assemblers.  Note that GCC for years now has used gas in no-reorder 
> > mode for this exact reason -- gas does a horrible job, gcc knows far 
> > more about what should be done.
> > 
> I see your point now, it seems whoever wrote much of the MIPS code in
> NetBSD felt the same - there's set noreorder almost everywhere.

 While GAS's reordering may not be ideal as far as the performance of the 
generated code is concerned, calling its job horrible is, I think, unfair 
and creates FUD, which in turn makes people overuse the `noreorder' mode 
in handcoded assembly, which then breaks on one processor or another.  
I've seen this happen all too often.  You only really need the `noreorder' 
mode where you want to squeeze out every cycle and schedule a delay slot 
instruction that has a data dependency with the preceding jump or branch, 
e.g.:

	jalr	$4
	 move	$4, $2

or

	beq	$2, $3, foo
	 addiu	$2, $2, 1

(delay-slot instructions indented by convention).
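
 In a source file such a sequence would typically be bracketed so that the 
rest of the file stays in the default `reorder' mode.  A minimal sketch 
using the first case, with the surrounding code left out:

	.set	push
	.set	noreorder
	jalr	$4
	 move	$4, $2
	.set	pop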

 Surely any semi-decent compiler will be better at scheduling useful 
instructions into delay slots; however, GAS still gets the basic task of 
ensuring machine code correctness by filling delay slots in handcoded 
assembly right: it swaps branches and jumps which have a delay slot (not 
all do) with the preceding instruction if possible, and otherwise 
schedules a NOP into the delay slot or converts a jump to a compact form 
if there is one.  Similarly it schedules NOPs to fulfil data dependencies 
where the producer delivers its result late -- this in particular includes 
MIPS I memory loads, MIPS I-III coprocessor moves (including both CP0 and 
CP1/FPU), and various corner cases with the HI/LO accumulator registers.
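
 As an illustration of the above, here is roughly what one would expect 
from the default `reorder' mode when assembling for MIPS I.  This is a 
sketch written by hand rather than actual GAS output, so verify with 
`objdump -d' if it matters.  Source like this:

	lw	$2, 0($4)
	addu	$2, $2, $5
	jr	$31

would come out as:

	lw	$2, 0($4)
	nop			# load delay slot
	jr	$31
	 addu	$2, $2, $5	# swapped into the jump's delay slot

whereas here the branch reads the register just written, so no swap is 
possible and a NOP fills the slot instead:

	addiu	$2, $2, 1
	beq	$2, $3, foo
	 nop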

 Consequently you don't have to handle any of this stuff in handcoded 
assembly, which is probably still somewhat better than having 
conditionals sprinkled across your source to put NOPs in various places 
depending on what ISA level you assemble for.  Of course you can instead 
assume the worst and just put the maximum number of NOPs ever required 
everywhere, but then users who only care about newer ISAs will start 
demanding that support for older ISAs be dropped, so that they don't lose 
performance and memory space to these extraneous (from their point of 
view) NOP fillers.
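
 For comparison, a sketch of what such a sprinkled conditional might look 
like in `noreorder' code.  It assumes a preprocessed .S file and that the 
compiler driver defines `__mips' to the ISA level being targeted; check 
that assumption against your toolchain:

	mfc0	$2, $12		# read CP0 Status
#if __mips < 4			/* assumed ISA-level macro */
	nop			# coprocessor move delay, MIPS I-III
#endif
	andi	$2, $2, 1

In `reorder' mode the three conditional lines simply go away and GAS adds 
the NOP where the selected ISA calls for one.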

 Certainly there are complex hazards too where side effects are involved, 
such as with poking at the TLB, but that is never handled automatically, 
be it with the compiler or the assembler -- you need to handcode this 
stuff anyway (or switch to a modern MIPS ISA which has hazard barrier 
instructions such as EHB and JR.HB).
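
 A sketch of what such a handcoded sequence can look like with a hazard 
barrier on a MIPS32r2 processor.  The choice of $2 is only for 
illustration, and which hazards actually apply is a matter for the manual 
of the processor at hand:

	mtc0	$2, $10		# write CP0 EntryHi
	ehb			# clear the execution hazard
	tlbwi			# write the indexed TLB entry

On older ISAs the EHB has to be replaced with a processor-specific number 
of NOPs.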

 So being able to support older ISAs with no pain for the newer ISAs is 
perhaps a worthwhile gain from the "very large design error".  Otherwise 
we probably wouldn't have modern software support anymore for legacy MIPS 
ISAs (anything MIPS IV or below) and consequently computers with those 
older processors.

 FWIW,

  Maciej
