Re: Moving VAX into 21 century :-)

To: Johnny Billquist <bqt%update.uu.se@localhost>, port-vax List <port-vax%NetBSD.org@localhost>
Subject: Re: Moving VAX into 21 century :-)
From: Anders Magnusson <ragge%ludd.ltu.se@localhost>
Date: Tue, 27 Aug 2019 14:56:17 +0200

Morning Johnny,

On 2019-08-27 at 00:40, Johnny Billquist wrote:

A couple of comments without having caught up where the thread is now...

On 2019-08-26 14:08, Anders Magnusson wrote:
Hi all,
I have been looking at some VAX problems lately, and have found out that there are two architectural things that would probably help VAX quite a lot.
1) Change calling convention.
As described in my previous mail, it would solve a very old well-known performance problem.
It is indeed well known that the CALL/RET instructions on the VAX are very heavy. However, it is not that they are heavy for no reason. I can't really see that JSB/RSB would be much better, unless we actually think that we want to strip some of that functionality away.
Things that CALL does:
Pushes the requested registers on the stack at entry, and automatically restores them again at return.
Saves AP and sets up new AP.
Saves and sets up new FP.
Pushing and popping registers will need to be done by the compiler if not using CALL/RET. That should not have any penalty on speed, but will grow memory needs a little, I would expect.
Setting up AP - well, you might have some clever conventions in mind, but I would expect similar effort to CALL would be needed for JSB.
Setting up FP - if we don't want tracebacks and returning without first cleaning up the stack to work, then this can save some time. But is this really something we'd like?
I wonder how much gain there really is, if you still want all the bells that CALL gives you? I would expect the end cost to come out about the same, but with more memory required.

In the common case we don't need any of the extra stuff that CALLS does, which is why jsb/rsb can be used instead.

- Pass parameters in registers (we have a bunch of them). This avoids memory cycles as well, which is good.
- No need for AP (use it for TLS?)
- No need for FP (unless we are playing with VLAs, which is quite uncommon)
- No need to save PSL or align the stack. Keeping the stack aligned is up to the compiler.
- Keep a "red zone" below the stack of 8 words or so to simplify things for leaf functions. The amd64 ABI does this as well.
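To put a rough number on the "extra stuff", here is a minimal sketch that counts the longwords written to the stack per call. It assumes the standard CALLS frame (condition handler, mask/PSW, saved AP, saved FP, saved PC, plus the argument count pushed by CALLS itself) and a JSB convention that passes arguments in registers. The function names are made up for illustration, and this is only a proxy for memory traffic, not a cycle count.

```python
# Longwords written to the stack per call: a rough proxy for memory traffic.
# Assumption: the CALLS frame holds 5 fixed longwords
# (condition handler, mask/PSW, saved AP, saved FP, saved PC).
CALLS_FRAME_LONGWORDS = 5


def calls_stack_longwords(num_args: int, num_saved_regs: int) -> int:
    """CALLS: the arguments and an argument count are pushed, then the frame."""
    arg_list = num_args + 1  # pushed arguments plus the argument-count longword
    return arg_list + CALLS_FRAME_LONGWORDS + num_saved_regs


def jsb_stack_longwords(num_stack_args: int, num_saved_regs: int) -> int:
    """JSB: only the return PC is pushed; register saves are explicit (PUSHR)."""
    return 1 + num_stack_args + num_saved_regs


if __name__ == "__main__":
    # Two arguments, passed on the stack for CALLS and in registers for JSB.
    for regs in (0, 2, 4):
        print(regs, calls_stack_longwords(2, regs), jsb_stack_longwords(0, regs))
```

Under these assumptions the register saves cost the same either way; the difference is the fixed frame plus the argument list that register passing avoids.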


It won't increase the code size noticeably:
- CALLS is (usually) 9 bytes (7 + the word in the function)
- JSB is 6 bytes.
- PUSHR and POPR take 4 bytes each.

So if no regs need saving we save 3 bytes, otherwise we add 5.
Also, we save three bytes if we stay inside the red zone.
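The byte counts above can be checked with a few lines. This is only a sketch of the arithmetic; it assumes PUSHR and POPR cost 4 bytes each, which is the reading that makes the "add 5" figure work out, and the names are made up for illustration.

```python
# Code bytes per call site plus callee overhead, using the figures quoted above.
CALLS_BYTES = 9  # 7-byte CALLS plus the 2-byte entry mask word in the function
JSB_BYTES = 6
PUSHR_BYTES = 4  # assumption: 4 bytes each for PUSHR and POPR
POPR_BYTES = 4


def size_delta(saves_registers: bool) -> int:
    """Bytes added (positive) or saved (negative) by JSB relative to CALLS."""
    jsb_total = JSB_BYTES
    if saves_registers:
        jsb_total += PUSHR_BYTES + POPR_BYTES
    return jsb_total - CALLS_BYTES


if __name__ == "__main__":
    print(size_delta(False))  # no registers to save
    print(size_delta(True))   # explicit PUSHR/POPR needed
```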

2) Make VAX use IEEE floats :-)
Today virtually no floating point exists that is not IEEE. The only fragment around is probably the VAX floats.
This can hardly be a performance problem. So now we're talking about some compatibility or general behavior thing?
But FP on VAX does have differences, in the hardware, that we cannot pretend don't exist. So is this about stupid programs that are making some assumptions that we fail, while still not actually caring enough about actual IEEE FP, or do we really want to be proper IEEE FP, in which case we're going to need to emulate in software, which will have a huge impact on performance?

We're talking about being able to compile and run existing programs without getting a headache.

If we just want to fake it, we might just want to lie, and short-circuit a couple of obvious differences. And then run with it.
I have done some checking, and if we accept the difference in rounding (VAX uses a different way than IEEE) then there would be (almost) no overhead in the common cases (overhead comes when dealing with INF, NAN and subnormals).
- Use F and G floats. They have the same field widths (sign, exponent, fraction) as IEEE single and double, and are both available on virtually all VAXen.
- Make use of the floating point faults that VAXen can generate to emulate the features missing on VAX.
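To illustrate where the overhead would come from, here is a small sketch that decodes the same sign/exponent/fraction fields both ways. It assumes the documented F_floating encoding (8-bit excess-128 exponent, 23-bit fraction, hidden bit normalized as 0.1f) and works on the logical fields only, ignoring the word order in memory. The function names are made up for illustration.

```python
def decode_ieee_single(sign: int, exponent: int, fraction: int) -> float:
    """Value of an IEEE single given its 1/8/23-bit fields."""
    s = -1.0 if sign else 1.0
    if exponent == 255:  # INF or NAN
        return s * float("inf") if fraction == 0 else float("nan")
    if exponent == 0:    # zero or subnormal
        return s * fraction * 2.0 ** -149
    return s * (1 + fraction / 2**23) * 2.0 ** (exponent - 127)


def decode_vax_f(sign: int, exponent: int, fraction: int) -> float:
    """Value of a VAX F float given the same 1/8/23-bit fields."""
    if exponent == 0:
        if sign:
            # Sign set with a zero exponent is a reserved operand (faults).
            raise ValueError("reserved operand")
        return 0.0       # zero, whatever the fraction bits are
    s = -1.0 if sign else 1.0
    # 0.1f * 2**(e - 128) is the same as 1.f * 2**(e - 129)
    return s * (1 + fraction / 2**23) * 2.0 ** (exponent - 129)


if __name__ == "__main__":
    print(decode_ieee_single(0, 127, 0), decode_vax_f(0, 127, 0))  # bias differs
    print(decode_ieee_single(0, 255, 0), decode_vax_f(0, 255, 0))  # INF vs ordinary number
    print(decode_ieee_single(0, 0, 1), decode_vax_f(0, 0, 1))      # subnormal vs zero
```

Ordinary numbers differ only by a constant exponent offset; the patterns that IEEE uses for INF, NAN and subnormals are the ones that would need the fault-based emulation.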
So essentially just trying to fake it?
I seem to remember there are some differences about non-vanishing values as well, but my brain is fuzzy...

The largest difference will be the rounding, since it works differently on VAX and (default) IEEE, and it's not worth trying to fix that. (OTOH, most HW implementations have problems as well, like the (in-)famous x87 :-)
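A minimal sketch of what the rounding difference looks like, assuming VAX rounds by adding half an LSB to the magnitude and truncating (so ties go away from zero), while default IEEE rounds ties to even. The values stand for an exact result scaled so that the last kept fraction bit is the units digit; the function names are made up for illustration.

```python
import math
from fractions import Fraction

HALF = Fraction(1, 2)


def round_half_even(x: Fraction) -> int:
    """Default IEEE behaviour: round to nearest, ties to even."""
    floor = math.floor(x)
    diff = x - floor
    if diff > HALF or (diff == HALF and floor % 2 == 1):
        return floor + 1
    return floor


def round_half_away(x: Fraction) -> int:
    """Assumed VAX behaviour: add half to the magnitude and truncate."""
    magnitude = math.floor(abs(x) + HALF)
    return magnitude if x >= 0 else -magnitude


if __name__ == "__main__":
    for value in (Fraction(5, 2), Fraction(7, 2), Fraction(-5, 2), Fraction(9, 4)):
        print(value, round_half_even(value), round_half_away(value))
```

The two only disagree on exact ties, and then only when the truncated result is even, which fits the view that the difference is small enough to accept.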

...in theory H floats could also be used as long double since they match the IEEE 128-bit quad precision :-) But since H float is optional it might end up being emulated (maybe not a big problem?)
Comments on this?
Not sure if it's a good idea or not. Not saying for sure it should not be done, but I'd like to better understand what gains we will have, and what we might be sacrificing.

Calling convention:
    + Speed
    - Requires update of toolchain

IEEE Floating point:

    + Compatibility with rest of the world. Simple to compile programs. Possible to use otherwise unusable programs.
    - Requires update of toolchain. Will not support IEEE rounding (would be too slow then).


Not to mention, how come the number of calls has blown up so much?

Much more code, and much more modular code.

Most of the code that was written like 30 years ago on VAX avoided function calls due to their slowness; there are many comments about that in old code. This is not the case at all anymore.


-- R
