Port-vax archive


Re: Moving VAX into 21 century :-)



On 2019-08-27 14:56, Anders Magnusson wrote:
Morning Johnny,

Good morning... :-)

Den 2019-08-27 kl. 00:40, skrev Johnny Billquist:
A couple of comments without having caught up where the thread is now...

On 2019-08-26 14:08, Anders Magnusson wrote:
Hi all,

I have been looking at some VAX problems lately, and have found that there are two architectural things that would probably help VAX quite a lot.

1) Change calling convention.
     As described in my previous mail, it would solve a very old well-known performance problem.

It is indeed well known that the CALL/RET instructions on the VAX are very heavy. However, it is not that they are heavy for no reason. I can't really see that the JSB/RSB would be much better, unless we actually think that we want to strip some of that functionality away.

Things that CALL does:
Pushes the requested registers on the stack at entry, and automatically restores them at return.
Saves AP and sets up a new AP.
Saves FP and sets up a new FP.
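
As a rough illustration of the list above, here is a small model of the stack writes CALLS performs, following the call frame layout in the VAX architecture documentation as best understood here. The function name `model_calls` and the trace format are made up for illustration; it models only the pushes and the new FP/AP, not the PSW changes or the jump.

```python
def model_calls(sp, numarg, entry_mask):
    """Model the longword pushes of CALLS; returns (pushes, new_fp, new_ap).

    pushes is a list of (address, description) in the order written.
    """
    pushes = []

    def push(description):
        nonlocal sp
        sp -= 4
        pushes.append((sp, description))

    push(f"numarg={numarg}")
    new_ap = sp                 # AP will point at the argument count
    spa = sp & 3                # low two SP bits are remembered in the frame...
    sp &= ~3                    # ...and SP is longword-aligned
    for reg in range(11, -1, -1):
        if entry_mask & (1 << reg):
            push(f"R{reg}")     # registers named in the entry mask, R11 first
    push("PC")
    push("FP")
    push("AP")
    push(f"SPA={spa},S=1,mask,PSW")
    push("condition handler=0")
    return pushes, sp, new_ap   # new FP equals the final SP


pushes, fp, ap = model_calls(0x1000, numarg=2, entry_mask=0b1100)
for address, description in pushes:
    print(f"{address:#06x}  {description}")
print(f"FP={fp:#x} AP={ap:#x}")
```

Even with an empty entry mask this is six longword writes per call, against a single pushed PC for JSB, which is where the "heavy" reputation comes from.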

Pushing and popping registers will need to be done by the compiler if not using CALL/RET. That should not carry any speed penalty, but it will grow memory needs a little, I would expect. Setting up AP - well, you might have some clever conventions in mind, but I would expect JSB to need similar effort to CALL. Setting up FP - if we don't want tracebacks, and returning without first cleaning up the stack, to work, then this can save some time. But is this really something we'd like?

I wonder how much gain there really is, if you still want all the bells that CALL gives you? I would expect the end cost to come out about the same, but with more memory required.

In the common case we don't need any of the extra stuff that CALLS does, which is why jsb/rsb can be used instead.

- Pass parameters in registers (we have a bunch of them). This avoids memory cycles as well, which is good.

True. Passing parameters in registers is definitely an option. It will require saving and restoring registers at call. And it might be messy to deal with different types of parameters. So it would increase complexity in possibly several ways. But it should enable faster execution.

- No need for AP (Use for TLS?)

You need some way of telling where the arguments are for arguments beyond what you can pass in registers. Are you suggesting just some fixed offset on the stack? This can become ugly and error prone...

- No need for FP (unless we are playing with VLAs which is quite uncommon)

The FP is mainly used for call tracebacks, and automatic cleanup of the stack. I think in general that is a nice thing, but it does also cost, yes.

- No need to save PSL or align stack.  Keeping stack aligned is up to the compiler.

I can't remember. Does CALL really align the stack? How is that then handled at return? Does it realign back to whatever it was previously?

- Keep a "red zone" below stack of 8 words or so to simplify for leaf functions.  Amd64 ABI does this as well.

You mean preallocate some space on the stack? Sure. Doesn't cost anything, and could already be done today. Not sure how much speed it saves.

It won't increase the code size noticeably:
- CALLS is (usually) 9 bytes (7 + the word in the function)
- JSB is 6 bytes.
- PUSHR/POPR take 4 bytes each.

So if no regs need saving we save 3 bytes, otherwise we add 5.
Also we save three bytes if we stay entirely inside the red zone.
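
The byte counts above can be checked with a few lines of arithmetic. This is only a sketch using the sizes quoted in the thread, and it assumes PUSHR and POPR cost 4 bytes each, since that is the reading that makes "add 5" come out; the names are made up for illustration.

```python
# Sizes in bytes, as quoted in the thread.
CALLS_SITE = 7   # CALLS instruction at the call site
ENTRY_MASK = 2   # register save mask word at the function entry
JSB_SITE = 6     # JSB instruction at the call site
PUSHR = 4        # assumed per instruction (immediate word mask)
POPR = 4         # assumed per instruction (immediate word mask)


def calls_bytes():
    """Bytes spent on one CALLS-style call, including the entry mask word."""
    return CALLS_SITE + ENTRY_MASK


def jsb_bytes(saves_registers):
    """Bytes spent on one JSB-style call, with one PUSHR/POPR pair if needed."""
    total = JSB_SITE
    if saves_registers:
        total += PUSHR + POPR
    return total


for saves in (False, True):
    delta = jsb_bytes(saves) - calls_bytes()
    print(f"saves_registers={saves}: JSB is {delta:+d} bytes versus CALLS")
```

Note that this counts a single return point only; a function with several returns would repeat the POPR at each of them.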

I wasn't thinking about space increase in the call, but the return. If you have multiple places of return, you need to both clean the stack, and restore registers at every place you do the return.

2) Make VAX use IEEE floats :-)
     Today virtually no floating point exists that is not IEEE. The only remnant still around is probably the VAX floats.

This can hardly be a performance problem. So now we're talking about some compatibility or general behavior thing? But FP on VAX does have differences, in the hardware, that we cannot pretend don't exist. So is this about stupid programs that make assumptions we fail to meet, while not actually caring about real IEEE FP, or do we really want to be proper IEEE FP? In the latter case we're going to need to emulate it in software, which will have a huge impact on performance.

We're talking about being able to compile and run existing programs without getting a headache.

Ok. I'm sort of fond of that idea, in that I expect that most programs will never care enough anyway. Not in the normal practical sense. However, I think there are slight differences in how a number is actually represented, so any floating values read in will turn out wrong, won't they? Or do you really mean that the values are similar enough that almost all values are represented the same way in the bit pattern?
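
On the bit-pattern question: going by the documented formats, VAX F_floating and IEEE single precision do not share a bit pattern for the same value. The sketch below decodes F_floating from its in-memory byte order and prints it next to Python's IEEE encoding. The function name `vax_f_to_float` is made up for illustration, and the D, G and H formats are not covered.

```python
import math
import struct


def vax_f_to_float(raw: bytes) -> float:
    """Decode 4 bytes of VAX F_floating, in VAX memory order, to a Python float."""
    # F_floating is stored as two little-endian 16-bit words; the first word
    # holds the sign, the exponent and the high fraction bits.
    w0, w1 = struct.unpack("<HH", raw)
    sign = (w0 >> 15) & 1
    exponent = (w0 >> 7) & 0xFF           # excess-128
    fraction = ((w0 & 0x7F) << 16) | w1   # 23 stored bits, hidden bit not stored
    if exponent == 0:
        if sign:
            raise ValueError("reserved operand (VAX floats have no NaN/Inf)")
        return 0.0
    # The mantissa is 0.1fff... in binary, i.e. in the range [0.5, 1).
    mantissa = (1 << 23) | fraction
    value = math.ldexp(mantissa, exponent - 128 - 24)
    return -value if sign else value


vax_one = bytes([0x80, 0x40, 0x00, 0x00])   # 1.0 as F_floating
ieee_one = struct.pack("<f", 1.0)           # 1.0 as IEEE single, little-endian
print(vax_f_to_float(vax_one), vax_one.hex(), ieee_one.hex())
```

For normal numbers, word-swapping an F_floating pattern and reading it as IEEE single gives a value four times too large (different exponent bias and hidden-bit position), so binary float data written in one format cannot simply be reinterpreted in the other.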

  Johnny

--
Johnny Billquist                  || "I'm on a bus
                                  ||  on a psychedelic trip
email: bqt%softjar.se@localhost             ||  Reading murder books
pdp is alive!                     ||  tryin' to stay hip" - B. Idol

