Re: pkg/47906: lang/g95: SEGV occurs when stack address is not aligned 8 bytes at main().
> On Wed, Jul 10, 2013 at 09:45:01AM +0000, SODA Noriyuki wrote:
> > > I don't believe that comment is true, I was more asking whether it just
> > > works...
> > The comment is true, or at least it once was.
> > I remember an x86 program that used doubles heavily and occasionally
> > ran significantly slower. The reason for the slowness was that
> > the program was accessing doubles on its stack, and the stack was not
> > always 8-byte aligned. It ran slowly when the stack was not 8-byte
> > aligned due to some environment variable settings.
> Keep in mind that we don't keep the stack 64bit aligned on i386 at all.
> So chances are very high that it won't be preserved anyway. FP
> performance sensitive code wants to use SSE2 anyway.
> > That was more than 10 years ago.
> > But it's better to keep the optimization, unless you are 100% sure
> > that the optimization is really useless on all modern CPUs.
> I'm a lot more concerned about a working state than a questionable
> optimisation. All modern CPUs do memory fetches in terms of cache lines
> and the stack is pretty much guaranteed to be in the cache anyway, so it
> should really not matter.
Search Google for "double 8byte align"
and the first result is:
>> 32bit 64bit - Why double in C is 8 bytes aligned?
>> The reason to align a data value of size 2^N on a boundary of
>> 2^N is to avoid the possibility that the value will be split
>> across a cache line boundary.
>> The x86-32 processor can fetch a double from any word boundary
>> (8 byte aligned or not) in at most two, 32-bit memory reads.
>> But if the value is split across a cache line boundary, then
>> the time to fetch the 2nd word may be quite long because of
>> the need to fetch a 2nd cache line from memory. This produces
>> poor processor performance unnecessarily. (As a practical matter,
>> the current processors don't fetch 32-bits from the memory at a
>> time; they tend to fetch much bigger values on much wider busses
>> to enable really high data bandwidths; the actual time to fetch
>> both words if they are in the same cache line, and already cached,
>> may be just 1 clock).
>> So, you should align doubles on 8 byte boundaries for performance
>> reasons. And the compilers know this and just do it for you.
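To make the quoted cache line argument concrete, here is a minimal C sketch.
It is only an illustration: the 64-byte line size and the function name
straddles_cache_line are my assumptions, not something taken from g95 or
from the quoted answer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed cache line size; 64 bytes is common on x86 but not guaranteed. */
#define CACHE_LINE_SIZE 64u

/*
 * Hypothetical helper: returns true if an object of `size` bytes that
 * starts at address `addr` occupies more than one cache line.
 */
static bool straddles_cache_line(uintptr_t addr, size_t size)
{
    assert(size > 0 && size <= CACHE_LINE_SIZE);
    uintptr_t first_line = addr / CACHE_LINE_SIZE;
    uintptr_t last_line = (addr + size - 1) / CACHE_LINE_SIZE;
    return first_line != last_line;
}
```

With these assumptions, an 8-byte double at address 56 stays inside one
line, while the same double at address 60 (4-byte aligned only) spans two
lines. An 8-byte aligned double can never span two lines.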
g95 and most of its applications are math packages.
They focus on performance. And there is no reason to care about
dumb x86 ABI advantages just for silly binary compatibility.
Everything posted in this PR comes down to:
- g95 has code that adjusts the stack pointer for performance reasons.
- g95 also has a silly bug: it doesn't restore the adjusted stack pointer.
The bug has quite likely survived because nowadays few users still use
32-bit x86 for math simulations etc.
- Nonaka's patch just fixes the g95 sources to restore the stack pointer.
It is quite a "right" and reasonable way to fix the original problem.
- Then you replied with a comment: "this is wrong, it shouldn't do that."
- Next you posted "it shouldn't be needed at nothing on x86 requires
- This means you didn't read what the actual problem was at all.
It was not an ABI issue but a dumb careless bug mentioned above.
I.e. your comment was wrong. That's all.
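
The adjust-and-restore pattern described above can be sketched in plain C
by modelling the stack pointer as an integer. This is not the g95 source
and not Nonaka's patch; the names align_down_8 and run_with_aligned_stack
and the `restore` flag are hypothetical, used only to show the difference
between the buggy and the fixed behaviour.

```c
#include <assert.h>
#include <stdint.h>

/* Round a stack pointer value down to an 8-byte boundary
 * (the stack grows towards lower addresses on x86). */
static uintptr_t align_down_8(uintptr_t sp)
{
    return sp & ~(uintptr_t)7;
}

/*
 * Hypothetical model of an entry point that aligns the stack pointer.
 * Returns the stack pointer value the caller sees when the function
 * returns. With restore == 0 it models the bug: the adjusted value leaks
 * back to the caller.
 */
static uintptr_t run_with_aligned_stack(uintptr_t sp_on_entry, int restore)
{
    uintptr_t saved_sp = sp_on_entry;          /* remember the original value */
    uintptr_t sp = align_down_8(sp_on_entry);  /* the performance adjustment */
    assert(sp % 8 == 0);

    /* ... the program body would run here with the aligned stack ... */

    if (restore)
        sp = saved_sp;                         /* the fix: undo the adjustment */
    return sp;
}
```

In this model, an entry value of 0xbfffe7fc comes back unchanged when the
pointer is restored, and comes back as 0xbfffe7f8 when it is not, which is
the kind of mismatch that shows up only when the stack was not already
8-byte aligned at main().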
If you are actually concerned about a working state,
Nonaka's first patch also brings it to a working state,
without visible changes to the upstream design.
There is no sense in complaining in this PR about the original g95 design,
which doesn't cause any problem (except the silly bug)
but may give a performance gain on x86_32.
It looks like you are the guy who can't admit his own stupid and wrong comment
and then posts random irrelevant claims to evade the original issue.
That's quite annoying for all communities.
Please stop that. pkgsrc is not your personal sandbox.
Please try to keep discussions positive, and show respect
to all other developers and contributors, to keep up their motivation.