Re: pkg/47906: lang/g95: SEGV occurs when stack address is not aligned 8 bytes at main().
Do you still think my first patch is wrong?
Please discuss the speed question with the g95 project, not in this PR.
2013/7/10 Joerg Sonnenberger <joerg%britannica.bec.de@localhost>:
> The following reply was made to PR pkg/47906; it has been noted by GNATS.
> From: Joerg Sonnenberger <joerg%britannica.bec.de@localhost>
> To: gnats-bugs%NetBSD.org@localhost
> Subject: Re: pkg/47906: lang/g95: SEGV occurs when stack address is not
> aligned 8 bytes at main().
> Date: Wed, 10 Jul 2013 12:12:02 +0200
> On Wed, Jul 10, 2013 at 09:45:01AM +0000, SODA Noriyuki wrote:
> > > I don't believe that comment is true, I was more asking whether it just
> > > works...
> > The comment is true, or at least it once was.
> > I remember an x86 program that made heavy use of double and occasionally
> > ran significantly slower. The cause of the slowness was that
> > the program was accessing doubles on its stack, and the stack was not
> > always 8-byte aligned. It ran slowly when the stack was not 8-byte
> > aligned due to some environment variable settings.
> Keep in mind that we don't keep the stack 64-bit aligned on i386 at all.
> So chances are very high that it won't be preserved anyway. FP
> performance sensitive code wants to use SSE2 anyway.
> > That was more than 10 years ago.
> > But it's better to keep the optimization, unless you are 100% sure
> > that the optimization is really useless on all modern CPUs.
> I'm a lot more concerned about a working state than a questionable
> optimisation. All modern CPUs do memory fetches in terms of cache lines
> and the stack is pretty much guaranteed to be in the cache anyway, so it
> should really not matter.
> > BTW, modern x86 CPUs still impose some alignment restrictions
> > on SSE2 and AVX instructions for efficiency.
> > See page 107 of http://www.agner.org/optimize/optimizing_cpp.pdf
> > for example.
> > 16-byte or 32-byte alignment is better for those instructions, though.
> Correct, but a separate issue. 128-bit alignment is required for many of
> those. Newer GCC has support for doing ad hoc realignment of the
> stack frame, but that effectively would require using gfortran.
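
For readers unfamiliar with the terms used above, here is a minimal sketch of
the address arithmetic behind "N-byte aligned" and behind rounding a stack
pointer down to an aligned boundary. It only models the arithmetic; it is not
code from g95, GCC, or NetBSD. The helper names and the sample stack pointer
value are made up for illustration.

```python
def is_aligned(addr: int, alignment: int) -> bool:
    """Return True if addr is a multiple of alignment (a power of two)."""
    return addr & (alignment - 1) == 0


def align_down(addr: int, alignment: int) -> int:
    """Round addr down to the nearest multiple of alignment.

    The x86 stack grows toward lower addresses, so realigning a stack
    pointer means rounding it down, never up.
    """
    return addr & ~(alignment - 1)


# Hypothetical stack pointer at entry to main(): 4-byte aligned only.
sp = 0xBFBFE9C4

print(is_aligned(sp, 4))   # True
print(is_aligned(sp, 8))   # False: a double stored here straddles an 8-byte boundary
print(hex(align_down(sp, 16)))  # 0xbfbfe9c0, suitable for 128-bit SSE operands
```

A stack pointer rounded down to a 16-byte boundary is also 8-byte aligned, so
realigning for SSE would cover the double case discussed earlier in the thread.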