Port-vax archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: Patches to fix optimizer bug & C++ exceptions for GCC VAX backend



> On Apr 4, 2016, at 07:51, Maciej W. Rozycki <macro%linux-mips.org@localhost> wrote:
> 
> On Thu, 31 Mar 2016, Jake Hamby wrote:
> 
>> There's one more thing that's broken in the VAX backend which I'd 
>> *really* like to fix: GCC can't compile many of its own files at -O2, as 
>> well as a few other .c files in the NetBSD tree, because it can't expand 
>> an insn pattern. The Makefiles have hacks to add "-O0" on VAX, and when 
>> I remove that workaround, I get GCC assertion failures, all of the form:
>> 
>> /home/netbsd/current/src/external/gpl3/gcc/dist/gcc/builtins.c: In function 'void maybe_emit_chk_warning(tree, built_in_function)':
>> /home/netbsd/current/src/external/gpl3/gcc/dist/gcc/builtins.c:11549:1: error: unrecognizable insn:
>> }
>> ^
>> (insn 295 294 296 25 (set (reg:SI 111)
>>        (subreg:SI (mem:DI (plus:SI (mult:SI (reg:SI 109)
>>                        (const_int 8 [0x8]))
>>                    (reg/f:SI 45 [ D.85015 ])) [7 *_98+0 S8 A32]) 4)) /home/netbsd/current/src/external/gpl3/gcc/dist/gcc/wide-int.h:799 -1
>>     (nil))
>> /home/netbsd/current/src/external/gpl3/gcc/dist/gcc/builtins.c:11549:1: internal compiler error: in extract_insn, at recog.c:2343
>> 0xbd0365 _fatal_insn(char const*, rtx_def const*, char const*, int, char const*)
>> 	/home/netbsd/current/src/tools/gcc/../../external/gpl3/gcc/dist/gcc/rtl-error.c:110
>> 0xbd03fa _fatal_insn_not_found(rtx_def const*, char const*, int, char const*)
>> 	/home/netbsd/current/src/tools/gcc/../../external/gpl3/gcc/dist/gcc/rtl-error.c:118
>> 0xb92a2d extract_insn(rtx_insn*)
>> 	/home/netbsd/current/src/tools/gcc/../../external/gpl3/gcc/dist/gcc/recog.c:2343
>> 0x9612cd instantiate_virtual_regs_in_insn
>> 	/home/netbsd/current/src/tools/gcc/../../external/gpl3/gcc/dist/gcc/function.c:1598
>> 0x9612cd instantiate_virtual_regs
>> 	/home/netbsd/current/src/tools/gcc/../../external/gpl3/gcc/dist/gcc/function.c:1966
>> 0x9612cd execute
>> 	/home/netbsd/current/src/tools/gcc/../../external/gpl3/gcc/dist/gcc/function.c:2015
>> 
>> The failures all seem to be related to trying to read a value from an 
>> array of 64-bit values and loading it into a 32-bit register. It seems 
>> like there should be a special insn defined for this sort of array 
>> access, since VAX has mova* and pusha* variants to set a value from an 
>> address plus an index into a byte, word, long, or 64-bit array (it uses 
>> movab/pushab, put not the other variants). The addressing modes, 
>> constraints, and predicates all get very complicated, and I still need 
>> to understand a bit better what is actually required, and what could be 
>> simplified and cleaned up.
> 
> Please note however that this RTL instruction is a memory reference, not 
> an address load.  So the suitable hardware instruction would be MOVAQ for 
> an indexed DI mode memory load.  If used to set a SI mode subreg at byte 
> number 4 (this would be the second hardware register of a pair a DI mode 
> value is held in) as seen here, it would have to address the immediately 
> preceding register however (so you can't load R0 this way) and it would 
> clobber it.
> 
> So offhand I think you need an RTL instruction splitter to express this, 
> and then avoid fetching 64 bits worth of data from memory -- for the sake 
> of matching the indexed addressing mode -- where you only need 32 bits.  
> At the hardware instruction level I'd use a scratch register (as with 
> MOVAQ you'd have to waste one anyway) to scale the index and then use 
> MOVAL instead with the modified index.  Where no index is used it gets 
> simpler even as you can just bump up the displacement according to the 
> subreg offset.

Thanks for the info! I've discovered a few additional clues which should help, namely the optimizer pass that's introducing the problem. Through process of elimination, I discovered that adding "-fno-tree-ter" will prevent the unrecognizable insn error. Strangely, I can't cause the problem by using "-ftree-ter" and -O0, which seems odd, unless the code is checking directly for a -O flag.

Here's an example of a similar line of code (it seems to be triggered by accessing a 64-bit int array embedded inside a struct) that fails w/ -O, but succeeded with the addition of -fno-tree-ter (assembly output with "-dP"):

#(insn 682 680 686 (set (reg:DI 0 %r0 [orig:136 D.5219 ] [136])
#        (mem:DI (plus:SI (plus:SI (mult:SI (reg/v:SI 11 %r11 [orig:138 j ] [138])
#                        (const_int 8 [0x8]))
#                    (reg/v/f:SI 6 %r6 [orig:55 dp ] [55]))
#                (const_int 112 [0x70])) [5 MEM[base: dp_3, index: _569, step: 8, offset: 112B]+0 S8 A32])) /home/netbsd/current/src/sbin/fsck_ffs/pass1.c:345 11 {movdi}
#     (expr_list:REG_EQUIV (mem:DI (plus:SI (plus:SI (mult:SI (reg/v:SI 11 %r11 [orig:138 j ] [138])
#                        (const_int 8 [0x8]))
#                    (reg/v/f:SI 6 %r6 [orig:55 dp ] [55]))
#                (const_int 112 [0x70])) [5 MEM[base: dp_3, index: _569, step: 8, offset: 112B]+0 S8 A32])
#        (nil)))
        movq 112(%r6)[%r11],%r0 # 682   movdi


That's a pretty complex instruction: base address + (array offset * 8) plus a constant offset! It's the attempt to transform this into setting a reg:SI from a subreg:SI from the mem:DI that it fails to find a matching insn.

I'll definitely look into the areas you mentioned, because I think you're right about where the basic problem is. It sounds like there isn't a bug in the particular optimization pass, but the addressing mode is so complicated that when you change the movq to a movl, you suddenly can't do the array offset correctly in one insn. I also think you're absolutely correct about the need for a scratch register, and that the movaq/movq version would have wasted one anyway.

The other problem area I want to fix is why the generated move instruction to overwrite the return address in expand_eh_return() is being deleted by the optimizer (my guess is that it's a bad dead store elimination, but I could be off-base). That's the part I hacked around to get C++ exceptions working for now with the RTX_FRAME_RELATED_P line I added in gcc/except.c:

#ifdef EH_RETURN_HANDLER_RTX
      rtx insn = emit_move_insn (EH_RETURN_HANDLER_RTX, crtl->eh.ehr_handler);
      RTX_FRAME_RELATED_P (insn) = 1;   // XXX FIXME in VAX backend?
#else

That hack shouldn't be necessary, and it introduces other problems (adds unnecessary .cfi metadata and caused "-Os" compiles to fail for me). So there has to be something else going on, especially since other platforms define EH_RETURN_HANDLER_RTX and seem to depend on the same behavior for their __builtin_eh_return(). But then most of those platforms put the return address in a register, or relative to %sp, not %fp. I could easily see some machine-independent part of the code thinking the frame-pointer reference meant it was a local variable store and not important.

I'll continue to clean up the diffs that I've been working on, and send out something when I've made more progress. I like the "cc" attr code that I've added to fix the overaggressive elimination of CC0 tests, but the problem is that now it's too conservative, because many insns are defined as calls into C code which may or may not generate insns that set the condition codes. I have a few ideas on how to clean up and refactor that code, but first I had to convince myself that most of what's in there now are actually useful optimizations, and they seem to be.

Thanks for the help!
-Jake


Home | Main Index | Thread Index | Old Index