Subject: Re: Strange segmentation fault trying to run postgresql on
To: Alex Pelts <alexp@broadcom.com>
From: Markus W Kilbinger <mk@kilbi.de>
List: port-cobalt
Date: 04/29/2007 11:16:13
>>>>> "Alex" == Alex Pelts <alexp@broadcom.com> writes:
Alex> It is not the mapping that is the problem but alignment. It
Alex> is trying to store word (32 bits) at the address that is
Alex> aligned on 1 byte (cf). I don't know much about how netbsd
Alex> translates mips exceptions to signals but I think from your
Alex> register dump the problem is alignment.
I've investigated this a bit further. I can reproduce the problem on
my Qube 2 (running a -current kernel and userland) with
pkgsrc/databases/postgresql82(-server). I recompiled main/main.c
with -O0, because otherwise the SIGSEGV seems to be 'hidden' in a
branch delay slot. Running the resulting postgres under gdb yields:
(gdb) run
Starting program: /usr/obj/pkg/databases/postgresql82-server/work.mipsel/postgresql-8.2.3/src/backend/postgres
Program received signal SIGSEGV, Segmentation fault.
0x00573bcc in main ()
(gdb) disas
[...]
0x00573bb8 <main+56>: jalr t9
0x00573bbc <main+60>: nop
0x00573bc0 <main+64>: lw gp,16(s8)
0x00573bc4 <main+68>: move v1,v0
0x00573bc8 <main+72>: lw v0,-13948(gp)
0x00573bcc <main+76>: sw v1,0(v0)
0x00573bd0 <main+80>: lw v0,-13948(gp)
0x00573bd4 <main+84>: lw v0,0(v0)
0x00573bd8 <main+88>: move a0,v0
0x00573bdc <main+92>: lw v0,-32632(gp)
0x00573be0 <main+96>: addiu t9,v0,16308
0x00573be4 <main+100>: jalr t9
(gdb) info registers
zero at v0 v1 a0 a1 a2 a3
R0 00000000 00000000 007e9f53 00830030 00830039 7fffda1d 00000000 7fffda14
t0 t1 t2 t3 t4 t5 t6 t7
R8 7fffda1c 00000000 00000008 00000000 8000001f ffffffe0 7fffd978 006f9a34
s0 s1 s2 s3 s4 s5 s6 s7
R16 7fffd9c0 7fffd8f8 00000001 7fffd8fc 007e9edc 7fffeff0 7dfb2e80 7dfaa000
t8 t9 k0 k1 gp sp s8 ra
R24 000007de 7dbf6da8 00000000 00000000 007ed250 7fffd898 7fffd898 00573bc0
sr lo hi bad cause pc
0000ff13 0009f79c 000000b4 007e9f53 00000014 00573bcc
fsr fir
007b6300 00000000
The problematic instruction seems to be
0x00573bcc <main+76>: sw v1,0(v0)
where 'v0' contains the address '007e9f53', which is not aligned for
a word access (it also shows up in the 'bad' register).
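As a quick sanity check of the register dump, here is a minimal C sketch that tests the faulting address for word alignment and decodes the exception code from the 'cause' value. The helper names (is_word_aligned, exc_code, check_dump) are made up for illustration; the only assumptions are that 'bad' and 'cause' are the usual MIPS CP0 BadVAddr and Cause registers, with ExcCode in bits 6..2.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the gdb register dump above. */
#define BAD_VADDR 0x007e9f53u   /* 'bad' register, same as v0 */
#define CAUSE     0x00000014u   /* 'cause' register */

/* MIPS ExcCode 5: address error exception on a store (AdES). */
#define EXC_ADES 5u

/* Nonzero if addr is a multiple of 4, as lw/sw require. */
static int is_word_aligned(uint32_t addr)
{
    return (addr & 0x3u) == 0;
}

/* Extract the ExcCode field (bits 6..2) from a Cause register value. */
static unsigned exc_code(uint32_t cause)
{
    return (cause >> 2) & 0x1fu;
}

/* Hypothetical self-check: the dump should describe an unaligned store. */
static void check_dump(void)
{
    assert(!is_word_aligned(BAD_VADDR));
    assert(exc_code(CAUSE) == EXC_ADES);
}
```

With the values from the dump, 0x007e9f53 modulo 4 is 3 and the ExcCode field of 0x14 is 5 (address error on store), which fits Alex's reading that this is an alignment fault rather than a mapping problem.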
Astonishingly:
(gdb) x 0x007e9f53
0x7e9f53 <progname>: 0x00000000
... this seems to be the address of
const char *progname;
defined in main/main.c itself!?
So, how can this happen? Shouldn't any pointer-type variable be
adequately aligned by cc/as/ld? (-> Bug in gcc/binutils?)
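For comparison, this is the property one would expect to hold, written as a small C sketch. It is not taken from the PostgreSQL sources: progname_demo is a hypothetical stand-in for progname, the helper names are invented, and _Alignof assumes a C11 compiler. On a healthy toolchain the check on the variable's own address should pass, while the address seen in gdb fails the same test for 4-byte alignment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for 'const char *progname;' in main/main.c. */
const char *progname_demo;

/* Nonzero if addr is a multiple of align (align must be nonzero). */
static int is_aligned_for(uintptr_t addr, size_t align)
{
    return (addr % align) == 0;
}

/* Nonzero if the compiler/linker placed progname_demo at an address
 * that satisfies the alignment requirement of its own type. */
static int progname_demo_ok(void)
{
    return is_aligned_for((uintptr_t)&progname_demo,
                          _Alignof(const char *));
}

/* Hypothetical self-check contrasting the expected and observed cases. */
static void check_alignment(void)
{
    assert(progname_demo_ok());
    /* Address reported by gdb for progname; 4 is the pointer size on
     * 32-bit mipsel. */
    assert(!is_aligned_for((uintptr_t)0x007e9f53u, 4));
}
```

Building something like this with the same cc/as/ld and the same flags as the failing postgres binary might help narrow down whether the misplacement is reproducible outside of PostgreSQL.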
Maybe one of the MIPS gurus can help/comment here...
Markus.