Port-alpha archive


Re: Self baked kernel panics

First of all, thanks for your support and for taking the time to explain. 
Even though I still do not fully understand it, it at least gives me an 
idea, which is worth quite a bit. 

Below are some statistics for generations to come. I remember that the 
FreeBSD make.conf once carried a huge warning not to use -O2 on alpha, as it 
was known to create broken code, and now the kernel only comes up at all when 
optimizations are used... well, the times they are a-changing. So currently 
it is not -mcpu that is to blame, but -O0.

Those kernels were built without the patch from itoh (kudos to you as well, 
of course), because trying to apply the patch produces an error:

nordlicht# patch -C isp.c PATCHFILE
Hmm...  Looks like a unified diff to me...
The text leading up to this was:
|--- src/sys/dev/ic/isp.c.orig   Thu Nov 16 10:32:51 2006
|+++ src/sys/dev/ic/isp.c        Sat Feb  9 15:07:16 2008
Patching file isp.c using Plan A...
Hunk #1 failed at 2957.
Hunk #2 failed at 3204.
2 out of 2 hunks failed--saving rejects to isp.c.rej
Hmm...  Ignoring the trailing garbage.

The same happens when using "patch -C < PATCHFILE".

And now for the GENERIC facts for ricers:

COPTS+=-mcpu=21164a (same as -O0 -mcpu=21164a)
make: 1434.00s real  1258.00s user   184.40s system
Size = 12400k
Panic: yes

make: 1450.00s real  1272.00s user   185.02s system
Size = 13104k
Panic: yes

COPTS+=-Os -mcpu=21164a
make: 2774.00s real  2572.00s user   209.27s system
Size = 8272k
Panic: no

COPTS+=-O2 -mcpu=21164a
make: 3010.00s real  2795.00s user   221.76s system
Size = 9168k
Panic: no

#COPTS+= (seems to automagically use -O2 if COPTS is not defined)
make: 3200.00s real  2982.00s user   224.45s system
Size = 9568k
Panic: no
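
To make the numbers above easier to compare, here is a minimal sketch that expresses each build relative to the default build (COPTS unset). The figures are copied from the list above; the dictionary keys and the helper name relative_to are made up for illustration, and the second entry is labelled "flags not stated" because the list does not say which flags it used.

```python
# Build figures copied from the list above:
# (make "real" time in seconds, kernel size in k, panicked?)
BUILDS = {
    "-mcpu=21164a": (1434.0, 12400, True),
    "second entry (flags not stated)": (1450.0, 13104, True),
    "-Os -mcpu=21164a": (2774.0, 8272, False),
    "-O2 -mcpu=21164a": (3010.0, 9168, False),
    "default (COPTS unset)": (3200.0, 9568, False),
}


def relative_to(builds, baseline):
    """Return {name: (time ratio, size ratio)} relative to the baseline build."""
    base_time, base_size, _ = builds[baseline]
    return {
        name: (real / base_time, size / base_size)
        for name, (real, size, _) in builds.items()
    }


ratios = relative_to(BUILDS, "default (COPTS unset)")
for name, (time_ratio, size_ratio) in ratios.items():
    panicked = "yes" if BUILDS[name][2] else "no"
    print(f"{name:34s} time x{time_ratio:.2f}  size x{size_ratio:.2f}  panic: {panicked}")
```

The two kernels that panic are the ones that build in less than half the time of the default and come out largest, which fits the observation that only the unoptimized builds are affected.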

On Friday, 8 February 2008, Michael L. Hitch wrote:
> On Fri, 8 Feb 2008, Ede Wolf wrote:
> > Still, here are the traces. I admit I have no clue what I am doing, but
> > maybe it is helpful. If you need other data, just let me know.
>    I sort of know what I'm doing, and it's a good start.
> > db> x/i 0xfffffc00004ea978
> > netbsd:isp_start+0x440: stq_u   zero,4(t0)
>    This is the faulting instruction, 0xfffffc00004ea978 is the PC at the
> time of the fault.
> > db> x/i 0xfffffc00004ea978,20
> > netbsd:isp_start+0x440: stq_u   zero,4(t0)
> > netbsd:isp_start+0x444: stq_u   zero,c(t0)
>    This should help me figure out the corresponding location in the source
> code, and provide some clue as to what went wrong.  Gcc 4 scatters the
> code around much more than gcc 3 did, and it can be fun trying to figure
> out how the source code matches up with the instructions.  A gdb version
> of the kernel with the source files and a kernel core dump file makes that
> much simpler (using the "list *address" command), but I'm used to doing it
> the hard way.
> > db> x/i 0xfffffe000c589a66
> > 0xfffffe000c589a66:     sts     f23,-1(v0)
>    This address (the A0 entry from the trap) is actually the unaligned
> address that caused the trap.  The "stq_u   zero,4(t0)" needs an aligned
> address (4 byte alignment at least - I don't know if the alpha would need
> an 8 byte alignment for the stq instruction).  Since it's not on a 4 byte
> alignment, it will crash.
> > db> x/i 0xfffffc00004ea890
> > netbsd:isp_start+0x358: or      zero,v0,t0
>    This is the RA (return address) at the time of the fault, and the call
> to the function where the fault occurred should be the instruction
> preceding this address.  This should help in verifying the location of
> the function that the fault occurred in.
> >>> CPU 0    trap entry = 0x4 (unaligned access fault)
> >>> CPU 0    a0         = 0xfffffe000c589a66  <-- address that caused
>                                                      the fault
> >>> CPU 0    a1         = 0x2c
> >>> CPU 0    a2         = 0x1f
> >>> CPU 0    pc         = 0xfffffc00004ea978  <-- program counter
>                                                      at instruction that
>                                                      faulted
> >>> CPU 0    ra         = 0xfffffc00004ea890  <-- Return address
>                                                      register at time of
>                                                      fault
> >>> CPU 0    pv         = 0xfffffc0000a697c0
> >>> CPU 0    curlwp    = 0xfffffc000fbe9ba0
> >>> CPU 0        pid = 3, comm = scsibus1
> --
> Michael L. Hitch                      mhitch%montana.edu@localhost
> Computer Consultant
> Information Technology Center
> Montana State University      Bozeman, MT     USA
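
The address arithmetic in the explanation quoted above can be checked with a short sketch. The three constants are copied from the trap output; the helper names misalignment and symbol_base are made up for illustration. The only outside fact used is that Alpha instructions are a fixed 4 bytes long, so the instruction preceding the return address sits at ra - 4.

```python
# Values copied from the trap output quoted above.
FAULT_ADDR = 0xfffffe000c589a66  # a0: address that caused the fault
PC = 0xfffffc00004ea978          # shown by ddb as isp_start+0x440
RA = 0xfffffc00004ea890          # shown by ddb as isp_start+0x358

ALPHA_INSN_SIZE = 4  # Alpha instructions are a fixed 4 bytes


def misalignment(addr, size):
    """Bytes by which addr lies past the previous size-byte boundary (size is a power of two)."""
    return addr & (size - 1)


def symbol_base(addr, offset):
    """Start address of a symbol, given an address inside it and its symbol+offset form."""
    return addr - offset


# The faulting address is aligned neither to 4 nor to 8 bytes.
print(misalignment(FAULT_ADDR, 4), misalignment(FAULT_ADDR, 8))

# Both pc and ra resolve to the same start address for isp_start.
base = symbol_base(PC, 0x440)
print(hex(base), hex(symbol_base(RA, 0x358)))

# The instruction preceding the return address, as an offset into isp_start.
call_site = RA - ALPHA_INSN_SIZE
print(f"isp_start+{call_site - base:#x}")
```

Both pc and ra resolve to the same start of isp_start, so the two ddb lines are at least consistent with each other, and the low bits of a0 confirm that the address is off by 2 from a 4-byte boundary.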
