Subject: Re: --db_more-- in recent sparc64 kernel
To: None <eeh@netbsd.org, heas@shrubbery.net>
From: None <eeh@netbsd.org>
List: port-sparc64
Date: 07/12/2001 22:22:54
| Thu, Jul 12, 2001 at 05:45:55PM -0000, eeh@netbsd.org:
| > 
| > 	> What happens if you type `q'?
| > 
| > 	back to the prom
| > 
| > 	chain: calling OF_chain(800000, edd0, 1000000, fffb1a80, 18)
| > 	Fast Data Access MMU Miss
| > 	{0} ok 
| > 
| > O.K.  When you get this you should type `.trap-registers' and record
| > the TT and TPC values.  The TPC values then need to be correlated with
| > your kernel image to find out what source line caused the fault.
| > `ctrace' may also be useful.
|
| didnt know about those prom cmds.  learned something new.  thanks!
|
| {0} ok .trap-registers
| %TL:1 %TT:4e %TPC:111b828 %TnPC:111b82c 
| %TSTATE:4400001601  %CWP:1 
|    %PSTATE:16 AG:0 IE:1 PRIV:1 AM:0 PEF:1 RED:0 MM:0 TLE:0 CLE:0 MG:0 IG:0 
|    %ASI:0  %CCR:44  XCC:nZvc   ICC:nZvc

A level 14 interrupt hit here.  The only L14 I can think of is the
statclock.

|
| %TL:2 %TT:68 %TPC:f0003048 %TnPC:f000304c 
| %TSTATE:4400001501  %CWP:1 
|    %PSTATE:15 AG:1 IE:0 PRIV:1 AM:0 PEF:1 RED:0 MM:0 TLE:0 CLE:0 MG:0 IG:0 
|    %ASI:0  %CCR:44  XCC:nZvc   ICC:nZvc

Here you have an MMU miss, and the PC looks like a PROM address.
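
For reference, the readings above follow from the %TT and %TSTATE
values in the dumps. A minimal sketch of that decoding, assuming the
SPARC V9 trap-type numbering and TSTATE layout as implemented on
UltraSPARC; the function names are hypothetical and this is not code
from the PROM or the kernel:

```python
# Hypothetical helpers for reading `.trap-registers' output by hand.
# Assumes SPARC V9 / UltraSPARC trap-type numbers and TSTATE layout.

def decode_tt(tt):
    """Map a %TT value to a trap name (only the ranges relevant here)."""
    if 0x41 <= tt <= 0x4F:
        return "interrupt_level_%d" % (tt - 0x40)
    if 0x64 <= tt <= 0x67:
        return "fast_instruction_access_MMU_miss"
    if 0x68 <= tt <= 0x6B:
        return "fast_data_access_MMU_miss"
    if 0x6C <= tt <= 0x6F:
        return "fast_data_access_protection"
    return "unknown (0x%x)" % tt

def decode_tstate(tstate):
    """Split %TSTATE into the fields the PROM prints separately."""
    return {
        "cwp": tstate & 0x1F,              # bits 4:0
        "pstate": (tstate >> 8) & 0xFFF,   # bits 19:8
        "asi": (tstate >> 24) & 0xFF,      # bits 31:24
        "ccr": (tstate >> 32) & 0xFF,      # bits 39:32
    }

def decode_pstate(pstate):
    """Return the single-bit PSTATE flags shown in the dump."""
    names = ["AG", "IE", "PRIV", "AM", "PEF", "RED"]  # bits 0..5
    flags = {n: (pstate >> i) & 1 for i, n in enumerate(names)}
    flags["MM"] = (pstate >> 6) & 3                   # bits 7:6
    for i, n in ((8, "TLE"), (9, "CLE"), (10, "MG"), (11, "IG")):
        flags[n] = (pstate >> i) & 1
    return flags

print(decode_tt(0x4E))   # TL:1 entry
print(decode_tt(0x68))   # TL:2 entry
print(decode_pstate(decode_tstate(0x4400001501)["pstate"]))
```

The TL:2 entry decodes with AG set and IE clear, which is what you
would expect while a trap handler is running.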

| (gdb) x/a 0x111b544
| 0x111b544 <cnputc>:     0x9de3bf4011005033

x/i is more useful.
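
As a sketch, assuming gdb is run against a kernel image that still
has its symbols, something along these lines disassembles at the TPC
from the TL:1 entry instead of dumping raw words. The second command
only reports a source line if the image was built with debug
information:

```
(gdb) x/i 0x111b828
(gdb) info line *0x111b828
```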

| (gdb) x/a 0x102dca4
| 0x102dca4 <db_force_whitespace+148>:    0x4003b62890102020
| (gdb) x/a 0x102dc10
| 0x102dc10 <db_force_whitespace>:        0x9de3bf401100501c
| (gdb) x/a 0x102de14
| 0x102de14 <db_putchar+76>:      0x7fffff7f2100501c
| (gdb) x/a 0x10590c8
| 0x10590c8 <putchar+408>:        0x7fff534090100018
| (gdb) x/a 1059af4
| Invalid number "1059af4".
| (gdb) x/a 0x1059af4
| 0x1059af4 <kprintf+124>:        0x7ffffd0f9410001d
| (gdb) x/a 1059358
| 0x102a1e:       Cannot access memory at address 0x102a1e

This is peculiar.

| (gdb) x/a 0x1059358
| 0x1059358 <db_printf+40>:       0x400001c89807a887
| (gdb) x/a 0x102eb40
| 0x102eb40 <db_add_symbol_table+144>:    0x4000a9fc90022200
| (gdb) x/a 0x102ad38

Looks like it was trying to print:

		db_printf("No slots left for %s symbol table", name);

This happens when you try to add more than MAXNOSYMTABS
tables.  That should be 21.

| 0x102ad38 <db_elf_sym_init+416>:        0x40000f5e96100019
| (gdb) x/a 0x102f340

db_elf_sym_init only adds one symbol table.  I'd guess you have
some sort of kernel corruption going on.  Maybe toolchain related.

| > 	kgdb too?  last i checked, i read that kgdb did not work.
| > 
| > kgdb has never worked and there is no effort being made to make it work.
|
| as in "never", or "not at this time"?

kgdb requires kernel support and gdb support.  Knowing the
state of gdb, I'm not interested in tackling the problem.
Getting gdb semi-functional on userland binaries was enough
trouble.

Eduardo