Subject: Re: --db_more-- in recent sparc64 kernel
To: None <eeh@netbsd.org>
From: john heasley <heas@shrubbery.net>
List: port-sparc64
Date: 07/12/2001 14:48:31
Thu, Jul 12, 2001 at 05:45:55PM -0000, eeh@netbsd.org:
> 
> 	> What happens if you type `q'?
> 
> 	back to the prom
> 
> 	chain: calling OF_chain(800000, edd0, 1000000, fffb1a80, 18)
> 	Fast Data Access MMU Miss
> 	{0} ok 
> 
> O.K.  When you get this you should type `.trap-registers' and record
> the TT and TPC values.  The TPC values then need to be correlated with
> your kernel image to find out what source line caused the fault.
> `ctrace' may also be useful.

didn't know about those prom commands.  learned something new.  thanks!

{0} ok .trap-registers
%TL:1 %TT:4e %TPC:111b828 %TnPC:111b82c 
%TSTATE:4400001601  %CWP:1 
   %PSTATE:16 AG:0 IE:1 PRIV:1 AM:0 PEF:1 RED:0 MM:0 TLE:0 CLE:0 MG:0 IG:0 
   %ASI:0  %CCR:44  XCC:nZvc   ICC:nZvc

%TL:2 %TT:68 %TPC:f0003048 %TnPC:f000304c 
%TSTATE:4400001501  %CWP:1 
   %PSTATE:15 AG:1 IE:0 PRIV:1 AM:0 PEF:1 RED:0 MM:0 TLE:0 CLE:0 MG:0 IG:0 
   %ASI:0  %CCR:44  XCC:nZvc   ICC:nZvc

%TL:3 %TT:1ff %TPC:fffffffffffffffc %TnPC:fffffffffffffffc 
%TSTATE:8058041404  %CWP:4 
   %PSTATE:414 AG:0 IE:0 PRIV:1 AM:0 PEF:1 RED:0 MM:0 TLE:0 CLE:0 MG:1 IG:0 
   %ASI:58  %CCR:80  XCC:Nzvc   ICC:nzvc

%TL:4 %TT:1ff %TPC:fffffffffffffffc %TnPC:fffffffffffffffc 
%TSTATE:8880001507  %CWP:7 
   %PSTATE:15 AG:1 IE:0 PRIV:1 AM:0 PEF:1 RED:0 MM:0 TLE:0 CLE:0 MG:0 IG:0 
   %ASI:80  %CCR:88  XCC:Nzvc   ICC:Nzvc

%TL:5 %TT:1ff %TPC:fffffffffffffffc %TnPC:fffffffffffffffc 
%TSTATE:4446003500  %CWP:0 
   %PSTATE:35 AG:1 IE:0 PRIV:1 AM:0 PEF:1 RED:1 MM:0 TLE:0 CLE:0 MG:0 IG:0 
   %ASI:46  %CCR:44  XCC:nZvc   ICC:nZvc
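
for anyone reading the dump above by hand: a rough python sketch of how
%TSTATE splits into the CCR/ASI/PSTATE/CWP fields the prom prints under
each trap level.  the bit positions are my reading of the SPARC V9 and
UltraSPARC manuals (an assumption, not something taken from the prom
source), and decode_tstate/describe_tt are made-up helper names.  the
trap-type table only covers the few ranges relevant here.

```python
# Sketch only: field layout assumed from the SPARC V9 TSTATE definition
# (CCR in bits 39:32, ASI in 31:24, PSTATE in 19:8, CWP in 4:0).
# PSTATE bit names follow the labels the PROM prints (AG, IE, PRIV, ...).
PSTATE_FLAGS = {
    "AG": 0, "IE": 1, "PRIV": 2, "AM": 3, "PEF": 4, "RED": 5,
    "TLE": 8, "CLE": 9, "MG": 10, "IG": 11,
}


def decode_tstate(tstate):
    """Split a %TSTATE value into the fields shown by .trap-registers."""
    pstate = (tstate >> 8) & 0xFFF
    fields = {
        "CCR": (tstate >> 32) & 0xFF,
        "ASI": (tstate >> 24) & 0xFF,
        "PSTATE": pstate,
        "CWP": tstate & 0x1F,
        "MM": (pstate >> 6) & 0x3,  # two-bit memory model field
    }
    for name, bit in PSTATE_FLAGS.items():
        fields[name] = (pstate >> bit) & 1
    return fields


def describe_tt(tt):
    """Name a few trap types; anything else is left unclassified."""
    if 0x041 <= tt <= 0x04F:
        return "interrupt_level_%d" % (tt - 0x040)
    if 0x064 <= tt <= 0x067:
        return "fast_instruction_access_MMU_miss"
    if 0x068 <= tt <= 0x06B:
        return "fast_data_access_MMU_miss"
    if 0x06C <= tt <= 0x06F:
        return "fast_data_access_protection"
    return "not classified by this sketch"


if __name__ == "__main__":
    for tl, tt, tstate in [(1, 0x4E, 0x4400001601), (2, 0x68, 0x4400001501)]:
        f = decode_tstate(tstate)
        print("TL %d: TT %#x (%s) PSTATE %#x CWP %d"
              % (tl, tt, describe_tt(tt), f["PSTATE"], f["CWP"]))
```

if the layout assumption holds, the TL:2 entry (TT 68) is the one that
lines up with the "Fast Data Access MMU Miss" the prom reported.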

{0} ok ctrace
PC: 111b828 
Last leaf: call 11246c4    from 111b818 
     0 w  %o0-%o5: (1 f 0 fff3c000 f005eaf0 fff980d8 )

jmpl  80d3f8  freelist        from 111b570 
     1 w  %o0-%o5: (0 20 111b7e0 1142980 102ab98 fff98108 )

call 111b544    from 102dca4 
     2 w  %o0-%o5: (20 fffffffe5b4bbff0 ffffffffd25a2008 1407000 e25e2008 fff98108 )

call 102dc10    from 102de14 
     3 w  %o0-%o5: (2d fff3c000 80a24011 1405400 803d7c fffb1538 )

call 102ddc8    from 10590c8 
     4 w  %o0-%o5: (4e 1453400 ffffffffffffffff a43b12000001 f005eaf0 1000000 )

call 1058f30    from 1059af4 
     5 w  %o0-%o5: (4e 10 0 10 18 18 )

call 1059a78    from 1059358 
     6 w  %o0-%o5: (1142a01 10 0 0 fffb1588 0 )

call 1059330    from 102eb40  
     7 w  %o0-%o5: (1142a00 1142980 15 1407368 1407000 fff3c350 )

call 102eab0    from 102ad38 
     8 w  %o0-%o5: (ffffffffffffffff fff604d0 1142980 fff3c000 f005eaf0 fff980d8 )

jmpl  0    from 102f340 
     9 w  %o0-%o5: (fff3c300 fff3c000 fff71ce8 1142980 102ab98 fff98108 )

call 102f314    from 102ea70 
     a w  %o0-%o5: (35ce8 fff3c000 fff71ce8 1142980 f005eaf0 fff98108 )
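
a small python sketch for pulling the call target and call site out of
ctrace output like the above, so the addresses can be fed to gdb with
the 0x prefix already attached.  the regex is guessed from the lines in
this one trace (an assumption about the prom's output format), and
parse_ctrace is a hypothetical helper name.

```python
import re

# Matches "call <target> from <site>" and "jmpl <target> [label] from <site>".
# Pattern inferred from the trace in this message, not from PROM docs.
FRAME_RE = re.compile(
    r"(call|jmpl)\s+([0-9a-f]+)\s+(?:(\S+)\s+)?from\s+([0-9a-f]+)"
)


def parse_ctrace(text):
    """Return (kind, target, label, call_site) tuples, addresses as ints."""
    frames = []
    for line in text.splitlines():
        m = FRAME_RE.search(line)
        if m:
            kind, target, label, site = m.groups()
            frames.append((kind, int(target, 16), label, int(site, 16)))
    return frames


SAMPLE = """\
Last leaf: call 11246c4    from 111b818
     0 w  %o0-%o5: (1 f 0 fff3c000 f005eaf0 fff980d8 )
jmpl  80d3f8  freelist        from 111b570
call 111b544    from 102dca4
jmpl  0    from 102f340
"""

if __name__ == "__main__":
    for kind, target, label, site in parse_ctrace(SAMPLE):
        # Emit gdb commands with an explicit 0x so gdb does not read
        # the address as decimal.
        print("x/a %#x" % site)
```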

(gdb) x/a 0x111b544
0x111b544 <cnputc>:     0x9de3bf4011005033
(gdb) x/a 0x102dca4
0x102dca4 <db_force_whitespace+148>:    0x4003b62890102020
(gdb) x/a 0x102dc10
0x102dc10 <db_force_whitespace>:        0x9de3bf401100501c
(gdb) x/a 0x102de14
0x102de14 <db_putchar+76>:      0x7fffff7f2100501c
(gdb) x/a 0x10590c8
0x10590c8 <putchar+408>:        0x7fff534090100018
(gdb) x/a 1059af4
Invalid number "1059af4".
(gdb) x/a 0x1059af4
0x1059af4 <kprintf+124>:        0x7ffffd0f9410001d
(gdb) x/a 1059358
0x102a1e:       Cannot access memory at address 0x102a1e
(gdb) x/a 0x1059358
0x1059358 <db_printf+40>:       0x400001c89807a887
(gdb) x/a 0x102eb40
0x102eb40 <db_add_symbol_table+144>:    0x4000a9fc90022200
(gdb) x/a 0x102ad38
0x102ad38 <db_elf_sym_init+416>:        0x40000f5e96100019
(gdb) x/a 0x102f340
0x102f340 <X_db_sym_init+44>:   0x9fc3000090100018
(gdb) x/a 0x102ea70
0x102ea70 <ddb_init+140>:       0x4000022996100012
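
the lookups above can also be done in bulk.  a minimal python sketch,
assuming a sorted table of function start addresses; the table here is
just what the gdb output above implies (reported address minus the
+offset), where a real one would come from the kernel's symbol table.
resolve is a hypothetical helper name.

```python
import bisect

# Function start addresses derived from the gdb output in this message.
# This is a partial table for illustration only.
SYMBOLS = sorted([
    (0x102ab98, "db_elf_sym_init"),
    (0x102dc10, "db_force_whitespace"),
    (0x102ddc8, "db_putchar"),
    (0x102e9e4, "ddb_init"),
    (0x102eab0, "db_add_symbol_table"),
    (0x102f314, "X_db_sym_init"),
    (0x1058f30, "putchar"),
    (0x1059330, "db_printf"),
    (0x1059a78, "kprintf"),
    (0x111b544, "cnputc"),
])
STARTS = [addr for addr, _ in SYMBOLS]


def resolve(addr):
    """Map an address to 'symbol+offset' using the nearest lower symbol."""
    i = bisect.bisect_right(STARTS, addr) - 1
    if i < 0:
        return None  # below the first known symbol
    start, name = SYMBOLS[i]
    return "%s+%d" % (name, addr - start)


if __name__ == "__main__":
    for site in (0x102dca4, 0x102de14, 0x10590c8, 0x1059af4, 0x1059358):
        print("%#x %s" % (site, resolve(site)))
```

caveat: with a partial table like this, an address past the end of a
function still resolves to the nearest lower symbol, so the faulting
TPC 111b828 would come out as cnputc+740, which may well be a different
function in the real image.  also, the two failed commands above are
gdb reading a bare number as decimal: 1059358 decimal is 0x102a1e, and
1059af4 is rejected because of the letters.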

i'll look closer on sunday.  i built a DDB-free kernel and it boots just fine.

> 	> What's probably happening is that there is a call to db_printf()
> 	> with a corrupt string.  db_printf() will attempt to print it and
> 	> generates a `--db_more--' when it thinks you're at the end of the
> 	> page.  It would be useful to know how you got there.
> 	> 
> 	> Is your kernel DEBUG or not?  If not, try compiling with DEBUG to
> 	> get some more boot diagnostics.
> 
> 	i removed the kernel build directory and built another from scratch.
> 	when i booted this kernel, it did not prompt and just went into the
> 	continuous spaces loop.
> 
> 	Boot device: disk  File and args: 
> 	NetBSD IEEE 1275 Bootblock
> 	..>> NetBSD/sparc64 OpenFirmware Boot, Revision 1.3
> 	>> (root@foad, Mon Jul  9 20:11:51 UTC 2001)
> 	loadfile: reading header
> 	elf64_exec: Booting /sbus@1f,0/SUNW,fas@e,8800000/sd@0,0:a/netbsd
> 	1453419@0x1000000+88000@0x1400000+309512@0x14157c0 
> 	symbols @ 0xfff3c300 74+147840+71699 start=0x1000000
> 	chain: calling OF_chain(800000, edd0, 1000000, fffb1a80, 18)
> 
> 	sending a break here dumps back to the prom.  suggestions?
> 
> 	oddly, i built a kernel on my other ultra2 with the same config file
> 	and cvs update from yesterday and it boots fine.  this box is exactly
> 	the same h/w and s/w wise, apart from 1G vs 256M ram and the compiler,
> 	which is the one from ~january while the other is the dwarf toolchain.
> 
> I have been noticing this sort of flakiness recently.  Building
> a kernel with one set of options works, and another set of options
> fails.  I have to attribute that to some sort of toolchain flakiness.
> 
> 	> 	being worked on?  tia.
> 	> 
> 	> No, the debugger should work.  But it appears you're not getting
> 	> far enough for the debugger to initialize.
> 
> 	kgdb too?  last i checked, i read that kgdb did not work.
> 
> kgdb has never worked and there is no effort being made to make it work.

as in "never", or "not at this time"?

> Eduardo