Port-arm archive


Re: Kirkwood hang on boot; possibly uninitialised bss



On 23 May 2012 17:35, Robert Swindells <rjs%fdy2.co.uk@localhost> wrote:
>
> You can inspect netbsd using nm and objdump.
>
> % arm--netbsdelf-nm netbsd | sort > lst1
>
> % grep consinit_called lst1
> c038ab7c d consinit_called.9374
>
> Sorting the output will also let you see which variables are near to
> consinit_called in case it is being overwritten by something else.
>
> % arm--netbsdelf-objdump -j .data -s netbsd > lst2
>
> ...
>  c038ab40 a00901c0 b40901c0 c80901c0 dc0901c0  ................
>  c038ab50 9ca127c0 00000000 00000000 6c000000  ..'.........l...
>  c038ab60 02000000 28aa0ec0 54aa0ec0 28cd0ec0  ....(...T...(...
>  c038ab70 18cd0ec0 00000000 00000000 00000000  ................
>  c038ab80 00c20100 004b0000 00000000 00000000  .....K..........
>  c038ab90 0000ffff 00000000 00000000 c4c10ec0  ................
>  c038aba0 3d050000 04000000 ccbf28c0 74130000  =.........(.t...
>  c038abb0 01000000 f0bf28c0 74130000 02000000  ......(.t.......
> ...
>
> You should be able to use sprintf() at this stage in the boot, you can
> use it to print values to a buffer then pass it to your KW_PUTS()
> function.
>
> It is linking fine for me, how are you building the kernel ?

I'm cross-compiling from a Linux amd64 machine:

  ./build.sh -u -m evbarm kernel=XYZ


OK, so here's where I am currently. In the kernel image, my symbol is
all zero bytes:

  iona% arm--netbsdelf-nm netbsd | sort | grep consinit_called
  c046e7bc d consinit_called.9907
  iona%
  iona% arm--netbsdelf-objdump -j .data -s netbsd | grep -C 2 c046e7b.
   c046e790 f0c034c0 00000000 00000000 6c000000  ..4.........l...
   c046e7a0 02000000 e04915c0 0c4a15c0 908715c0  .....I...J......
   c046e7b0 808715c0 00000000 00000000 00000000  ................
   c046e7c0 00c20100 004b0000 00000000 00000000  .....K..........
   c046e7d0 0000ffff 00000000 f86836c0 78000000  .........h6.x...
  iona%

Unfortunately, calling sprintf() hangs, so I wrote something to print out
hex values myself. It dumps bytes in memory order, so on this little-endian
machine multi-byte values come out byte-reversed; please excuse that.

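  /* Print z bytes starting at q as hex digits, lowest address first. */
  void KW_DUMP(const void *q, size_t z) {
      static const char hex[] = "0123456789ABCDEF";
      const unsigned char *p = q;
      size_t i;

      for (i = 0; i < z; i++) {
          KW_DBG(hex[(p[i] >> 4) & 0xf]);   /* high nibble */
          KW_DBG(hex[p[i] & 0xf]);          /* low nibble */
      }
  }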
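
A variant that formats a 32-bit value most-significant digit first would
avoid the reversed output. This is only a sketch: kw_hex32() and its
caller-supplied buffer are names I made up for illustration, not anything
in the tree. In the kernel the digits would go to KW_DBG() rather than
into a buffer.

```c
#include <stdint.h>

/*
 * Format a 32-bit value as 8 hex digits, most significant nibble first,
 * so the text reads the same way nm and objdump print addresses.
 * out must have room for 9 bytes (8 digits plus the terminating NUL).
 * Hypothetical helper, not part of the NetBSD tree.
 */
static void kw_hex32(uint32_t v, char out[9])
{
    static const char hex[] = "0123456789ABCDEF";
    int i;

    for (i = 0; i < 8; i++) {
        /* Shift the wanted nibble down to bits 3..0; i = 0 is the top nibble. */
        out[i] = hex[(v >> (28 - 4 * i)) & 0xf];
    }
    out[8] = '\0';
}
```

Because it works on the value with shifts rather than on bytes in memory,
the output does not depend on the byte order of the machine.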

The value of that symbol as printed at runtime, and its address:

  {
      void *acic;

      KW_PUTS("consinit_called = ");
      KW_DUMP(&consinit_called, sizeof consinit_called);
      KW_PUTS("\r\n");

      acic = &consinit_called;

      KW_PUTS("&consinit_called = ");
      KW_DUMP(&acic, sizeof acic);
      KW_PUTS("\r\n");

      acic = (void *) (0x8000U + (unsigned) &consinit_called - 0xc0000000U);

      KW_PUTS("phys. 2 &consinit_called = ");
      KW_DUMP(&acic, sizeof acic);
      KW_PUTS("\r\n");

      KW_PUTS("phys. 2 consinit_called = ");
      KW_DUMP(acic, sizeof consinit_called);
      KW_PUTS("\r\n");
  }

Prints:

  consinit_called = FCE1EBCB
  &consinit_called = BCE746C0
  phys. 2 &consinit_called = BC674700
  phys. 2 consinit_called = 00000000
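
To read those dumps as ordinary 32-bit words, the byte pairs have to be
reversed. Here is a small host-side sketch of that conversion; hexval()
and le_dump_to_u32() are hypothetical helper names, and it assumes the
dump is exactly four bytes printed lowest address first, as KW_DUMP does.

```c
#include <stdint.h>

/* Convert one hex digit to its value; returns -1 for anything else. */
static int hexval(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    if (c >= 'A' && c <= 'F')
        return c - 'A' + 10;
    if (c >= 'a' && c <= 'f')
        return c - 'a' + 10;
    return -1;
}

/*
 * Parse 8 hex digits that were printed byte by byte from a little-endian
 * word (lowest address first) and store the word's value in *out.
 * Returns 0 on success, -1 on malformed or short input.
 */
static int le_dump_to_u32(const char *s, uint32_t *out)
{
    uint32_t v = 0;
    int i, hi, lo;

    for (i = 0; i < 4; i++) {
        hi = hexval(s[2 * i]);
        if (hi < 0)
            return -1;          /* also stops at an early NUL */
        lo = hexval(s[2 * i + 1]);
        if (lo < 0)
            return -1;
        /* Byte i of the dump is bits 8*i .. 8*i+7 of the word. */
        v |= (uint32_t)((hi << 4) | lo) << (8 * i);
    }
    *out = v;
    return 0;
}
```

Fed the strings above, this gives 0xCBEBE1FC for the value, 0xC046E7BC for
the address (matching nm), and 0x004767BC for the computed physical
address, which is 0x8000 + (0xC046E7BC - 0xC0000000) as expected.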

Fortunately this pattern (FCE1EBCB, or rather CBEBE1FC as a 32-bit word)
is quite recognisable. I found only one occurrence in the kernel image, so
I assume that's the same pattern I'm seeing. It lives at c015588 relative
to the start of the kernel.
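
For reference, the search amounts to something like the sketch below. It
is not the exact tool I used; count_pattern() is a hypothetical name, and
it assumes the image has already been read into a buffer. The pattern has
to be given in memory order (FC E1 EB CB), since that is how the bytes
sit in a little-endian image.

```c
#include <stddef.h>
#include <string.h>

/*
 * Count occurrences of pat[0..patlen) in buf[0..len), including
 * overlapping ones, and record the offset of the first match in *first.
 * *first is left untouched when there is no match.
 */
static size_t count_pattern(const unsigned char *buf, size_t len,
    const unsigned char *pat, size_t patlen, size_t *first)
{
    size_t i, n = 0;

    if (patlen == 0 || patlen > len)
        return 0;

    for (i = 0; i + patlen <= len; i++) {
        if (memcmp(buf + i, pat, patlen) == 0) {
            if (n == 0)
                *first = i;
            n++;
        }
    }
    return n;
}
```

A return value of exactly 1 is what makes the pattern useful as a
fingerprint; with more than one match the offset alone would not say which
part of the image ended up under the symbol.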

I can also see that pattern from U-Boot:

  Marvell>> md 0046e7bc
  0046e7bc: cbebe1fc 3de0f3fb 7fc4e7af f5a473bf    .......=.....s..

I zeroed it out, just to be sure I'm looking at the right thing:

  Marvell>> mw 0046e7bc 0 4
  Marvell>>
  Marvell>> md 0046e7bc
  0046e7bc: 00000000 00000000 00000000 00000000    ................

And lo and behold, booting prints out zeroes for my symbol's value:

  Marvell>> tftpboot 2000000 netbsd.gz.ub; bootm 2000000
  ...
  consinit_called = 00000000
  &consinit_called = BCE746C0

So to recap:

 - the value in the kernel image is correct (all zeroes);
 - the value in the kernel image after tftpboot copies it
   (to 0x2000000 onwards) is correct;
 - the value after relocating to the load address 0x8000
   is now somehow FCE1EBCB.

So, my current hypothesis is that during relocation, the address of this
symbol is somehow being pointed at the wrong thing. I think the zeroed
bytes are present - somewhere - but the symbol is looking at some other
unrelated part of the image.

It feels like an offset is wrong, somewhere. I don't know where.
Does that sound reasonable?


Next:

 - Try to find where the all-zeroed bytes actually are, after relocating
 - Find what the difference in offset is.

I tried this last night, and I tentatively think this symbol is about
0x318F3C bytes off from where it ought to be. But that's not a nice round
number, so I am dubious of it.


Am I on the right track?

-- 
Kate

