Subject: A utility to work around SA rev2 problems
To: None <port-arm32@netbsd.org>
From: Richard Earnshaw <rearnsha@arm.com>
List: port-arm32
Date: 11/21/1998 14:57:43
This is a multipart MIME message.

--==_Exmh_15175312360
Content-Type: text/plain; charset=us-ascii


This message will only really be of interest to RISC PC owners who have 
older StrongARMs fitted.

Enclosed below is a small utility that I wrote last year when I still had 
a SA rev2 (AKA rev K) -- I thought I'd lost it, but came across it last 
night while trawling an old HD.

It attempts to work around the problem that plagues the rev2 silicon by 
identifying the instructions that will cause problems and moving them to a 
safe location.  The "safe" location is the final page of the text (code) 
segment of the executable, which normally has some spare bytes before the 
start of the data segment.  Having moved the instruction, it then puts a 
branch to the new location where the old instruction used to be so that 
things will continue to work.
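
As a rough illustration of the branch substitution described above, here is 
a minimal sketch.  The helper names make_branch and make_unconditional are 
hypothetical and do not appear in the attached source; the arithmetic 
follows the standard ARM branch encoding (condition in bits 31-28, a 24-bit 
word offset relative to the branch address plus 8).

```c
#include <stdint.h>

/* Hypothetical helper, not part of fix4rev2.c: build the ARM branch that
   replaces an instruction at byte offset `where` with a jump to byte
   offset `reloc`, keeping the original condition field (bits 31-28). */
static uint32_t make_branch(uint32_t inst, uint32_t where, uint32_t reloc)
{
  /* The PC reads as the branch's own address plus 8, and the offset
     field counts words, so subtract 8 and shift right by 2. */
  uint32_t offset = (reloc - (where + 8)) >> 2;

  /* Keep only the 24-bit offset field so that a backward branch cannot
     spill into the opcode or condition bits. */
  return (inst & 0xf0000000u) | 0x0a000000u | (offset & 0x00ffffffu);
}

/* Hypothetical helper: the relocated copy runs unconditionally, since
   the condition has already been tested by the branch, so force the
   "always" condition (0xe). */
static uint32_t make_unconditional(uint32_t inst)
{
  return 0xe0000000u | (inst & 0x0fffffffu);
}
```

For example, a conditional ldm sitting at offset 0xffc and relocated to 
offset 0x1ff8 would become a branch with the same condition and a word 
offset of 0x3fd, while the copy at 0x1ff8 gets condition "always".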

THIS PROGRAM IS NOT A PANACEA.  IT MAY FAIL.  There are several ways it can fail:

i) It doesn't actually fix any case other than ldm reg, {...., pc}; though 
it does, I think, detect all the cases that can potentially fail
ii) It won't fix up a shared library, though I guess that code could be 
added to do similar things for those.
iii) It may incorrectly identify data as a badly located instruction and 
try to fix it (so altering the data) -- this sort of failure may be very 
hard to detect at run time.
iv) There may be insufficient space at the end of the final page of the 
code segment to store the relocated instructions.  Some work could be done 
to the program to make it share like fixes, but this is not implemented.
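
To make points (i), (iii) and (iv) a little more concrete, here is a small 
sketch of the two tests involved.  The bit masks are the ones used in the 
attached source, which applies them only to the last word of each 4096-byte 
page; the names classify, spare_words and PAGE_WORDS are made up for this 
illustration.  Because the test is a pure bit-pattern match, a data word 
that happens to match is indistinguishable from a real instruction, which 
is exactly the risk in (iii).

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_WORDS 1024  /* one 4096-byte page of 32-bit words */

enum suspect { NOT_SUSPECT, LDM_WITH_PC, LDR_TO_PC };

/* Hypothetical helper: classify one word using the masks from the
   attached source.  It cannot tell code from data. */
static enum suspect classify(uint32_t inst)
{
  /* Block transfer, load bit set, pc (bit 15) in the register list. */
  if ((inst & 0x0e108000u) == 0x08108000u)
    return LDM_WITH_PC;

  /* Single-register load whose destination register is pc. */
  if ((inst & 0x0c10f000u) == 0x0410f000u)
    return LDR_TO_PC;

  return NOT_SUSPECT;
}

/* Hypothetical helper: count the zero words at the end of a page,
   which is the room available for relocated instructions. */
static size_t spare_words(const uint32_t page[PAGE_WORDS])
{
  size_t n = 0;

  while (n < PAGE_WORDS && page[PAGE_WORDS - 1 - n] == 0)
    n++;
  return n;
}
```

Note that the attached source also leaves the very last word of the final 
page unused, since an instruction placed there would hit the same problem, 
so the number of fixes has to be at most one less than the spare word count.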

The program prints out the address of each instruction that it relocates.  
If you wish you can have a poke around with gdb to check that each really 
is an instruction that should have been moved.

Having given all the above warnings, my personal experience was that I 
never had problems once a program had been "fixed" with this tool.

The syntax of the program is:

	fix4rev2 infile outfile

If the program completes without error, you then need to mark outfile as 
executable and do some testing; if all seems fine, you can then replace 
your original binary with the "fixed" version.  I personally used to keep 
the original as well, but renamed it.

My personal recommendation would be to only use this tool on programs that 
are giving you problems, but in the end it is entirely up to you.

I can't upload this to the ftp site, or I would put it there.  If someone 
with write access would like to do so, then I have no problems with that.

Patches to the above will be accepted if they seem sensible, but I 
no longer have a SA-rev2, so I can't test that they are doing the right 
thing.

Have fun,

Richard.

--==_Exmh_15175312360
Content-Type: text/plain ; name="fix4rev2.c"; charset=us-ascii
Content-Description: fix4rev2.c
Content-Disposition: attachment; filename="fix4rev2.c"

/*
 * Copyright (c) 1997  Richard Earnshaw.  All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 * 1. Redistributions of source code must retain the above copyright
 *    notice, this list of conditions and the following disclaimer.
 * 2. Redistributions in binary form must reproduce the above copyright
 *    notice, this list of conditions and the following disclaimer in the
 *    documentation and/or other materials provided with the distribution.
 * 3. All advertising materials mentioning features or use of this software
 *    must display the following acknowledgement:
 *      This product includes software developed by Richard Earnshaw.
 * 4. The name of the author may not be used to endorse or promote products
 *    derived from this software without specific prior written permission.
 *
 * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS
 * OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
 * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY
 * DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
 * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
 * GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
 * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
 * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
 * NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
 * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 */

/* Simple utility to find instructions in a binary that cause the
   SA110 rev<3 bug when doing a load-multiple or ldr which writes
   the PC.  */

/* NO ATTEMPT HAS BEEN MADE TO MAKE THIS WORK IN A CROSS ENVIRONMENT.
   IN PARTICULAR THE CODE ASSUMES THAT THE ENDIANNESS OF THE HOST
   MATCHES THAT OF THE CODE BEING FIXED. */

/* This file patches binaries as follows. 

     ldm<cond> reg, {reglist, pc} 
   is replaced with
     b<cond> patcharea
     ...
   patcharea:
     ldm reg, {reglist, pc}

   It also spots, but does not attempt to fix the more complex
     ldr<cond> pc, [pc, reg, lsl #2]  @ <cond> is normally ls
     b <addr>

   which could be replaced with
     str pc, [sp, #-4]!
     b patcharea
     ...
   patcharea:
     add<~cond> sp, sp, #4
     b<~cond> <addr>
     str treg, [sp, #-4]!	@ treg != reg
     ldr treg, [sp, #4]
     ldr treg, [treg, reg, lsl #2]
     str treg, [sp, #4]
     ldmia sp!, {treg, pc}

   This patch would be more dangerous, since it is possible for the second
   instruction to be the target of another branch, which would then be
   incorrect.  I don't think this case can cause a failure unless the
   branch table contains more than 1023 entries (since otherwise the load
   will take a data abort causing the page to be correctly mapped in).

   Perhaps a more significant case which is detected but not fixed is an
   indirect call:

     mov<cond>  lr, pc
     ldr<cond>  pc, [reg, ...]		@ pc not used in address

   This case could be fixed up by moving the ldr instruction to the
   patch area (leaving the condition in the branch).
 */

#include <stdio.h>
#include <a.out.h>
#include <stdlib.h>

/* All programs are loaded at this base address. */
#define LOAD_ADDR 0x1000

typedef struct fix_s
{
  struct fix_s *next;
  unsigned long where;
  unsigned long reloc;
  unsigned long inst;
  unsigned long nextinst;
} fix;

fix *fixlist = NULL;
fix *curfix = NULL;
int fixcount = 0;

static void usage(void)
{
  fprintf(stderr, "Usage: fix4rev2 <infile> <outfile>\n");
}

fix *newfix(unsigned long where, unsigned long inst)
{
  fix *new;

  if ((new = (fix *)(malloc(sizeof(fix)))) == NULL)
    {
      fprintf(stderr, "Out of memory during Malloc\n");
      exit(1);
    }

  new->next = NULL;
  new->where = where;
  new->inst = inst;
  new->reloc = new->nextinst = 0;

  if (fixlist == NULL)
    fixlist = new;
  else
    curfix->next = new;

  curfix = new;
  fixcount++;
  return new;
}

int main(int argc, char *argv[])
{
  FILE *in, *out;
  struct exec header;
  unsigned long size, max_size;
  unsigned long i;
  unsigned long fixup;
  union
  {
    char buf[4096];
    unsigned long a[1024];
  } x;

  if (argc != 3)
    {
      usage();
      exit(1);
    }

  if ((in = fopen(argv[1], "r")) == NULL)
    {
      perror(argv[1]);
      exit(1);
    }

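  /* A short read means the file cannot hold a complete a.out header. */
  if (fread(&header, sizeof (header), 1, in) != 1)
    {
      fprintf(stderr, "%s: Not a valid executable file\n", argv[1]);
      exit(1);
    }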

  if (N_BADMAG(header) || N_GETMAGIC(header) != ZMAGIC)
    {
      fprintf(stderr, "%s: Not a valid executable file\n", argv[1]);
      exit(1);
    }

  /* For some types of Magic, the text segment includes the header in
     the address calculations.  XXX does this affect the length of the
     segment? */
  if (fseek(in, /*N_TXTOFF(header)*/ 0, SEEK_SET) < 0)
    {
      fprintf(stderr, "%s: Not a valid executable file\nBad text offset\n",
	      argv[1]);
      exit(1);
    }

  max_size = header.a_text;
  for (size = 0; size < max_size; size += 4096)
    {
      unsigned long inst;

      fread(x.buf, 1, max_size - size > 4096 ? 4096: max_size - size, in);

      /* Can't fix up a ldr pc, [pc,...] instruction just by moving it, so
	 we have to be a bit more clever */
      if (curfix != NULL && curfix->where + 4 == size
	  /* ldr?? pc, [pc, reg, lsl #2] */
	  && (curfix->inst & 0x0ffffff0) == 0x079ff100)
	{
	  curfix->nextinst = x.a[0];
	  printf("  %08x:\t", size + LOAD_ADDR);
#ifdef DISASS
	  disass(x.a[0]);
#else
	  printf(" %08x\tldr\tpc, [pc, ...]\n", x.a[0]);
#endif
	}

      inst = x.a[1023];
      if ((inst & 0x0e108000) == 0x08108000)
	{
	  printf("%08x:\t", size + 0xffc + LOAD_ADDR);
#ifdef DISASS
	  disass(inst);
#else
	  printf("%08x\tldm\treg, {..., pc}\n", inst);
#endif
	  newfix(size + 0xffc, inst);
	}
      else if ((inst & 0x0c10f000) == 0x0410f000)
	{
	  printf("%08x:\t", size + 0xffc + LOAD_ADDR);
#ifdef DISASS
	  disass(inst);
#else
	  printf("%08x\tldr\tpc, [reg...]\n", inst);
#endif
	  newfix(size + 0xffc, inst);
	}
    }

  if (fixcount == 0)
    {
      fprintf(stderr, "Nothing needs fixing\n");
      exit (0);
    }

  /* HACK, This assumes that max_size is a multiple of 4096.  */
  for (i = 1024; i-- > 0;)
    if (x.a[i] != 0)
      break;

  i++;
  i = 1024 - i;

  printf("%d words of zero in last page of text.\n", i);

  /* We can do better than this.  For load-multiple instructions, all we
     need to do is find an equivalent, non-conditional instruction elsewhere
     in the code segment.  We can then branch to that.  We only need patch
     space for instructions that do not appear elsewhere, and for the ldr pc
     hackery.  */
  if (fixcount > i - 1)
    fprintf(stderr, "Insufficient space for patch instructions (need %d words)\n",
	    fixcount);

  if ((out = fopen(argv[2], "w")) == NULL)
    {
      perror(argv[2]);
      exit(1);
    }

  fseek(in, 0, SEEK_SET);
  curfix = fixlist;
  /* Can't use the last word of the page, since that would fail also! */
  fixup = max_size - 8;
  for (size = 0; size < max_size; size += 4096)
    {
      fread(x.buf, 1, max_size - size > 4096 ? 4096 : max_size - size, in);

      /* This page needs fixing.  */
      if (curfix && (curfix->where & ~0xfff) == size)
	{
	  if ((curfix->inst & 0x0e108000) == 0x08108000) /* ldm */
	    {
	      curfix->reloc = fixup;
	      fixup -= 4;
	      if (x.a[1023] != curfix->inst)
		abort();
	      x.a[1023] = ((curfix->inst & 0xf0000000) | 0x0a000000
			   | ((curfix->reloc - (curfix->where + 8)) >> 2));
	    }
	  else
	    fprintf(stderr, "ldr pc not yet fixed up\n");

	  curfix = curfix->next;
	}

      if (max_size - size > 4096)
	fwrite(x.buf, 1, 4096, out);
    }

  for (curfix = fixlist; curfix != NULL; curfix = curfix->next)
    {
      if (curfix->reloc != 0)
	{
	  int word = (curfix->reloc & 0xfff) >> 2;

	  x.a[word] = 0xe0000000 | (curfix->inst & 0x0fffffff);
	}
    }
  fwrite(x.buf, 1, max_size - (size - 4096), out);

  while ((size = fread(x.buf, 1, 4096, in)) != 0)
    fwrite(x.buf, 1, size, out);

  fclose(in);
  fclose(out);

  return 0;
}

--==_Exmh_15175312360--