Subject: Re: in_cksum asm chunks.
To: None <Richard.Earnshaw@arm.com>
From: Chris Gilbert <chris@paradox.demon.co.uk>
List: port-arm
Date: 08/31/2001 10:22:15
On Friday 31 August 2001  9:55 am, Richard Earnshaw wrote:
> > Hi,
> >
> > I've just been looking over the asm chunks in in_chksum_arm.c, and
> > noticed that we don't say that they overwrite the condition codes. I was
> > wondering if anyone knew whether this isn't needed on ARM?
>
> It's good form to mention this in the clobber list, but the compiler will
> assume that condition codes are clobbered by ASM statements.

That feels wrong. What if the asm doesn't clobber them? Does the compiler
then have to save the condition codes and reload or recalculate them?

> >  I was also pondering if there's a way
> > to really make sure that all the inputs are separate registers,
>
> Yes you can, for outputs/temporaries that must not overlap inputs, put '&'
> in the constraint.
>
> > I was
> > wondering if you could in fact have multiple outputs, eg:
>
> Yes you can
>
> > #define ADD4	__asm __volatile("	\n\
> > 	ldr	%1,[%3],#4		\n\
> > 	adds	%0,%0,%1		\n\
> > 	adcs	%0,%0,#0\n"		\
> >
> > 	: "=r" (sum), "=r" (tmp1)	\
> > 	: "0" (sum), "r" (w)		\
> > 	: "cc")
> >
> > The docs on asm don't seem to make this very clear.
>
> This operation has a side-effect on w which isn't mentioned in the
> constraints.

Yep, the original doesn't have tmp1 as an output at all; I was trying to get
my head around how it should be.  I had fun a while back when I updated
in_cksum, as it had tmp1-tmp4 initialized to 0, so lo and behold the compiler
decided that it was going to pass the same register in for all the tmp
inputs.  So it looks like the existing constraints are wrong.
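
To make the register-sharing problem concrete, here is a rough sketch of the
single-word step written as a function, with tmp1 declared as an earlyclobber
output ("=&r") so it cannot share a register with the sum input. The name
add4(), the "memory" clobber, and the portable fallback branch are my own
assumptions for illustration; they are not taken from in_cksum_arm.c.

```c
#include <stdint.h>

/*
 * Hypothetical helper: add the 32-bit word at *wp into sum with
 * end-around carry, and advance *wp by one word.
 */
static inline uint32_t
add4(uint32_t sum, const uint32_t **wp)
{
	const uint32_t *w = *wp;
#if defined(__arm__) && !defined(__thumb__)
	uint32_t tmp1;

	/*
	 * %0 = w (in/out), %1 = sum (out), %2 = tmp1 (scratch), %4 = sum (in).
	 * tmp1 is written by the ldr before %4 is read, hence "=&r".
	 * The "memory" clobber is an assumption: it tells the compiler
	 * that the asm reads memory through w.
	 */
	__asm __volatile(
	    "ldr	%2, [%0], #4\n\t"
	    "adds	%1, %4, %2\n\t"
	    "adc	%1, %1, #0"
	    : "=r" (w), "=r" (sum), "=&r" (tmp1)
	    : "0" (w), "r" (sum)
	    : "cc", "memory");
#else
	/* Portable fallback with the same arithmetic, for non-ARM builds. */
	uint64_t t = (uint64_t)sum + *w++;
	sum = (uint32_t)t + (uint32_t)(t >> 32);
#endif
	*wp = w;
	return sum;
}
```

Without the '&', the compiler would be free to put tmp1 in the register that
holds the sum input, and the ldr would then overwrite sum before the adds
reads it.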

> The best way to write this is
>
> #define ADD4	__asm __volatile("		\
> 	ldr	%2, [%0], #4			\n\
> 	adds	%1, %4, %2			\n\
> 	adcs	%1, %1, #0\n"			\
>
> 	: "=r" (w), "=r" (sum), "=&r" (tmp1)	\
> 	: "0" (w), "r" (sum)			\
> 	: "cc")

Thanks, this really helps clarify how asm works (certainly for me).  I'll try
to find time to redo the in_cksum macros; from what you've said they're
currently all wrong, but just happen to work.
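
Since the current macros only happen to work, it is probably worth having a
plain C reference to compare the redone ones against. The following is a
minimal sketch of that idea. The name ref_cksum32() is hypothetical, it only
handles whole 32-bit words, and it returns the folded sum without the final
complement, so it is not a drop-in replacement for in_cksum.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical reference: one's-complement sum of nwords 32-bit words,
 * folded down to 16 bits with end-around carry.
 */
static uint16_t
ref_cksum32(const uint32_t *w, size_t nwords)
{
	uint64_t sum = 0;

	/* Accumulate in 64 bits so no carries are lost along the way. */
	while (nwords-- > 0)
		sum += *w++;

	/* Fold 64 -> 32 bits; the second pass absorbs any new carry. */
	sum = (sum & 0xffffffffu) + (sum >> 32);
	sum = (sum & 0xffffffffu) + (sum >> 32);

	/* Fold 32 -> 16 bits the same way. */
	sum = (sum & 0xffffu) + (sum >> 16);
	sum = (sum & 0xffffu) + (sum >> 16);

	return (uint16_t)sum;
}
```

Running the same buffers through both the asm macros and this function, and
comparing the folded results, should show up any constraint mistakes that
change the arithmetic.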

Cheers,
Chris