NetBSD-Bugs archive


Re: port-mips/57680: printf("%.1f") shows wrong results on R3000mipseb



The following reply was made to PR port-mips/57680; it has been noted by GNATS.

From: Taylor R Campbell <riastradh%NetBSD.org@localhost>
To: gnats-bugs%netbsd.org@localhost
Cc: port-mips-maintainer%netbsd.org@localhost, gnats-admin%netbsd.org@localhost,
	netbsd-bugs%netbsd.org@localhost, tsutsui%ceres.dti.ne.jp@localhost
Subject: Re: port-mips/57680: printf("%.1f") shows wrong results on R3000mipseb
Date: Tue, 14 Nov 2023 02:00:07 +0000

 > Date: Tue, 14 Nov 2023 02:20:50 +0900
 > From: Izumi Tsutsui <tsutsui%ceres.dti.ne.jp@localhost>
 > 
 > I wrote:
 > 
 > > At least changing ${DESTDIR}/usr/include/mips/fenv.h from
 > > >> static inline fpu_control_t
 > > >> __rfs(void)
 > > to
 > > >> static __noinline fpu_control_t
 > > >> __rfs(void)
 > > also solves the problem, but asm outputs are completely different
 > > in these two cases.
 > 
 > It looks like gcc (7.5.0 from NetBSD 9.3) -O2 drops some nops
 > if __rfs() is defined as inline:
 
 This is weird, but it is not necessarily a problem.  I checked some of
 the diffs you quoted below, and they don't appear to use the register
 being loaded until at least one other instruction has intervened, so
 there are no load delay hazards even without the nops.  For example:
 
 >          :	c7a00024 	lwc1	$f0,36(sp)
 > +        :	00000000 	nop
 >          :	c7a10020 	lwc1	$f1,32(sp)
 > +        :	00000000 	nop
 >          :	e7a0002c 	swc1	$f0,44(sp)
 >          :	e7a10028 	swc1	$f1,40(sp)
 
 Without the nops this is:
 
 	lwc1	$f0,36(sp)
 	lwc1	$f1,32(sp)
 	swc1	$f0,44(sp)
 	swc1	$f1,40(sp)
 
 So the load and store of $f0 are separated by an instruction that
 doesn't involve $f0, and similarly for $f1.
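
 As a rough illustration of that check (a sketch, not anything taken
 from the PR or from gcc), the rule "a loaded register must not be read
 by the very next instruction" can be written down in C.  The struct
 insn model, the FPR()/GPR_SP register numbering, and the function name
 find_load_delay_hazard are all hypothetical, and the model covers only
 the one-instruction load delay, not other coprocessor hazards:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical register numbering: GPRs are 0..31, FPRs are 32..63. */
#define GPR_SP	29
#define FPR(n)	(32 + (n))

/* Hypothetical, simplified model of one instruction (not a decoder). */
struct insn {
	int is_load;	/* nonzero if this loads a register from memory */
	int dst;	/* register written, or -1 if none */
	int src[2];	/* registers read; -1 marks an unused slot */
};

/*
 * Return the index of the first instruction that reads a register
 * loaded by the immediately preceding instruction, or -1 if no such
 * pair exists.  Only the single load delay slot is modelled.
 */
int
find_load_delay_hazard(const struct insn *insns, size_t n)
{
	size_t i;
	int j;

	assert(insns != NULL || n == 0);

	for (i = 1; i < n; i++) {
		if (!insns[i - 1].is_load)
			continue;
		for (j = 0; j < 2; j++) {
			if (insns[i].src[j] != -1 &&
			    insns[i].src[j] == insns[i - 1].dst)
				return (int)i;
		}
	}
	return -1;
}
```

 Fed the four-instruction sequence above (lwc1 $f0, lwc1 $f1, swc1 $f0,
 swc1 $f1), this returns -1, since each store reads a register whose
 load is two instructions back.  An lwc1 $f0 followed directly by
 swc1 $f0 would return 1.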
 
 Do you see any nops there that appear to be needed?
 
 It would be nice if we could isolate this to a smaller subroutine than
 dtoa, which is gigantic.  I tried something like the following fragment,
 but couldn't reproduce any obvious problem with it:
 
 void outofline(void);
 
 static inline unsigned
 rfs_inline(void)
 {
 	unsigned fpsr;
 
 	asm("cfc1 %0,$31" : "=r"(fpsr));
 
 	return fpsr;
 }
 
 __attribute__((noinline)) unsigned
 rfs_noinline(void)
 {
 	unsigned fpsr;
 
 	asm("cfc1 %0,$31" : "=r"(fpsr));
 
 	return fpsr;
 }
 
 double
 foo(double *x, double *y, unsigned *z)
 {
 	outofline();
 	*z = rfs_noinline();
 	return *x + *y;
 }
 

