NetBSD-Bugs archive


Re: port-arm/55598: ChaCha self-test sometimes fails on evbarm-earmv7hf testbed



The following reply was made to PR port-arm/55598; it has been noted by GNATS.

From: Taylor R Campbell <campbell%mumble.net@localhost>
To: Andreas Gustafsson <gson%gson.org@localhost>
Cc: gnats-bugs%netbsd.org@localhost
Subject: Re: port-arm/55598: ChaCha self-test sometimes fails on evbarm-earmv7hf testbed
Date: Sat, 22 Aug 2020 23:35:33 +0000

 > Date: Sat, 22 Aug 2020 22:50:41 +0300
 > From: Andreas Gustafsson <gson%gson.org@localhost>
 >
 > It's probably more useful to consider the full set of outcomes from
 > the last 25 test runs (these include all the failures).  A zero
 > in the rightmost column means success, nonzero means failure:
 >
 >   lyta /bracket/evbarm-earmv7hf/results $ zgrep -c 'chacha: self-test failed' 2020/*/test.log.gz | tail -25
 >   2020/2020.08.02.01.36.46/test.log.gz:0
 >   [...]
 
 Cool, thanks.
 
 > >  (There have been some changes to sys/crypto/chacha over the time
 > >  period covered by those three dates.)  Does tests/sys/crypto/chacha
 > >  fail randomly if repeated?
 >
 > That would have to be tested separately, but I'm not sure I see the
 > point since the reported issue is not the failure of those tests but
 > of the kernel's built in self test, and we already know the tests in
 > [tests/sys/crypto/chacha] passed in all 25 of the above runs.
 
 tests/sys/crypto/chacha runs the kernel self-test code in userland.
 
 There are some small differences:
 
 - The kernel is built with -mfloat-abi=soft (and the ChaCha code with
   -mfloat-abi=softfp), whereas in earmv7hf the userland is built with
   -mfloat-abi=hard.
 
   However, this shouldn't make much of a difference for the ChaCha
   code, because the observed differences are in the first few blocks
   of output, which are produced without any vector parameter-passing,
   and vector parameter-passing is the only way that -mfloat-abi=hard
   and -mfloat-abi=softfp differ.
 
 - The kernel turns the fpu on and off around the crypto code, so the
   fpu state management is slightly different from userland.
 
 - The kernel code is built with a kludgey arm_neon.h NEON intrinsics
   header file (sys/crypto/chacha/arch/arm/arm_neon.h) while the
   userland code is built with the compiler's native arm_neon.h.
 
   However, this also shouldn't make much of a difference because the
   first few blocks are generated by a hand-written assembly routine
   rather than NEON intrinsics in C.
 
 So repeatedly running the userland t_chacha test may help to narrow
 down where the problem lies: with these differences, if (say) 70 runs
 all succeed (if the failure rate is 5/26, the probability of no
 failures in 70 trials is below one in a million); or with the ChaCha
 code itself, if the failure happens in userland too.
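
 The arithmetic behind that estimate can be checked with a short
 sketch.  It assumes the runs are independent and that each one fails
 with the same probability 5/26, which is an idealization; the function
 names are made up for illustration.

```python
from fractions import Fraction


def prob_no_failures(failure_rate, trials):
    """Probability that `trials` independent runs all succeed."""
    return (1 - failure_rate) ** trials


def trials_needed(failure_rate, threshold):
    """Smallest number of runs whose all-success probability is below `threshold`."""
    n = 0
    while prob_no_failures(failure_rate, n) >= threshold:
        n += 1
    return n


rate = Fraction(5, 26)            # failure rate quoted in the text
one_in_a_million = Fraction(1, 10**6)

# Exact rational arithmetic avoids any floating-point doubt near the threshold.
print(float(prob_no_failures(rate, 70)))
print(prob_no_failures(rate, 70) < one_in_a_million)
print(trials_needed(rate, one_in_a_million))
```

 Under these assumptions, 70 clean runs would be strong evidence that
 the userland build does not share the failure.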
 
 Separately, I could add a path -- probably a sysctl knob -- by which
 to re-run the self-tests in the kernel without having to reboot.
 Would that be convenient for you to try via patch, or what would be
 the most convenient way to test this?
 
 > If you think testing this will actually yield some useful
 > information, which version would you like me to repeatedly test?
 
 What is currently in HEAD should be fine, since the last change to the
 ChaCha code and the fpu state management code was before the
 2020.08.19.22.47.09 and 2020.08.20.11.09.56 test runs that failed.  If
 you can conveniently run from those, though, that wouldn't hurt.
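
 For the repeated userland runs discussed above, a minimal driver
 might look like the following sketch.  It only assumes that the test
 can be started as a command that exits nonzero on failure; the
 function name is hypothetical, and the actual command line for
 t_chacha on the testbed is not given here, so it is left as a
 parameter.

```python
import subprocess
import sys


def count_failures(cmd, runs):
    """Run `cmd` (an argv list) `runs` times; return how many runs exited nonzero."""
    failures = 0
    for i in range(runs):
        # Discard the test's own output; only the exit status matters here.
        result = subprocess.run(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        if result.returncode != 0:
            failures += 1
            # Report which iteration failed, so flaky runs can be located.
            print(f"run {i + 1}: exit status {result.returncode}", file=sys.stderr)
    return failures
```

 Calling count_failures(cmd, 70), with cmd set to whatever command
 line runs the userland test, gives the number of failing runs out
 of 70.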
 

