Re: Automated report: NetBSD-current/i386 test failure

To: Robert Elz <kre%munnari.OZ.AU@localhost>
Subject: Re: Automated report: NetBSD-current/i386 test failure
From: Roy Marples <roy%marples.name@localhost>
Date: Sun, 20 Sep 2020 04:56:04 +0100



On 20/09/2020 04:40, Robert Elz wrote:

     Date:        Sun, 20 Sep 2020 04:02:45 +0100
     From:        Roy Marples <roy%marples.name@localhost>
     Message-ID:  <51d2f8dc-d059-5eae-9899-5c91539d1ac0%marples.name@localhost>

   | The test case just needed fixing.

That is not uncommon after changes elsewhere.

   | The ping to an invalid address caused the ARP entry to enter INCOMPLETE ->
   | WAITDELETE state and this hung over into the next test, causing this entry
   | to take too long to validly resolve.

Why?   If a failed ARP (or ND) causes problems for a later request
(incl of the same addr) which should work (that is, any problems at all,
including delays) then I'd consider the implementation broken (not the test).


RFC 7048 explains that consistent failures back off exponentially.

Because the server is not reset, the backoff may bleed into subsequent
tests for the same address, which is why this test was sometimes failing.
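
As a rough illustration of why a backoff can outlive the test that
triggered it, here is a minimal sketch of an exponential retransmit
schedule. The function name and the parameter values (1 second base,
multiplier of 3, five attempts, 60 second cap) are assumptions chosen for
illustration, not the values the NetBSD ARP code actually uses.

```python
def backoff_delays(base=1.0, multiple=3, attempts=5, cap=60.0):
    """Return the wait before each retry, growing exponentially up to cap.

    All defaults are illustrative assumptions, not NetBSD's real timers.
    """
    return [min(base * multiple ** n, cap) for n in range(attempts)]


delays = backoff_delays()
print(delays)       # [1.0, 3.0, 9.0, 27.0, 60.0]
print(sum(delays))  # 100.0
```

With numbers like these, an entry left over from a deliberately failed
ping could still be waiting out a long retry interval when the next test
tries to resolve the same address.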


   | The solution is after a deliberate fail

And if it wasn't a deliberate fail?  Perhaps being just a fraction of a
second too quick, and attempting a ping (or ssh, or something) just before
the destination becomes reachable (either because it was down, unconfigured,
or the net link between them wasn't functional), and


ATF timings on an emulated environment cannot be that precise.
See PR 43997 for more details.


   | to remove the ARP entry for the address

if the user doing this isn't root, and cannot just remove ARP entries?

Maybe I'm misunderstanding the actual scenario, but it seems to me
that things aren't working as well now as they were before (the timing
in the qemu tests hasn't changed recently - not since the nvmm version
started being used - but before the arp implementation change, it used
to work reliably).

By reliably, do you mean that a successful ARP resolution lasts for 20
minutes, which we don't have any tests for? If anything, the tests we have
are more reliable than before, as I have not adjusted any timings.

Roy
