NetBSD-Bugs archive


Re: port-vax/55415: vax no longer preempts in a timely fashion



On 6/26/20 12:55 AM, Anders Magnusson wrote:
The following reply was made to PR port-vax/55415; it has been noted by GNATS.

From: Anders Magnusson <ragge%tethuvudet.se@localhost>
To: gnats-bugs%netbsd.org@localhost, port-vax-maintainer%netbsd.org@localhost,
  gnats-admin%netbsd.org@localhost, netbsd-bugs%netbsd.org@localhost, oster%netbsd.org@localhost
Cc:
Subject: Re: port-vax/55415: vax no longer preempts in a timely fashion
Date: Fri, 26 Jun 2020 08:54:15 +0200

  Den 2020-06-25 kl. 17:00, skrev Greg Oster:
  > The following reply was made to PR port-vax/55415; it has been noted by GNATS.
  >
  > From: Greg Oster <oster%netbsd.org@localhost>
  > To: gnats-bugs%netbsd.org@localhost
  > Cc:
  > Subject: Re: port-vax/55415: vax no longer preempts in a timely fashion
  > Date: Thu, 25 Jun 2020 08:55:50 -0600
  >
  >   On 6/25/20 3:15 AM, Anders Magnusson wrote:
  >   > The following reply was made to PR port-vax/55415; it has been noted by GNATS.
  >   >
  >   > From: Anders Magnusson <ragge%tethuvudet.se@localhost>
  >   > To: gnats-bugs%netbsd.org@localhost
  >   > Cc:
  >   > Subject: Re: port-vax/55415: vax no longer preempts in a timely fashion
  >   > Date: Thu, 25 Jun 2020 11:10:42 +0200
  >   >
  >   >   Will it work if you only restore the removed line in cpu.h?
  >
  >   Yes, yes it does!  So it's just one line that needs to be restored to
  >   get things working properly.
  >
  Great!
  The other missing line should not be needed as I understand the code in
  sched_resched_cpu().
  ci_want_resched should always be set already when cpu_need_resched() is
  called.
  I'll try to fire up my 4000/90 this weekend and see if I can find this bug.

I've done a bit more debugging... What I'm seeing is that in kern_runq.c:sched_resched_cpu() the call to cpu_need_resched(ci, l, f) happens, and cpu_need_resched() sets up the AST. But it's only once in a while that the trap with the AST fires, userret() gets called, and preemption happens! Sometimes the trap with the AST fires once and not again; sometimes it fires 5 times in a row and then misses. I don't know why an AST that has been posted would subsequently get missed.

So it's able to hit a situation where cpu_need_resched() is called, but the corresponding AST never fires. The loop in sched_resched_cpu() that sets ci->ci_want_resched keeps thinking (correctly!) that the AST has already been set up, and so doesn't try to call cpu_need_resched() again. When it gets 'stuck' like this, we never see an AST (or preemption) until the process completes.
That seems to be because if I check the AST status with:

 if (mfpr(PR_ASTLVL) != AST_OK)

that condition is always true... (meaning the AST is not set up...)
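
To make the 'stuck' state concrete, here is a minimal toy model in plain C. It is only a sketch and not the kernel code: toy_cpu, toy_resched(), toy_need_resched(), toy_ast_trap() and toy_lose_ast() are hypothetical stand-ins for ci, sched_resched_cpu(), cpu_need_resched(), the AST trap/userret() path, and whatever (still unknown) event makes the AST go away. The numeric AST_OK/AST_NO values are placeholders for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder values for illustration only (assumption, not taken
 * from the kernel headers). */
enum { TOY_AST_OK = 3, TOY_AST_NO = 4 };

struct toy_cpu {
    int  astlvl;        /* models the PR_ASTLVL register */
    bool want_resched;  /* models ci->ci_want_resched */
    int  posts;         /* how many times the AST was posted */
};

struct toy_result {
    int preempts;       /* how many times the AST trap delivered */
    int posts;          /* how many times the AST was posted */
    int stuck;          /* 1 if resched is wanted but no AST is posted */
};

/* Models cpu_need_resched(): post the AST. */
static void toy_need_resched(struct toy_cpu *ci)
{
    ci->astlvl = TOY_AST_OK;
    ci->posts++;
}

/* Models the dedup in sched_resched_cpu(): if a resched has already
 * been requested, assume the AST is posted and do nothing. */
static void toy_resched(struct toy_cpu *ci)
{
    if (ci->want_resched)
        return;
    ci->want_resched = true;
    toy_need_resched(ci);
}

/* Models the AST trap: deliver only if an AST is posted, then clear
 * both the AST and the request flag. Returns true on preemption. */
static bool toy_ast_trap(struct toy_cpu *ci)
{
    if (ci->astlvl != TOY_AST_OK)
        return false;
    ci->astlvl = TOY_AST_NO;
    ci->want_resched = false;
    return true;
}

/* Models the unexplained event: the AST vanishes without delivery. */
static void toy_lose_ast(struct toy_cpu *ci)
{
    ci->astlvl = TOY_AST_NO;
}

/* One normal post/deliver cycle, then one lost AST, then five more
 * resched attempts that can never post again. */
static struct toy_result toy_demo(void)
{
    struct toy_cpu ci = { TOY_AST_NO, false, 0 };
    struct toy_result r = { 0, 0, 0 };
    int i;

    toy_resched(&ci);
    r.preempts += toy_ast_trap(&ci);   /* normal case: delivered */

    toy_resched(&ci);
    toy_lose_ast(&ci);                 /* AST disappears */

    for (i = 0; i < 5; i++) {
        toy_resched(&ci);              /* flag still set: no re-post */
        r.preempts += toy_ast_trap(&ci);
    }

    r.posts = ci.posts;
    r.stuck = (ci.want_resched && ci.astlvl != TOY_AST_OK);
    return r;
}
```

In this model, calling toy_demo() ends with the flag still set and no AST posted, so nothing ever re-arms it. That matches the symptom above, but it says nothing about why the real AST gets lost.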

Any ideas on how an AST can just 'disappear'? (I'm using the same mfpr() check right after the mtpr() setting of PR_ASTLVL, and there it thinks it's set just fine... so how does it go missing a few moments later?)

Later...

Greg Oster

