Re: Pausing/resuming CPU's in DDB

To: port-sparc64%netbsd.org@localhost
Subject: Re: Pausing/resuming CPU's in DDB
From: "Volkmar Seifert" <vs%nifelheim.info@localhost>
Date: Fri, 23 Mar 2012 08:59:55 +0100

>> How about increase the retry count in sparc64_send_ipi() ?
>>
>> Ours is now 1000, but FreeBSD is 5000, OpenBSD is 10000.
>
> Do both, or scale the retry count on cpu speed ...

We already tried various much higher values back when I posted about my
problem:

"/netbsd: panic: cpu0: ipi_send: couldn't send ipi to UPAID2"

(that's the subject of the emails and the error occurring on my machine)

As of today, the machine is as unstable as ever, even though raising the
count looked like a good workaround at first.
I can't even log in to X without having to fear the above-mentioned
kernel panic.

Increasing the value has, IMHO, no real effect other than delaying the
panic by a few more loop iterations.
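
To illustrate what I mean, here is a rough userland sketch in C. It is not
the real sparc64_send_ipi() code; fake_target, fake_dispatch() and
send_with_retries() are names I made up, and the "hardware" is just a
counter. The only assumption is that the send is retried up to a fixed
limit and the kernel panics once the limit is exhausted.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for the target CPU: it refuses the IPI for
 * busy_polls more attempts and accepts it after that.
 */
struct fake_target {
	unsigned long busy_polls;
};

/* One dispatch attempt; returns false while the target is still busy. */
static bool
fake_dispatch(struct fake_target *t)
{
	if (t->busy_polls > 0) {
		t->busy_polls--;
		return false;
	}
	return true;
}

/*
 * Bounded retry loop. Returns the number of attempts that were needed,
 * or 0 if the limit ran out (the point where the kernel would panic).
 */
static unsigned long
send_with_retries(struct fake_target *t, unsigned long max_retries)
{
	unsigned long i;

	for (i = 1; i <= max_retries; i++) {
		if (fake_dispatch(t))
			return i;
	}
	return 0;
}
```

With a target that is only busy briefly (say busy_polls = 1500), a limit of
1000 fails and a limit of 5000 succeeds, so a higher count does help there.
With a target that stays busy far longer than any limit we tried, every
limit ends in the same failure, just a little later. That second case is
what my machine seems to be hitting.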

> If it only fails for ddb enter/exit the loop is fine, IMHO - but as we
> have seen other reports of ipi sending failure during normal operation, we
> should add the instrumentation Eduardo suggested and find out where we are
> blocked out that long (but this is mostly orthogonal to the topic at
> hand).

I'd be happy to provide a testing ground, as my machine very reliably
crashes with this panic in this routine, no matter how high the count.
Sadly, my own knowledge in this particular field is rather limited, so I
don't really dare to try anything here on my own...the chances that I
break something are way too high :)

But I can apply patches any time, compile them and try them out. Currently
I have the netbsd-5 and netbsd-6 branches checked out and ready for playing
on my systems, so if there's a patch and you want me to test it...just
send me an email with the branch to apply it to. (Though netbsd-6 would
require an upgrade of the whole system on my U60, which might fail due to
that panic.)


- Volkmar

-- 
http://blog.nifelheim.info/tech

References:
- Pausing/resuming CPU's in DDB
  - From: Julian Coleman
- Re: Pausing/resuming CPU's in DDB
  - From: Takeshi Nakayama
- Re: Pausing/resuming CPU's in DDB
  - From: Martin Husemann
