Re: I/O bus reset to fix CMD MSCP controllers (and probably others)

To: port-vax%netbsd.org@localhost
Subject: Re: I/O bus reset to fix CMD MSCP controllers (and probably others)
From: Johnny Billquist <bqt%softjar.se@localhost>
Date: Sat, 24 May 2025 13:09:53 +0200

I think the proper solution is to reset the controller in boot.

If we are playing around with the mapping, we certainly might be messing up controllers, and that could possibly be an issue even if we later restore mappings back as they were. So at worst that just leaves the problem in place, but possibly makes it even more obscure.

Interesting find. And something I never even thought about. But I can certainly see how this could be an issue with some firmware. It probably gets completely stuck in some loop trying to access memory.


  Johnny

On 2025-05-23 21:39, Hans Rosenfeld wrote:

Hi,

I was able to root-cause the issue. It was introduced in r1.11 of
uba_mainbus.c when scanning for Qbus/Unibus memories was added.

When NetBSD is booted from an MSCP controller, the boot loader sets up
the Qbus map to provide the controller with a small command/response
ring in memory to be used for I/O. Once the kernel is loaded and uba(4)
is attaching, the Qbus map is cleared while scanning for memories. It
appears that sudden loss of access to the command/response ring causes
the firmware of these CMD controllers to drop dead.

As a result, these controllers don't react in a reasonable time (100ms
by the spec) when their IP register is written to re-initialize the
controller. Even though uda(4)'s udamatch() waits up to 10s for a sign
of life from the controller, that's usually not enough for it to wake
up and it is assumed to be absent. Which, of course, causes the kernel
to fail booting as the boot device can't be found.
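
To illustrate the failure mode, here is a minimal sketch of a bounded
wait for the controller's first sign of life. It is not the uda(4)
code: the fake controller, the poll budget standing in for the 10s
wait, and the SA_STEP1 bit value are all assumptions made for the
example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SA_STEP1 0x0800u  /* assumed "step 1" bit in the SA register */

/* Hypothetical fake controller: reports step 1 only after some polls. */
struct fake_ctlr {
    unsigned polls_until_ready;  /* polls needed before it answers */
    unsigned polls_seen;         /* polls issued so far */
};

/* Stand-in for reading the SA register of a real controller. */
static uint16_t fake_read_sa(struct fake_ctlr *c)
{
    assert(c != NULL);
    c->polls_seen++;
    return (c->polls_seen >= c->polls_until_ready) ? SA_STEP1 : 0;
}

/*
 * Poll for step 1 at most max_polls times. Returns true if the
 * controller answered in time, false if it has to be treated as absent.
 */
static bool wait_for_step1(struct fake_ctlr *c, unsigned max_polls)
{
    for (unsigned i = 0; i < max_polls; i++) {
        if (fake_read_sa(c) & SA_STEP1)
            return true;
        /* A real driver would delay here between polls. */
    }
    return false;
}
```

A controller that needs more polls than the budget allows is reported
as absent, which is what happens to the boot device here.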

This needs to be addressed both in the kernel and the bootloader.

We can work around this issue in the kernel by restoring the Qbus map
registers to their original values after we've used them for detecting
Qbus memories. The code to do that is already there but commented out;
it just needs uncommenting.
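
The idea of the workaround, as a minimal sketch rather than the actual
uba_mainbus.c code: save the map registers, run the scan that clobbers
them, then write the saved values back. The array standing in for the
hardware map, its size, and all function names are assumptions made
for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NMAPREGS 8192  /* assumed number of Qbus map registers */

/* Hypothetical stand-in for the hardware Qbus map registers. */
static uint32_t qbus_map[NMAPREGS];

/* Stand-in for the memory scan; like the real one, it clears the map. */
static void scan_for_memories(uint32_t *map, size_t n)
{
    for (size_t i = 0; i < n; i++)
        map[i] = 0;
}

/* Run the scan, then restore the map to what the boot loader set up. */
static void scan_preserving_map(uint32_t *map, size_t n)
{
    static uint32_t saved[NMAPREGS];

    assert(map != NULL && n <= NMAPREGS);
    memcpy(saved, map, n * sizeof(*map));
    scan_for_memories(map, n);
    memcpy(map, saved, n * sizeof(*map));
}
```

Note that the map is still invalid for the duration of the scan, so a
controller that touches its ring in that window can still be upset.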

To fix this properly, the standalone ra.c driver in the boot loader
should provide a close() entry point which clears the IP register to
issue a controller reset. Thus when the kernel is loaded, the firmware
of the MSCP controller is again in reset state and can't get confused
anymore by the initialization of the Qbus map in uba(4).
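
Roughly, such a close() entry point could look like the sketch below.
This is not the attached patch: the register layout, the softc
structure and the function name are assumptions made for the example,
and the "registers" are ordinary memory so that the code can be run
anywhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed register layout: IP followed by SA, 16 bits each. */
struct mscp_regs {
    volatile uint16_t ip;  /* initialize and poll */
    volatile uint16_t sa;  /* status and address */
};

/* Hypothetical per-device state of the standalone driver. */
struct ra_softc {
    struct mscp_regs *regs;
    int open;
};

/*
 * Sketch of a close() entry point: writing the IP register resets the
 * controller, so it no longer refers to the boot loader's
 * command/response ring once the kernel takes over.
 */
static int ra_close_sketch(struct ra_softc *sc)
{
    assert(sc != NULL && sc->regs != NULL);
    sc->regs->ip = 0;  /* write to IP starts controller initialization */
    sc->open = 0;
    return 0;
}
```

The boot loader would have to call this after the last read from the
device and before jumping into the kernel.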


In case anyone wants to review them, I've attached two patches, one with
a kernel workaround and another with a proper fix for the boot loader. I
intend to commit them some time next week and request pullups into -10.


Hans


--
Johnny Billquist                  || "I'm on a bus
                                  ||  on a psychedelic trip
email: bqt%softjar.se@localhost             ||  Reading murder books
pdp is alive!                     ||  tryin' to stay hip" - B. Idol
