tech-kern archive


Re: How does PCIe appear to the host?



    Date:        Thu, 3 Oct 2024 01:13:52 -0400 (EDT)
    From:        Mouse <mouse%Rodents-Montreal.ORG@localhost>
    Message-ID:  <202410030513.BAA08646%Stone.Rodents-Montreal.ORG@localhost>

  | One of these machines is an ASRock Q1900M.  It has only two SATA ports
  | onboard; it has two PCIe x1 slots and a PCIe x16 slot.  I just today
  | picked up a 5-port PCIe SATA card and tried it.

I have one of those (though 6-port, I think) - it also requires x4; I have
it in an x8 slot (the motherboard has no x4 slots).   (I have a lot of
drives connected; this isn't because the motherboard is short of SATA ports.)

The first one I had obviously required BIOS (firmware) support to be
enabled - it used to vanish periodically, in the sense that even PCI
config cycles would not find it.  Just as you reported, it simply
appeared not to be there at all.

Playing around with the firmware settings, turning on various legacy
modes (often including getting it into modes that would prevent NetBSD
from booting), would make it appear, though I was never confident that I
had a real handle on which change, or combination of changes, did it.
It took a reset after the config changes (so the BIOS could rescan the
busses) - but I think that's the normal default action on most systems.
After that everything would stay OK, even when I returned the settings to
their "proper" values so NetBSD could boot successfully, until there
was a complete power loss (no power at all to the card), which would
generally put it back into its "invisible" state again.

After that I managed to break the card (physically break it, by pulling
too hard on a SATA cable that was still plugged in, which broke the connector).
It still worked in that state, but the SATA connection was no longer very
stable, so I eventually replaced the card.

The newer one (same model, same brand, but about a year younger) has had
no issue making itself visible, without any weird firmware config changes.

It has even reached the point where I almost trust it enough to disable
my "detect when it has failed" config changes.  I run raidframe mirrors
with one drive on the motherboard controller and one on the add-in, so
that if one of them should fail, the other can keep things running.  But
when the controller vanished, if the system booted, it would merely see
failed raid components and continue (once, early on, I didn't even notice
for a few days that the add-in controller had vanished!).  That should be
harmless, but my raid mirrors are 12TB or bigger, and every time this
happened - which wasn't all that rare: power glitches aren't uncommon
here, were worse back then than they are now, and I didn't yet have a
UPS - the raids would all need a reconstruct, which for the biggest would
take more than 24 hours (with no guarantee it would complete without
another interruption).  So, I disabled raidframe autoconf (I never needed
it for booting; I boot from manually mirrored filesystems, so if I mess
up an update to one, I can boot from the other and fix things, rather
than having to get involved in CD/USB booting).  Then I installed an rc.d
script, set to run very early, before raidframe would run, which would
just look for the presence of the drives on the add-in controller and
abort the boot if none were present.  That worked well, and told me when
I needed to play with the firmware to make the controller come back again.
None of that has been needed since the controller was replaced.
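
In case it is useful to anyone setting up something similar: the check the
rc.d script does is trivial - just try to open the raw devices for the
drives that should be on the add-in controller, and treat failure as "the
controller has vanished again".  The real thing is a few lines of shell,
but here is a minimal sketch of the same test in C (the device names are
only examples - substitute whatever units your drives normally attach as;
the assumption that the raw whole-disk partition is 'd' holds on amd64/i386):

/*
 * Sketch: are the drives on the add-in controller present?
 * wd4/wd5 are example unit names only - use whatever the drives on
 * the add-in controller normally attach as.  Exits 0 if at least one
 * can be opened, 1 otherwise; an early rc.d script can abort the boot
 * on a non-zero exit.
 */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int
main(void)
{
	/* 'd' is the raw whole-disk partition on amd64/i386 */
	const char *drives[] = { "/dev/rwd4d", "/dev/rwd5d" };
	size_t i, found = 0;

	for (i = 0; i < sizeof(drives) / sizeof(drives[0]); i++) {
		int fd = open(drives[i], O_RDONLY);
		if (fd >= 0) {
			close(fd);
			found++;
		} else
			fprintf(stderr, "%s: not present\n", drives[i]);
	}
	return found ? 0 : 1;
}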

All this boils down to: the controller is probably not dead (though of
course it might be), and playing with firmware settings, as Brian suggested,
might work; just hope that your BIOS has the right kind of setting to put
the card into a state where it does complete hardware initialisation (which
I believe used to be needed for ancient WinTrash, so most older BIOSes
should be able to do that, one way or another).

Once visible, it should show up in NetBSD autoconfig, with a very high
probability that it will be recognised as a SATA controller and "just work",
though if you're doing this with a very old NetBSD kernel, it is possible
that the chipset isn't supported.

But note that on mine, even when the card is visible to NetBSD, the
firmware shows no signs of recognising it at all, at least not in its
visible interface.   I recently discovered that it apparently does know
there are drives out there, as it can locate them, and boot from them.
I recently swapped two drives between the controllers (largely because
I thought I needed to, as I wanted to be able to boot from the one
that had been on the add-in) - only to find that the one now moved to
the add-in controller (which has an ESP partition, and could be booted
from previously) still appears in the boot menu (EFI boot), but with
no indication there of which drive it is booting from, whereas the boot
entries for the drives on the motherboard controller do show that.
That's how I tell the boot images apart, as NetBSD doesn't yet
(hopefully soon, perhaps) have the ability to name the firmware boot
choices the way FreeBSD, Linux, and WinTrash can: my firmware just
shows the ones it finds by itself as "UEFI Boot" with the drive name
(in firmware notation) appended - except for the ones from that add-in
controller (if I had two bootable drives out there, I'd be in trouble!).

  | The reason I'm asking about PCIe is that, as far as I can tell from the
  | host, it isn't there at all.

Yes, that's exactly what I would see when it was in its "hidden" (probably
not enabled by the firmware) mode.

  | I note a possible conflict between the "x1" and the presence of a x16
  | slot; that 1 is coming from the PCIE_LCAP_MAX_WIDTH bits in PCIE_LCAP,
  | which makes me wonder whether something needs configuring to run the
  | x16 slot at more than x1.

That is likely to depend upon what (if anything) is plugged in, and enabled.
The firmware should set it up; the negotiated width generally depends upon
what cards are connected and how many lanes each claims to require.
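
If you want to check what was actually negotiated, rather than just what
the slot/card advertises: the maximum width is bits 9:4 of the Link
Capabilities register (the PCIE_LCAP_MAX_WIDTH field you were looking at),
and the width actually negotiated is bits 25:20 of the Link Control/Status
dword at offset 0x10 in the same PCIe capability (I believe pcireg.h calls
that PCIE_LCSR_NLW, but check).  A trivial sketch that decodes the two
values - feed it the register contents, e.g. copied out of "pcictl pciN
dump" output:

/*
 * Decode PCIe link widths.  Bit positions are from the PCIe spec:
 * maximum width is Link Capabilities (capability offset 0x0c) bits
 * 9:4, negotiated width is bits 25:20 of the dword at offset 0x10
 * (i.e. Link Status bits 9:4).  Takes the two 32-bit values in hex.
 */
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

int
main(int argc, char **argv)
{
	uint32_t lcap, lcsr;

	if (argc != 3) {
		fprintf(stderr, "usage: %s lcap lcsr (hex)\n", argv[0]);
		return 1;
	}
	lcap = (uint32_t)strtoul(argv[1], NULL, 16);
	lcsr = (uint32_t)strtoul(argv[2], NULL, 16);

	printf("maximum link width:    x%u\n", (lcap >> 4) & 0x3f);
	printf("negotiated link width: x%u\n", (lcsr >> 20) & 0x3f);
	return 0;
}

(The negotiated width only means anything once there is a card in the
slot and the link has trained, of course.)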

But again, "broken" is also a possibility.

  | so if the x16 slot is running x1 (is that even possible?)

It is; you can always run any lower number of lanes than the
slot supports (at least in powers of two).   Some (usually high
end) motherboards (and BIOSes) even support bifurcating the slot, so
that it appears as several narrower ones.

kre


