tech-kern archive


Re: Maxphys on -current?



> On Aug 3, 2023, at 2:19 PM, Brian Buhrow <buhrow%nfbcal.org@localhost> wrote:
> 
> hello.  I know that this has been a very long term project, but I'm wondering about the
> status of this effort?  I note that FreeBSD-13 has a Maxphys value of 1048576 bytes.
> Have we found other ways to get more throughput from ATA disks that obviate the need for this
> setting which I'm not aware of?
> If not, is anyone working on this project?  The wiki page says the project is stalled.

If someone does pick this up, I think it would be a good idea to start from scratch, because MAXPHYS, as it stands, is used for multiple things.  Thankfully, I think it would be relatively straightforward to do the work that I am suggesting incrementally.

Here goes...

MAXPHYS is really supposed to be “maximum transfer via physio”, which is the code path you use when you open /dev/rsd0e and read/write to it.  The user-space pages are wired and mapped into the kernel for the purpose of doing I/O.  MAXPHYS is a per-architecture constant because some systems have different constraints as to how much KVA space can be used for that at any given time.

Unfortunately, some of the adjacent physio machinery (e.g. minphys()) is also used for other purposes, specifically to clamp I/O sizes to constraints defined by the physical device and/or the controllers / busses they’re connected to.
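
To make the overloading concrete, here is a minimal user-space sketch of the pattern, not the actual kernel code.  The names sketch_buf, SKETCH_MAXPHYS, SKETCH_CTLR_MAX_XFER and sketch_ctlr_minphys() are hypothetical stand-ins, and the numeric limits are assumed values chosen only for illustration.  It shows a driver-style hook that applies a hardware limit and then falls through to the generic physio clamp, so one code path ends up enforcing two unrelated constraints:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-architecture physio limit. */
#define SKETCH_MAXPHYS		(64 * 1024)	/* assumed value */

/* Hypothetical limit imposed by a controller's hardware. */
#define SKETCH_CTLR_MAX_XFER	(32 * 1024)	/* assumed value */

/* Simplified stand-in for the kernel's buffer structure. */
struct sketch_buf {
	size_t b_bcount;	/* bytes requested for this transfer */
};

/* Generic clamp: the "maximum transfer via physio" constraint. */
static void
sketch_minphys(struct sketch_buf *bp)
{
	assert(bp != NULL);
	if (bp->b_bcount > SKETCH_MAXPHYS)
		bp->b_bcount = SKETCH_MAXPHYS;
}

/*
 * Driver-style hook: clamps to the controller's hardware limit, then
 * applies the generic physio clamp as well.  Two logically separate
 * constraints are enforced through the same mechanism.
 */
static void
sketch_ctlr_minphys(struct sketch_buf *bp)
{
	assert(bp != NULL);
	if (bp->b_bcount > SKETCH_CTLR_MAX_XFER)
		bp->b_bcount = SKETCH_CTLR_MAX_XFER;
	sketch_minphys(bp);
}
```

With these assumed values, a 1 MiB request passed through sketch_ctlr_minphys() comes out at 32 KiB, and no device could ever exceed SKETCH_MAXPHYS no matter what its hardware supports.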

Logically, these are two totally separate things, and IMO they should be cleanly separated.

What we *should* have is the notion of “I/O parameters” that are defined by the device… max I/O size, max queue depth, preferred I/O size, preferred I/O alignment, physical block size, logical block size, etc.  The base values for these parameters should come from the leaf device (e.g. the disk), and then be clamped as needed by its connective tissue (the controller, the system bus the controller is connected to, and ultimately the platform-specific constraints, e.g. DMA limits).
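
As a rough sketch of what I mean (nothing like this exists in the tree; struct io_params, io_params_clamp() and every field name here are made up for illustration), the leaf device would fill in a parameter block and each layer above it would narrow it.  I am assuming that a zero field in a layer's limits means "no constraint from this layer":

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-device I/O parameters; all names are illustrative. */
struct io_params {
	uint64_t max_io_size;		/* largest single transfer, bytes */
	uint32_t max_queue_depth;	/* outstanding requests allowed */
	uint64_t pref_io_size;		/* preferred transfer size, bytes */
	uint32_t pref_io_align;		/* preferred alignment, bytes */
	uint32_t phys_block_size;	/* physical block size, bytes */
	uint32_t log_block_size;	/* logical block size, bytes */
};

/*
 * Narrow the parameters in *p by the limits of one layer (controller,
 * bus, platform DMA).  A zero field in *limit means that layer imposes
 * no constraint on that parameter.
 */
static void
io_params_clamp(struct io_params *p, const struct io_params *limit)
{
	assert(p != NULL && limit != NULL);
	assert(p->log_block_size != 0);

	if (limit->max_io_size != 0 && limit->max_io_size < p->max_io_size)
		p->max_io_size = limit->max_io_size;
	if (limit->max_queue_depth != 0 &&
	    limit->max_queue_depth < p->max_queue_depth)
		p->max_queue_depth = limit->max_queue_depth;

	/* A stricter (larger) alignment requirement wins. */
	if (limit->pref_io_align > p->pref_io_align)
		p->pref_io_align = limit->pref_io_align;

	/* Keep the maximum a whole number of logical blocks, at least one. */
	p->max_io_size -= p->max_io_size % p->log_block_size;
	if (p->max_io_size < p->log_block_size)
		p->max_io_size = p->log_block_size;

	/* The preferred size can never exceed the maximum. */
	if (p->pref_io_size > p->max_io_size)
		p->pref_io_size = p->max_io_size;
}
```

The disk would supply the initial values, and the controller, bus and platform would each call io_params_clamp() in turn with their own limits.  Note that MAXPHYS does not appear anywhere in this chain.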

Then the interface layers (the page cache / UBC, the traditional block I/O buffer cache, and the physio interface for user-space) can further impose their own constraints, as necessary per their API contract.  There is zero reason that MAXPHYS should impact the maximum I/O that a file system can do via the UBC, for example.

(In a perfect world, we wouldn’t even have to consume virtual address space to bring data in and out of the page cache / UBC, because we already know the physical addresses of the pages that are being pulled in / cleaned.)

> Any thoughts or news would be greatly appreciated.

Anyway, those are mine :-)

-- thorpej


