Port-arm archive


Re: ARM GPUs



On Thu, Jul 18, 2024 at 09:50:43AM +0100, Robert Swindells wrote:
> 
> I have been working on trying to get accelerated graphics to work for
> ARM Ltd GPUs in NetBSD.
> 
> There are separate Linux and Mesa codebases for each of the Mali-400
> (lima) and Mali-Txxx (panfrost) GPUs.
> 
> Have run out of things to try and am not getting any response from core
> people either with comments on diffs or with any discussion on how to
> start getting this into the tree so that I can get more eyes looking at
> it.
> 
> Lima on a Pinebook seems to be doing something, I get FPS measurement
> output from the test app that I'm using, but the screen on my Pinebook
> is broken so I can't tell if it is displaying anything.
> 
> Panfrost on a Pinebook Pro doesn't do anything, the first shader sent to
> the GPU just times out. The main difference to how this runs on Linux is
> the MMU setup code, I can't see anything wrong with this.

I'm still trying to untangle things related to GPUs, and I have
started with the "easy" part: userland (Xorg), in order to set aside
or correct the problems coming from that side. Yesterday I had a
crash on a git merge command while firefox was running, with a
10.99.10 kernel on amd64, and files were corrupted. I'm almost
certain this comes from firefox running, i.e. from problems with the
handling of memory (the question being: is it a userland problem with
futexes, since NetBSD has an implementation but it is unused, so the
code takes a different path on NetBSD than on other systems; or a
problem with kernel memory handling?).
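
To make the futex point concrete, here is a minimal sketch (not
NetBSD's actual libpthread internals, just an illustration of the
divergence): on Linux a program can block directly on a shared word
with futex(2), while portable code ends up going through pthread
primitives on NetBSD, so the same source exercises different kernel
paths on the two systems.

	/*
	 * Sketch only: the waiter side of a simple "wait until the flag
	 * changes" primitive.  The Linux branch uses the raw futex(2)
	 * syscall; the fallback is what portable code does on NetBSD
	 * today, a mutex/condvar pair.  A waker has to call
	 * futex(FUTEX_WAKE) resp. pthread_cond_broadcast().
	 */
	#include <pthread.h>
	#include <stdatomic.h>

	#ifdef __linux__
	#include <linux/futex.h>
	#include <sys/syscall.h>
	#include <unistd.h>

	static void
	wait_while(atomic_int *word, int val)
	{
		/* Blocks in the kernel as long as *word == val. */
		while (atomic_load(word) == val)
			syscall(SYS_futex, word, FUTEX_WAIT, val, NULL, NULL, 0);
	}
	#else
	static pthread_mutex_t mtx = PTHREAD_MUTEX_INITIALIZER;
	static pthread_cond_t cv = PTHREAD_COND_INITIALIZER;

	static void
	wait_while(atomic_int *word, int val)
	{
		pthread_mutex_lock(&mtx);
		while (atomic_load(word) == val)
			pthread_cond_wait(&cv, &mtx);
		pthread_mutex_unlock(&mtx);
	}
	#endif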

But the "Xorg" code is not anymore "Xorg" since even in userland there
are distinct groups, and the userland is composed of diverse parts,
some being almost orphaned.

Then comes the problem of the kernel side and, behind it, the huge
amount of things that are simply not documented about the GPUs. Even this:

	General-Purpose Graphics Processor Architectures, by Tor M. Aamodt,
	Wilson Wai Lun Fung and Timothy G. Rogers, Springer, Synthesis
	Lectures on Computer Architecture

says that, in order to give some idea of what is going on, the
authors had to read patents to complement the official documentation,
and that even this does not give a complete picture.

For GPUs, there are two distinct species: the discrete GPU (a GPU
system on a card, with its own memory) and the integrated CPU and
GPU, sharing cache and memory.

Since the rush for GPUs nowadays is driven not only by "images" but
by cryptocurrency mining and A.I., the whole trend is a collection of
attempts, hurriedly adding things to what already exists and relying
on what exists today, which probably will not stand the test of
time[*] and is thus already a cemetery. So the first question to
answer is: what conservative direction can be taken?

FWIW, for the moment my view (I hope others will share theirs so that
we can compare diverging views) is:

1) It was a mistake on my part to start with the userland. The job
has to be done, but it can be done in parallel with tackling the
problem on the kernel side;

2) The GPUs should be considered auxiliary specialized processors,
and the first thing to do is not to drive them, but to detect them
and to decide whether they will be driven (see the first sketch after
this list);

3) GPUs should not be considered as merely giving access to a clock
and a display: rendering should be treated as separate, the case of
such a device (clock + display, in whatever flavor, including simply
lighting or unlighting a single LED as a "morse" console) being
accessed via a GPU SoC being a special case; and a clock and a
framebuffer should, likewise, be separated from whatever display is
driven to render the framebuffer with whatever timing. All in all, it
should be possible to render on something that is not even a raster
but a vectorial "display", like the plotters of old times;

4) The first GPUs to support are the "integrated" models, since the
MMU is simply at the core of a Unix-like system and a kernel
implements a policy of access to resources; a discrete GPU looks more
like a tightly coupled distinct node, with a dedicated "system" of
its own. Is it possible to "integrate" a discrete GPU correctly if
there are no specific system calls ensuring that some shared
dedicated code, via the MMU, implements a correct policy of access to
the resources?

5) For instructions sent to the GPU, there are already competing
programming models and libraries that allow formulating "generic"
instructions for various architectures. How could these be used on
the kernel side, so that the kernel needs as little knowledge as
possible about the internals of the GPU? The kernel would drive the
GPU, managing the allocation of resources, but without knowing
exactly what the GPU does (see the second sketch after this list).
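
To illustrate 2): on FDT based ARM systems, detection alone is cheap.
A skeleton driver can match the GPU node by its "compatible" string
at autoconf time and merely record that the device is there, without
programming it at all. The following is only a sketch (the "gpustub"
name is made up, and the compatible strings are just the generic Mali
ones), not a proposed driver:

	/*
	 * Skeleton only: match a Mali node from the device tree and
	 * attach a placeholder that does not touch the hardware.
	 */
	#include <sys/param.h>
	#include <sys/device.h>

	#include <dev/fdt/fdtvar.h>
	#include <dev/ofw/openfirm.h>

	struct gpustub_softc {
		device_t	sc_dev;
		int		sc_phandle;
	};

	static const struct device_compatible_entry compat_data[] = {
		{ .compat = "arm,mali-400" },	/* lima class */
		{ .compat = "arm,mali-t860" },	/* panfrost class */
		DEVICE_COMPAT_EOL
	};

	static int
	gpustub_match(device_t parent, cfdata_t cf, void *aux)
	{
		struct fdt_attach_args * const faa = aux;

		return of_compatible_match(faa->faa_phandle, compat_data);
	}

	static void
	gpustub_attach(device_t parent, device_t self, void *aux)
	{
		struct gpustub_softc * const sc = device_private(self);
		struct fdt_attach_args * const faa = aux;

		sc->sc_dev = self;
		sc->sc_phandle = faa->faa_phandle;

		aprint_naive("\n");
		aprint_normal(": GPU detected, not driven\n");
		/* The decision whether (and how) to drive it goes here. */
	}

	CFATTACH_DECL_NEW(gpustub, sizeof(struct gpustub_softc),
	    gpustub_match, gpustub_attach, NULL, NULL);

Even something this minimal would make the device show up in dmesg
and give a hook for the decision in 2).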
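
And for 4) and 5), the kind of split I have in mind, reduced to its
simplest expression: the kernel allocates buffers and maps them in
the GPU's MMU, enforcing who may touch what, and passes command
streams through as opaque blobs whose contents only userland (Mesa
and friends) understands. All the names below are hypothetical; this
is not an existing NetBSD (or DRM) interface, just a sketch of where
the boundary would sit:

	/*
	 * Hypothetical interface sketch: the kernel knows about sizes,
	 * mappings and completion, never about the GPU's instruction set.
	 */
	#include <sys/types.h>
	#include <sys/ioccom.h>

	struct gpu_alloc_req {
		size_t		size;		/* bytes requested */
		uint32_t	flags;		/* e.g. CPU-writable, GPU-readable */
		uint64_t	handle;		/* out: kernel-chosen handle */
		uint64_t	gpu_va;		/* out: address in the GPU's MMU */
	};

	struct gpu_submit_req {
		uint64_t	cmds_handle;	/* buffer holding the command stream */
		size_t		cmds_len;	/* kernel checks bounds, not contents */
		uint64_t	fence;		/* out: completion token to wait on */
	};

	#define GPU_IOC_ALLOC	_IOWR('G', 1, struct gpu_alloc_req)
	#define GPU_IOC_SUBMIT	_IOWR('G', 2, struct gpu_submit_req)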

So I'm also going to look at the kernel side, for the ARM based SoCs
I have, starting with 2) above. No timeline is set, since I already
have far too much on my stack, but I have no choice but to get to it
eventually: I have the feeling that everything is about to fall to
pieces, including in the software area, and that I can rely on fewer
and fewer external things...

FWIW. Still wondering about this, when I have time...

*: GPUs use SIMD; it is perhaps not the best approach, and driving
specialized auxiliary processors with vector instructions, the RISC-V
way, is perhaps more sustainable.
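
As an illustration of the difference (a sketch assuming the standard
RISC-V vector intrinsics from <riscv_vector.h>, buildable with e.g.
-march=rv64gcv): the loop below never hard-codes a vector width, it
asks the hardware at each iteration how many elements it can take,
whereas a SIMD version bakes 128 or 256 bits into the binary.

	#include <riscv_vector.h>
	#include <stddef.h>

	/* dst[i] = a[i] + b[i], vector-length agnostic. */
	void
	vadd(float *dst, const float *a, const float *b, size_t n)
	{
		while (n > 0) {
			size_t vl = __riscv_vsetvl_e32m1(n);	/* hw picks the chunk */
			vfloat32m1_t va = __riscv_vle32_v_f32m1(a, vl);
			vfloat32m1_t vb = __riscv_vle32_v_f32m1(b, vl);
			__riscv_vse32_v_f32m1(dst,
			    __riscv_vfadd_vv_f32m1(va, vb, vl), vl);
			a += vl; b += vl; dst += vl; n -= vl;
		}
	}
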
--
        Thierry Laronde <tlaronde +AT+ kergis +dot+ com>
                     http://www.kergis.com/
                    http://kertex.kergis.com/
                     http://nunc-et-hic.fr/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C

