Subject: VM system with ARM32 port
To: None <tech-kern@NetBSD.ORG>
From: Neil A. Carson <neil@causality.com>
List: tech-kern
Date: 10/16/1997 11:08:42
Greetings!

Some notes, thoughts, queries, etc... Sorry if this lot isn't totally
technically coherent...

The present VM system in NetBSD doesn't work very efficiently on the
StrongARM. This is because the instruction and data caches are virtual
(indexed and tagged by virtual rather than physical address), and also
because when the instruction cache is cleaned, it must be cleaned in its
entirety.

The performance problems with StrongARM/NetBSD/arm32 make themselves
most apparent when running shell scripts, or in general forking/execing
of binaries, and also during buffer-oriented IO because of the use of
pagemove. Basically, in (for example) pmap_enter, if the page in
question ever had a chance of containing instructions, the whole I-cache
has to be invalidated. Clearly this slows things down a lot :) The
result is that sh scripts consume hideous amounts of system time, and
the machine becomes slightly unusable whilst they are running.

What can be done about this? I've had some thoughts, on some general
read-throughs of the code---these may not be particularly accurate, but
I'd quite like some comments. Firstly, it would be nice to be able to
group operations on large numbers of pages. For example, at several
places in the VM code, I see loops with pmap_enter calls inside them.
If the whole I-cache is being invalidated on every iteration of these
loops, the cost grows with the number of pages mapped. Would the concept
of a pmap_enter_multiple() call, with a single invalidate at the end,
make sense?
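
To make the batching idea concrete, here is a minimal user-space sketch
that only counts whole-I-cache flushes. Everything in it is hypothetical:
the sim_ names, the struct layout and the may_exec flag are illustrative
stand-ins, not the real pmap interface, and it assumes that deferring the
flush until the end of the batch is safe on the real hardware.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are real NetBSD interfaces. */
struct sim_mapping {
	unsigned long va;	/* virtual address of the page */
	unsigned long pa;	/* physical address backing it */
	int may_exec;		/* nonzero if the page could hold instructions */
};

static unsigned long sim_icache_flushes;	/* whole-I-cache flushes so far */

/* Models the expensive operation: the I-cache can only be flushed whole. */
static void
sim_icache_flush_all(void)
{
	sim_icache_flushes++;
}

/* Placeholder for writing one page-table entry. */
static void
sim_install_pte(const struct sim_mapping *m)
{
	(void)m;
}

/* Per-page entry: flushes every time the page might contain instructions. */
static void
sim_pmap_enter(const struct sim_mapping *m)
{
	sim_install_pte(m);
	if (m->may_exec)
		sim_icache_flush_all();
}

/* Batched entry: installs every mapping first, then flushes at most once. */
static void
sim_pmap_enter_multiple(const struct sim_mapping *maps, size_t n)
{
	int need_flush = 0;
	size_t i;

	assert(maps != NULL || n == 0);
	for (i = 0; i < n; i++) {
		sim_install_pte(&maps[i]);
		if (maps[i].may_exec)
			need_flush = 1;
	}
	if (need_flush)
		sim_icache_flush_all();
}
```

In this model, entering n possibly-executable pages one at a time costs n
flushes, while the batched call costs one.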

Within the buffer cache, pagemove() is assumed not to handle overlapping
areas. How often do overlapping areas really occur? Again, we have to
zap the I-cache in this function... If they happen reasonably often,
extending the pagemove() function so that it handles overlapping areas
(like ovbcopy() as opposed to bcopy()) could potentially help a lot.
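
As an illustration of the overlap handling, the sketch below models a
range of page mappings as slots in an array and moves them in whichever
direction is safe, the same trick an overlap-safe copy uses. This is an
assumption-laden toy: sim_pagemove_overlap, SIM_NO_PAGE and the array
model are hypothetical and do not reflect the real pagemove() interface
or its cache handling.

```c
#include <assert.h>
#include <stddef.h>

#define SIM_NO_PAGE 0UL	/* hypothetical marker for an unmapped slot */

/*
 * Move n mappings from slots [from, from + n) to [to, to + n) of the
 * same table. The two ranges may overlap; vacated slots are unmapped.
 */
static void
sim_pagemove_overlap(unsigned long *table, size_t from, size_t to, size_t n)
{
	size_t i;

	assert(table != NULL || n == 0);
	if (from == to || n == 0)
		return;

	if (to < from) {
		/* Destination is lower: walk upwards so sources are read first. */
		for (i = 0; i < n; i++) {
			table[to + i] = table[from + i];
			table[from + i] = SIM_NO_PAGE;
		}
	} else {
		/* Destination is higher: walk downwards for the same reason. */
		for (i = n; i-- > 0; ) {
			table[to + i] = table[from + i];
			table[from + i] = SIM_NO_PAGE;
		}
	}
}
```

A real version would presumably also want to do its cache maintenance
once for the whole range rather than once per page.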

Thoughts?

	Cheers and thanks,

	Neil