Subject: Re: Sparseness of kernel structures on i386?
To: None <frank@fwi.uva.nl, port-i386@NetBSD.ORG>
From: Thor Lancelot Simon <tls@panix.com>
List: port-i386
Date: 12/08/1996 09:20:46
> Well, currently the kernel has a max of 124Mb of VA space (just compute
> VM_MAX_KERNEL_ADDRESS - VM_MIN_KERNEL_ADDRESS). Since you need one PDE
> per 4 Mb (one PDE -> 1024 PTEs -> 1024 * 4096  = 4Mb), you can have 
> a maximum of 31 PDEs. On the i386 the kernel lives in the same VA space
> as userland, so you do want to restrict the space it takes somehow.

>> Theo was kind enough to point out to me that OpenBSD has the following code
>> in machdep.c, which would seem easy enough to pick up:

>>         /* Restrict to at most 70% filled kvm */
>>         if (bufpages * MAXBSIZE * 7 / 10 >
>>             (VM_MAX_KERNEL_ADDRESS-VM_MIN_KERNEL_ADDRESS))
>>                 bufpages = (VM_MAX_KERNEL_ADDRESS-VM_MIN_KERNEL_ADDRESS) /
>>                     MAXBSIZE * 7 / 10;

Actually, if this calculation is done with nbuf instead of bufpages, I think
it yields a better result: at least all of the memory we intended to have
available for buffers _can_ be used, though it likely won't be.  We still lose,
but not quite as badly. :-/  So my end result looks like this:

        if (bufpages == 0) {
                if (physmem < btoc(2 * 1024 * 1024))
                        bufpages = physmem / (10 * CLSIZE);
                else
                        bufpages = (btoc(2 * 1024 * 1024) + physmem) /
                            (20 * CLSIZE);
        }

        if (nbuf == 0) {
                nbuf = bufpages;
                if (nbuf < 16)
                        nbuf = 16; 
        }

        /*
         * Restrict to at most 70% filled kvm.  Compare nbuf against the
         * cap itself, so that anything above 70% is clamped.
         */
        if (nbuf > (VM_MAX_KERNEL_ADDRESS-VM_MIN_KERNEL_ADDRESS) /
            MAXBSIZE * 7 / 10)
                nbuf = (VM_MAX_KERNEL_ADDRESS-VM_MIN_KERNEL_ADDRESS) /
                    MAXBSIZE * 7 / 10;
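
To put a number on that cap, here is a minimal standalone sketch.  It assumes
the 124MB of kernel VA quoted above and a MAXBSIZE of 64KB; both constants and
the helper name clamp_nbuf are illustrative, not taken from the kernel headers.
It compares nbuf against the cap directly, which is one reading of the
"at most 70%" comment.

```c
/* Assumed values for illustration; the real ones come from kernel headers. */
#define KVA_BYTES   (124UL * 1024 * 1024)  /* VM_MAX - VM_MIN, per the quoted 124MB */
#define MAXBSIZE_B  (64UL * 1024)          /* assumed MAXBSIZE of 64KB */

/*
 * Hypothetical helper: cap nbuf so that nbuf * MAXBSIZE stays within
 * 70% of the kernel VA range.  Same integer arithmetic as the snippet
 * above: divide first, then scale by 7/10.
 */
static unsigned long
clamp_nbuf(unsigned long nbuf)
{
	unsigned long cap = KVA_BYTES / MAXBSIZE_B * 7 / 10;

	return (nbuf > cap) ? cap : nbuf;
}
```

Under these assumptions the cap is 124MB / 64KB = 1984 windows, times 7/10,
which is 1388 buffers, no matter how much physical memory is installed.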

> Yeah, I thought about doing this just then, I didn't know OpenBSD already
> had that. What is a good value though? 70%? It's guessing. But of course,
> the buffer space computation is based on the same sort of percentage
> calculation as well, so..

This just doesn't seem right to me, any of it.  If the user chooses to
override certain values like bufpages, nbuf, nkpde, or SHMMAX, we ought to
adjust the maximum sizes of the larger structures to accommodate him.  Is there
some hideously complicated reason we can't add up the sizes, and _then_ set
VM_MAX_KERNEL_ADDRESS?  I realize that this would be a reasonably substantial
reworking of locore, but it's mostly formulaic, right?
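
The "add up the sizes first" idea can be sketched roughly as follows.  This is
only the arithmetic, assuming the one-PDE-per-4MB figure quoted above; the
names pdes_for and kernel_pdes and the list of demands are hypothetical, and
none of the actual locore work is shown.

```c
#include <stddef.h>

#define PAGE_BYTES  4096UL                  /* i386 page size */
#define PDE_SPAN    (1024UL * PAGE_BYTES)   /* VA mapped by one PDE: 4MB */

/* Round a byte count up to the number of PDEs needed to map it. */
static unsigned long
pdes_for(unsigned long bytes)
{
	return (bytes + PDE_SPAN - 1) / PDE_SPAN;
}

/*
 * Hypothetical: sum the VA demands of the large structures (buffer
 * windows, shared memory, and so on), then derive the PDE count from
 * the total instead of fixing the range in advance.
 */
static unsigned long
kernel_pdes(const unsigned long *demands, size_t n)
{
	unsigned long total = 0;
	size_t i;

	for (i = 0; i < n; i++)
		total += demands[i];
	return pdes_for(total);
}
```

As a sanity check, pdes_for() applied to the current 124MB range gives the 31
PDEs mentioned above.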

A news server with 128MB of memory and 16GB of disk is going to be an awful
lot happier with a 64MB buffer cache and 3GB of available user VA space than
with a tiny buffer cache and more space available for user code.

With the buffer cache, obviously the underlying problem is that we allocate
MAXBSIZE of VA space for each buffer, most of which is almost always empty.
I understand this just barely well enough to have an idea of the size of the
hornet's nest I'm about to hit with a stick, but why not give the damned space
to the kernel malloc and just change allocbuf to use it?  This whole problem
seems to result from the stupidity of the private allocator used for buffers.
If we can't do that, what about adding a b_maxsz field to struct buf, and
initially allocating buffers of different sizes?  (Actually, that seems like
much more work, but what do I know?)
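
A rough sketch of how lopsided the per-buffer VA reservation is.  It assumes
4096-byte pages, CLSIZE of 1, a MAXBSIZE of 64KB, and nbuf defaulting to
bufpages as in the code earlier in this message; the helper names are made up
for illustration.

```c
#define PAGE_BYTES  4096UL          /* i386 page size */
#define MAXBSIZE_B  (64UL * 1024)   /* assumed MAXBSIZE of 64KB */

/* VA reserved when each of nbuf buffers gets a MAXBSIZE-sized window. */
static unsigned long long
buf_va_bytes(unsigned long nbuf)
{
	return (unsigned long long)nbuf * MAXBSIZE_B;
}

/* Physical memory actually backing the cache: bufpages pages. */
static unsigned long long
buf_phys_bytes(unsigned long bufpages)
{
	return (unsigned long long)bufpages * PAGE_BYTES;
}
```

Under those assumptions, a 64MB cache is 16384 pages, so nbuf is 16384 and the
windows reserve 16384 * 64KB = 1GB of VA: sixteen times the memory behind
them, and far more than the 124MB the kernel has today.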

Thor