Peter Eisch <peisch%gmail.com@localhost> writes:

> Greg, would you mind reading my tea leaves?  How should I read the mcl
> stats from the following systems?

You should figure out how to not miswrap log/diag output :-)

A quick look shows 0 Fail, which means none of the machines ran out,
which means you had NMBCLUSTERS set high enough.  Most of them don't
have that many allocated, and no releases, which means those systems
aren't doing all that much.  doily has huge request/release counts, but
hiwat is only 771 pages vs 333 npage, so it seems to see repeated
moderate use.

To really understand this, read the following papers (in order):

  The Slab Allocator: An Object-Caching Kernel Memory Allocator
  by Jeff Bonwick
  http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.29.4759

  Magazines and Vmem: Extending the Slab Allocator to Many CPUs and
  Arbitrary Resources
  by Jeff Bonwick (Sun) and Jonathan Adams
  http://www.usenix.org/event/usenix01/bonwick.html

Part of the point of pools is to avoid the need for tuning by having
efficient autosizing.  The hard part is having memory pressure cause
distributed hints to free not-so-needed memory, and keeping one heavy
user from hurting the machine.  So this is hard, like managing disk
quotas.

Probably, with modern adapters, NMBCLUSTERS should have a higher
default limit computed from available KVA and physmem; there's a rough
sketch of what I mean at the end of this message.

> At this point the systems above survive unless traffic through them
> dramatically changes.  Maybe my fiddling with NMBCLUSTERS has only
> coincidentally resolved the issues?

I wouldn't say coincidentally.  When you raise NMBCLUSTERS to the point
where typical usage doesn't cause you to run out, you have tuned
properly.

It would be nice if the pool stats had a high-water mark for clusters
(vs pages), but if you multiply pages by 2 it will be close, roughly as
in the sketch below.
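In case it helps, here's a trivial sketch of that "pages times 2"
arithmetic.  It isn't kernel code; it just assumes 4 KB pages and 2 KB
(MCLBYTES) clusters, i.e. two clusters per pool page, with doily's
npage/hiwat numbers from above typed in by hand:

    /*
     * Estimate cluster counts from the mcl pool's page counters,
     * assuming PAGE_SIZE / MCLBYTES == 2 clusters per pool page.
     */
    #include <stdio.h>

    #define PAGE_SIZE 4096
    #define MCLBYTES  2048
    #define CLUSTERS_PER_PAGE (PAGE_SIZE / MCLBYTES)  /* == 2 */

    int
    main(void)
    {
        unsigned long npage = 333;   /* pages the pool holds now */
        unsigned long hiwat = 771;   /* most pages it ever held */

        printf("clusters resident now:   ~%lu\n",
            npage * CLUSTERS_PER_PAGE);
        printf("cluster high-water mark: ~%lu\n",
            hiwat * CLUSTERS_PER_PAGE);
        return 0;
    }

For doily that works out to roughly 666 clusters resident and a peak of
roughly 1542.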
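And here is the kind of thing I mean about a scaled default for
NMBCLUSTERS.  This is purely illustrative: the function name and the
1/16 and 1/8 fractions are numbers I made up for the example, not
anything that's in the tree.

    #include <stdint.h>

    #define MCLBYTES 2048

    /*
     * One possible way to derive a default cluster limit from physical
     * memory and the kernel VA still available, instead of using a
     * fixed compile-time constant.
     */
    static uint64_t
    default_nmbclusters(uint64_t physmem_bytes, uint64_t kva_bytes)
    {
        /* Let cluster memory use at most 1/16 of physmem ... */
        uint64_t by_physmem = (physmem_bytes / 16) / MCLBYTES;
        /* ... and at most 1/8 of the available kernel VA. */
        uint64_t by_kva = (kva_bytes / 8) / MCLBYTES;

        return by_physmem < by_kva ? by_physmem : by_kva;
    }

With, say, 8 GB of physmem and 1 GB of spare KVA that picks about
65536 clusters, which is a lot more generous than the old fixed
defaults.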