Re: bin/58001: systat -w1 vmstat reports "Cannot get buffers;Cannot allocate memory"

To: gnats-admin%netbsd.org@localhost,netbsd-bugs%netbsd.org@localhost,michael.cheponis%gmail.com@localhost
Subject: Re: bin/58001: systat -w1 vmstat reports "Cannot get buffers;Cannot allocate memory"
From: Robert Elz <kre%munnari.OZ.AU@localhost>
Date: Wed, 6 Mar 2024 13:25:01 +0000 (UTC)

The following reply was made to PR bin/58001; it has been noted by GNATS.

From: Robert Elz <kre%munnari.OZ.AU@localhost>
To: gnats-bugs%netbsd.org@localhost
Cc: 
Subject: Re: bin/58001: systat -w1 vmstat reports "Cannot get buffers;Cannot allocate memory"
Date: Wed, 06 Mar 2024 20:20:36 +0700

     Date:        Wed,  6 Mar 2024 07:40:01 +0000 (UTC)
     From:        michael.cheponis%gmail.com@localhost
     Message-ID:  <20240306074001.47D251A923B%mollari.NetBSD.org@localhost>

   | systat -w1 vmstat reports "Cannot get buffers;Cannot allocate memory"

 This looks to be an issue with how systat collects buffer cache info.
 ( src/usr.bin/systat/bufcache.c fetchbufcache() )

 It starts by asking the kernel how much memory it needs to malloc()
 to fetch the data, allocates that much, and then tries to actually fetch.

 There's an obvious (and unavoidable) race there - if the amount required
 has grown between the request for how much, and the request to fetch into
 that buffer, then the 2nd will fail (insufficient space provided).

 systat anticipates that: if it happens, it goes back and tries again,
 but this time adds 100 bytes to the amount the kernel says is required.

 If that attempt fails too (same procedure as the first time, but now
 allowing for the kernel wanting to send more data in the 2nd call than
 it claimed in the first), systat tries again with 200 bytes extra instead
 of 100, and again, and again, until it is allowing 1000 bytes beyond the
 first call's result in the second call.

 If that still isn't enough, systat gives up, and you get the error above.
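
 A minimal sketch of the retry procedure described above, in C.  It is
 not the code from bufcache.c: struct fake_kernel, fake_query_size(),
 fake_fetch() and fetch_linear() are hypothetical names, and the fake
 kernel simply grows its requirement by a fixed amount on every call so
 that the race can be reproduced without sysctl(2).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel: the amount of data to return
 * grows by `growth` bytes after every call made to it. */
struct fake_kernel {
    size_t need;    /* bytes currently required */
    size_t growth;  /* bytes added after each call */
};

/* Models the sizing call (NULL buffer): report the size needed right now. */
static size_t fake_query_size(struct fake_kernel *k)
{
    size_t n = k->need;
    k->need += k->growth;
    return n;
}

/* Models the fetch call: fails with ENOMEM if the buffer is too small. */
static int fake_fetch(struct fake_kernel *k, void *buf, size_t len)
{
    size_t n = k->need;
    k->need += k->growth;
    (void)buf;                  /* no data is really copied in this model */
    if (len < n) {
        errno = ENOMEM;
        return -1;
    }
    return 0;
}

/* Retry with 0, 100, 200, ... 1000 bytes of slack, as described above.
 * Returns the slack that worked, or -1 after giving up. */
static long fetch_linear(struct fake_kernel *k)
{
    assert(k != NULL);
    for (size_t extra = 0; extra <= 1000; extra += 100) {
        size_t len = fake_query_size(k) + extra;
        void *buf = malloc(len);
        if (buf == NULL)
            return -1;          /* a real allocation failure */
        int rc = fake_fetch(k, buf, len);
        free(buf);
        if (rc == 0)
            return (long)extra;
    }
    errno = ENOMEM;             /* every attempt was too small */
    return -1;
}
```

 In this model an attempt succeeds once the slack is at least the growth
 between the two calls, so any growth above 1000 bytes per call always
 ends in the give-up path.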

 The "Cannot allocate memory" is something of a misnomer, and probably
 shouldn't be included - that's just the strerror() text for ENOMEM,
 which sysctl(2) returns to indicate that the buffer provided isn't big
 enough for the data to be returned - it has nothing whatever to do with
 a malloc() failure or similar.   A better error message would be
 "requirement changing too quickly".
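
 One way the message could be chosen, as a sketch only: the helper name
 buffers_error_text() is hypothetical, and it assumes that ENOMEM is the
 errno value seen when the buffer passed to the fetch was too small.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical helper: pick the text to show after the final failed
 * fetch.  ENOMEM here means "buffer too small", not a malloc() failure,
 * so use the wording suggested above instead of strerror(). */
static const char *buffers_error_text(int err)
{
    assert(err != 0);           /* only call this after a failure */
    if (err == ENOMEM)
        return "requirement changing too quickly";
    return strerror(err);
}
```

 Any other errno value still goes through strerror(), so unrelated
 failures keep their usual text.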

 My guess is that if you start that dd, wait for a while (maybe 10 or
 20 minutes) until things have stabilised a little, and then try the
 vmstat, it will work just fine, as by then the number the kernel returns
 in the first ('how much buffer space do I need') sysctl() call will be
 close enough to what is actually needed for the algorithm systat implements
 to work OK.

 I'm not sure about the "100" though, or perhaps more the sequence 100, 200,
 300, ... 1000 - I think I'd be doing more like 100 200 400 800 1600, or
 perhaps even better 1024 2048 4096 8192 ... to reduce the chance of
 rapidly increasing data requirements causing this particular issue.
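
 To compare the two schedules, here is a small C sketch that counts how
 many retries each needs before its slack covers a given growth between
 the sizing call and the fetch.  The function names, the zero-slack first
 attempt and the retry cap passed to retries_doubling() are assumptions
 made for illustration, not anything taken from systat.

```c
#include <assert.h>
#include <stddef.h>

/* Retries needed by the 0, 100, 200, ... 1000 schedule before the slack
 * covers `growth` bytes; -1 if the schedule runs out first. */
static int retries_linear(size_t growth)
{
    int tries = 0;
    for (size_t extra = 0; extra <= 1000; extra += 100, tries++)
        if (extra >= growth)
            return tries;
    return -1;
}

/* Same question for a doubling schedule: 0, first, 2*first, 4*first, ...
 * giving up after `max_tries` retries. */
static int retries_doubling(size_t growth, size_t first, int max_tries)
{
    assert(first > 0);
    size_t extra = 0;
    for (int tries = 0; tries <= max_tries; tries++) {
        if (extra >= growth)
            return tries;
        extra = (extra == 0) ? first : extra * 2;
    }
    return -1;
}
```

 For a growth of 5000 bytes the linear schedule gives up, while doubling
 from 1024 gets there on the 4th retry (1024, 2048, 4096, 8192).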

 kre
