Hi,
I'm running NetBSD-current on one of my 1G Mac Mini G4 systems,
doing pkgsrc bulk building.
This go-around I've managed to build llvm, and next up is rust. This
is proving to be difficult: my system will consistently wedge its
user-land. It still responds to ping, but there is no response on the
console or in any ongoing ssh sessions (well, not entirely correct:
the console will echo one carriage-return with a newline, but then
that is wedged as well). Also, I have still not managed to break into
DDB on this system, so each and every time I have to power-cycle the
box. This also means that all I have to go on is output from
"top -s 1", "vmstat 1" and "systat vm", and this is the latest
information I got from these programs when it wedged just now:
load averages: 1.10, 1.13, 1.05; up 0+02:01:45 21:59:52
103 threads: 5 idle, 6 runnable, 90 sleeping, 1 zombie, 1 on CPU
CPU states: 1.0% user, 5.9% nice, 93.1% system, 0.0% interrupt, 0.0% idle
Memory: 559M Act, 274M Inact, 12M Wired, 186M Exec, 162M File, 36K Free
Swap: 3026M Total, 80M Used, 2951M Free / Pools: 134M Used
PID LID USERNAME PRI STATE TIME WCPU CPU NAME COMMAND
6376 26281 1138 78 RUN 2:03 89.10% 88.96% rustc rustc
0 109 root 126 pgdaemon 0:20 15.48% 15.48% pgdaemon [system]
733 733 he 85 poll 0:14 2.93% 2.93% - sshd
164 164 he 85 RUN 0:06 1.17% 1.17% - systat
Notice the very small amount of "Free" memory (36K), and the very
high share of CPU time spent in system mode (93.1%). The "vmstat 1"
output for the last few seconds:
procs memory page disk faults cpu
r b avm fre flt re pi po fr sr w0 in sy cs us sy id
1 0 634804 4164 1869 0 0 0 1358 1358 0 280 0 425 97 3 0
3 0 637876 1016 786 0 0 0 0 0 0 213 0 410 99 1 0
2 0 636336 2512 816 4 0 0 1192 1202 0 326 0 508 98 2 0
2 0 633448 5456 617 0 0 0 1355 1371 0 228 0 374 99 1 0
2 0 634964 3780 430 0 0 0 0 0 0 250 0 452 98 2 0
2 0 635988 2740 260 0 0 0 0 0 0 261 0 496 98 2 0
2 0 637396 1376 386 0 0 0 0 0 0 300 0 459 97 3 0
2 0 634912 4060 775 0 0 0 1354 1354 0 190 0 245 100 0 0
2 0 636940 2308 437 0 0 0 0 0 0 250 0 415 100 0 0
2 0 637912 1064 473 0 0 0 0 0 0 251 0 406 100 0 0
2 0 633580 5408 175 0 0 0 1262 1270 0 254 0 403 99 1 0
2 0 637288 1740 1002 0 0 0 0 0 0 278 0 521 97 3 0
2 0 634340 4324 713 0 0 0 1354 1357 0 296 0 471 96 4 0
2 0 636388 2160 540 0 0 0 0 0 0 216 0 361 98 2 0
2 0 637412 1116 258 0 0 0 0 0 0 254 0 405 98 2 0
2 0 637556 4872 178 12 0 996 1122 42861 4 307 0 442 30 70 0
2 0 638064 9620 1105 3 0 1228 1228 2305 70 411 0 667 19 81 0
2 0 639624 7416 550 0 0 0 0 0 0 319 0 584 97 3 0
2 0 644744 2200 1299 0 0 0 0 0 0 279 0 416 93 7 0
6 0 646924 2716 537 0 0 1356 672 2403 14 412 0 497 35 65 0
4 0 654792 36 2022 32 0 1354 1366 7910 91 241 0 6735 7 93 0
The "systat vm" display doesn't really give any more information
than the above:
6 users Load 1.10 1.13 1.05 Thu Sep 8 21:59:51
Proc:r d s Csw Traps SysCal Intr Soft Fault PAGING SWAPPING
8 3 355 471 302 75 398 in out in out
ops 64
68.2% Sy 0.0% Us 31.8% Ni 0.0% In 0.0% Id pages 1027
| | | | | | | | | | |
==================================---------------- forks
fkppw
Anon 509096 50% zero 472 Interrupts fksvm
Exec 190804 18% wired 12000 100 cpu0 clock pwait
File 166072 16% inact 280984 openpic irq 29 relck
Meta 82832 2% bufs 6500 openpic irq 63 rlkok
(kB) real swaponly free 38 openpic irq 39 1 noram
Active 570368 73812 2716 openpic irq 40 11 ndcpy
Namei Sys-cache Proc-cache 167 openpic irq 41 fltcp
Calls hits % hits % 167 gem0 interrupts 397 zfod
6 6 100 cow
256 fmin
Disks: cd0 wd0 341 ftarg
seeks itarg
xfers 14 flnan
bytes 920K 509 pdfre
%busy 1.4 1820 pdscn
Hm, what does "noram" mean? Is that flagging "imminent wedge"?
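Going back to the "vmstat 1" samples above: the interesting lines are
the ones where "fre" is low while "sr" (pages scanned by the page
daemon) jumps. As a rough way to pick those out of a longer capture,
here is a minimal Python sketch. It assumes the 17-column layout shown
in the header above (one disk column, w0) and that "fre" is in
kilobytes; the two thresholds are arbitrary illustration values, not
NetBSD tunables, and the function names are made up for this example.

```python
# Column names follow the vmstat header above. The header uses "sy" twice
# (system calls, then system CPU %), so the first one is renamed here.
COLUMNS = ["r", "b", "avm", "fre", "flt", "re", "pi", "po", "fr", "sr",
           "w0", "in", "sys_calls", "cs", "us", "sy", "id"]

# A few lines copied from the capture above, header included.
SAMPLE = """\
r b avm fre flt re pi po fr sr w0 in sy cs us sy id
1 0 634804 4164 1869 0 0 0 1358 1358 0 280 0 425 97 3 0
2 0 637556 4872 178 12 0 996 1122 42861 4 307 0 442 30 70 0
2 0 638064 9620 1105 3 0 1228 1228 2305 70 411 0 667 19 81 0
6 0 646924 2716 537 0 0 1356 672 2403 14 412 0 497 35 65 0
4 0 654792 36 2022 32 0 1354 1366 7910 91 241 0 6735 7 93 0
"""


def parse_vmstat_line(line):
    """Return {column: int} for a data line, or None for headers and junk."""
    fields = line.split()
    if len(fields) != len(COLUMNS) or not all(f.isdigit() for f in fields):
        return None
    return dict(zip(COLUMNS, map(int, fields)))


def pressure_samples(lines, free_kb_max=8192, scan_min=2000):
    """Keep samples with little free memory AND heavy page-daemon scanning.

    free_kb_max and scan_min are hypothetical cut-offs chosen for this
    example only.
    """
    hits = []
    for line in lines:
        sample = parse_vmstat_line(line)
        if sample and sample["fre"] <= free_kb_max and sample["sr"] >= scan_min:
            hits.append(sample)
    return hits


for s in pressure_samples(SAMPLE.splitlines()):
    print(f"fre={s['fre']}K sr={s['sr']} po={s['po']} sys={s['sy']}%")
```

On the excerpt embedded in the sketch this keeps the samples where
system CPU time is at 65% or more and drops the ones where rustc is
still getting nearly all of the CPU, which matches what can be read
off the raw output by eye.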
I have reduced kern.maxvnodes from the default of around 50000 to
20000, and have tried adjusting vm.filemax and vm.anonmax down
slightly (50 to 30?). With that, the build log managed to progress
past this point (something it didn't do earlier, i.e. this was the
"wedge point"):
Compiling globset v0.4.5
Compiling ignore v0.4.17
Compiling toml v0.5.7
but regrettably the build wedged again relatively shortly
thereafter.
I've managed to build rust 1.62.1 on an identical system running
NetBSD 8.0, and rust 1.60.0 on an identical system running 9.0 as
part of a bulk build, and 1.63.0 on a dual-CPU "mirror drive
door" system running -current as well (2G memory) without issues.
However, compared to the 8.0 and 9.0 systems, which are identical
in hardware (1G memory), this looks like a regression in overall
stability for these systems.
I'm wondering what my next remedy should be to nudge this system
further along in the build process. Reduce kern.maxvnodes still
further? It seems to me that it is a little too easy to make the
system wedge under memory pressure, and it appears to be VM system
related, but that's about as far as I'm able to guess.
The -current kernel I was running earlier was from June 6, the one I'm
running now is from August 28, first with June 6 user-land but now
with August 28 user-land. None of those updates appear to have made
any difference to this problem.
Help, hints and suggestions are welcome.
Regards,
- Håvard