On 1/10/24 20:26, Jason Bacon wrote:
> On 1/10/24 16:00, Taylor R Campbell wrote:
>>> Date: Wed, 10 Jan 2024 08:29:32 -0600
>>> From: Jason Bacon <jtocino%gmx.com@localhost>
>>>
>>> On 12/29/23 11:14, Jason Bacon wrote:
>>>> On 12/29/23 08:38, Martin Husemann wrote:
>>>>> This sounds pretty expensive, can't it use
>>>>> os.sysconf(NPROCESSORS_CONF) ?
>>>>
>>>> Probably. I was just following the other examples in the code,
>>>> as I'm no pythonista, but your suggestion sounds like a better
>>>> approach.
>>>>
>>>> https://github.com/joblib/joblib/issues/1535
>>>
>>> As per the github discussion, we'll have to stick with subprocesses
>>> for now. sysconf does not return the desired core count and there
>>> is no portable sysctl interface for python at this time. The pypi
>>> sysctl package is FreeBSD-specific, while completely different
>>> implementations exist for NetBSD and Linux.
>>
>> I'm confused, what number is wrong and why do you need sysctl at the
>> moment?
>>
>> I understand we might add sysctls in the future to express the full
>> topology, including the gory details of hyperthreading and
>> big.LITTLE and newer finer-grained variants thereof or whatever --
>> but for now you're just going for the number of configured or online
>> `cores' (i.e., threads), right?
>
> No, upstream wants the number of physical cores (hw.physicalcpu on
> Darwin, kern.smp.cores on FreeBSD) rather than hyperthreads, to avoid
> oversubscribing the host when the default "use all available cores"
> is invoked. sysconf cannot provide that, nor can sysctl on NetBSD at
> the moment.
>
> For the moment, py-joblib is patched to use hw.ncpu on NetBSD, with a
> comment indicating that this is not ideal. It's better than erroring
> out, though.
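
For reference, the per-platform lookup described above could be sketched roughly like this in Python. This is only an illustration, not the actual py-joblib patch: the function names, the platform table, and the read_sysctl parameter are hypothetical, and it assumes a sysctl binary on the PATH that accepts -n. The NetBSD entry (hw.ncpu) counts hyperthreads as well, so it is a stopgap rather than a true physical core count.

```python
import os
import subprocess
import sys

# sysctl names mentioned in the thread, keyed by sys.platform prefix.
# hw.ncpu on NetBSD reports logical CPUs, not physical cores (not ideal).
SYSCTL_BY_PLATFORM = {
    "darwin": "hw.physicalcpu",
    "freebsd": "kern.smp.cores",
    "netbsd": "hw.ncpu",
}


def _read_sysctl(name):
    """Return the raw output of `sysctl -n <name>` (one subprocess per call)."""
    result = subprocess.run(
        ["sysctl", "-n", name], capture_output=True, text=True, check=True
    )
    return result.stdout


def physical_core_count(platform=None, read_sysctl=_read_sysctl):
    """Best-effort core count; falls back to os.cpu_count() on any failure."""
    platform = sys.platform if platform is None else platform
    for prefix, name in SYSCTL_BY_PLATFORM.items():
        if platform.startswith(prefix):
            try:
                count = int(read_sysctl(name).strip())
            except (OSError, ValueError, subprocess.CalledProcessError):
                break  # sysctl missing, failed, or printed something unexpected
            if count > 0:
                return count
            break
    # Unknown platform or failed lookup: logical CPU count, at least 1.
    return os.cpu_count() or 1


if __name__ == "__main__":
    print(physical_core_count())
```

Passing read_sysctl as a parameter is only there so the lookup can be exercised without a real sysctl binary. The fallback keeps the "better than erroring out" behavior.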
BTW, while you're in there creating new sysctls, it might be worthwhile to devise an easy way to turn hyperthreading off entirely. Most HPC clusters have it disabled to avoid oversubscription of compute nodes. The consensus is that running without hyperthreading leads to better overall performance across a variety of applications.
When I was running HPC clusters, there was no easy way to disable hyperthreading on Linux, so it had to be done in the BIOS. FreeBSD has a sysctl, machdep.hyperthreading_allowed, which makes it easy, though it can only be set during boot.