At Tue, 01 Nov 2011 01:43:23 -0700, "Greg A. Woods" <woods%planix.ca@localhost> wrote:
Subject: Re: getrusage() problems with user vs. system time reporting
>
> Indeed with all this doom and gloom about TSC it seems it might be
> better just to use binuptime() -- that probably won't be as fast
> though....  Perhaps if I'm inspired tomorrow I'll try to re-do the
> sysctl_stats.h macros to use it instead of cpu_counter32(), and then use
> this real measure of system time in calcru() instead of pretending the
> stathz ticks mean anything.

So, I've done that now.  (Including removing the dependency on
SYSCALL_COUNTS and SYSCALL_TIMES, etc. -- all except for figuring out how
to hook in the ISR hooks...)

It turns out binuptime() is indeed far too expensive to call at every
mi_switch(), syscall(), etc. (it more than doubles the time it takes for
gettimeofday() to run), but getbinuptime() seems to be sufficiently
low-cost to use in these situations.

Unfortunately getbinuptime() isn't immediately looking a whole lot better
than the statistical sampling in statclock(), though perhaps, with enough
benchmark runtime, it is, as expected, _much_ fairer at splitting time
between user and system.  At least this is the case on a VirtualBox VM; I
need to see it on real hardware next.

With some further analysis, and with the addition of new time values to
struct proc so that statclock() ticks can also be accounted (right now I
re-use the p_*ticks storage for 64-bit nanosecond values), it may be
possible to come up with a simple algorithm so that calcru() can balance
out the difference between the accumulated getbinuptime() values and the
true binuptime() total stored in {l,p}_rtime.  Storing the raw
getbinuptime() values would also avoid having to do 64-bit adds in
mi_switch(), syscall(), et al.

If anyone's interested in more details I can post some of my results, as
well as the changes I've made.  Any comments about this would be
appreciated!
One thing that's confusing me is that though normally for short-running
processes I'm seeing the getbinuptime() values be either zero, or
somewhat less than the binuptime() value from p_rtime, on rare occasions
I also see "vastly" larger getbinuptime() values.  For example (this
from calcru(), as it is called from kern_exit.c):

exit|tty: atrun[377]: rt=2216 ms, u+s=0 ms (u=0 ns / s=0 ns), it=0 ticks
exit|tty: atrun[361]: rt=2066 ms, u+s=0 ms (u=0 ns / s=0 ns), it=0 ticks
exit|tty: atrun[409]: rt=2207 ms, u+s=0 ms (u=0 ns / s=0 ns), it=0 ticks
exit|tty: atrun[130]: rt=2272 ms, u+s=1694 ms (u=0 ns / s=1694070 ns), it=0 ticks
exit|tty: atrun[434]: rt=4048 ms, u+s=10151 ms (u=10151849 ns / s=0 ns), it=0 ticks
exit|tty: atrun[162]: rt=3209 ms, u+s=0 ms (u=0 ns / s=0 ns), it=0 ticks
exit|tty: atrun[458]: rt=2576 ms, u+s=0 ms (u=0 ns / s=0 ns), it=0 ticks

"rt"  is the real-time (p_rtime) value calculated by calcru(), in ms
"u"   is the accumulated user time from getbinuptime() calls, in ns
"s"   is the accumulated system time from getbinuptime() calls, in ns
"it"  is the old-style statistically gathered p_iticks value
"u+s" is of course "u" + "s", converted to ms

(The longest-running sample above was taken while the VM was idle except
for cron, but the VirtualBox host, my desktop, may have been quite busy.)

-- 
Greg A. Woods
Planix, Inc.
<woods%planix.com@localhost>  +1 250 762-7675  http://www.planix.com/