tech-kern archive


Re: Heads up: moving some uvmexp stat to being per-cpu



On Dec 15, 2010, at 5:35 AM, Matthew Mondor wrote:

> On Tue, 14 Dec 2010 20:49:14 -0800
> Matt Thomas <matt%3am-software.com@localhost> wrote:
> 
>> I have a fairly large but mostly simple patch which changes the stats
>> collected in uvmexp for faults, intrs, softs, syscalls, and traps from
>> 32 bit to 64 bits and puts them in cpu_data (in cpu_info).  This makes
>> more accurate and a little cheaper to update on 64bit systems.
> 
> I like the cleanliness of the changes;
> 
> A potential issue I see is how heavy this becomes on some 32-bit CPUs
> i.e. m68k, where I see for instance 1 instruction being replaced by 9
> instructions (including registers save/restore) to increment a
> counter.  I'm not sure if in practice this will really affect
> performance, or if it's worth benchmarking for those architectures,
> however.

Here's the original assembly:

00000000 <orig>:
   0:   52b9 0000 0000  addql #1,0 <orig>
   6:   53b9 0000 0000  subql #1,0 <orig>
   c:

If we put idepth in cpu_info, we can use the fact that
&cpu_info_store.ci_data.cpu_nintr is already held in an address register
and use that register to access ci_idepth as well.  The result is only
8 bytes longer than the original.

00000000 <lea_for_cpuinfo_nintr_plus_4_and_idepth>:
   0:   41f9 0000 0000  lea 0 <lea_for_cpuinfo_nintr_plus_4_and_idepth>,%a0
   6:   5290            addql #1,%a0@
   8:   4280            clrl %d0
   a:   2220            movel %a0@-,%d1
   c:   d380            addxl %d0,%d1
   e:   2081            movel %d1,%a0@
  10:   53a8 004c       subql #1,%a0@(76)
  14:   

That saves two bytes over addressing idepth absolutely:

00000040 <lea_for_cpuinfo_nintr_plus_4>:
  40:   41f9 0000 0000  lea 0 <lea_for_cpuinfo_nintr_plus_4_and_idepth>,%a0
  46:   5290            addql #1,%a0@
  48:   4280            clrl %d0
  4a:   2220            movel %a0@-,%d1
  4c:   d380            addxl %d0,%d1
  4e:   2081            movel %d1,%a0@
  50:   53b9 0000 0000  subql #1,0 <lea_for_cpuinfo_nintr_plus_4_and_idepth>
  56:   
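
For anyone not fluent in m68k, the addql/addxl sequence in the two
listings above is a 64-bit increment built from 32-bit halves.  A rough
C rendering follows.  The struct and function names are made up for
illustration and are not from the patch, and it assumes the high word
sits at the lower address, as on big-endian m68k.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for a 64-bit counter as m68k sees it: two 32-bit
 * words, most significant word at the lower address (big-endian).
 */
struct counter64 {
	uint32_t hi;	/* word fetched by "movel %a0@-,%d1" */
	uint32_t lo;	/* word at nintr + 4, the address lea loads */
};

/* Increment the counter one 32-bit half at a time, propagating the carry. */
static void
counter64_inc(struct counter64 *c)
{
	uint32_t old_lo = c->lo;

	c->lo = old_lo + 1;			/* addql #1,%a0@ */
	uint32_t carry = (c->lo < old_lo);	/* the X flag left by addql */
	c->hi += carry;				/* clrl %d0; addxl %d0,%d1 */
}

/* Reassemble the two halves, for checking only. */
static uint64_t
counter64_value(const struct counter64 *c)
{
	return ((uint64_t)c->hi << 32) | c->lo;
}
```

The clrl in between is harmless because neither clr nor move touches
the X flag, so the carry from addql survives until addxl consumes it.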

Now if the address register points to cpu_info itself, with ci_idepth
in cpu_info, it's a bit longer (26 bytes), since every access then
needs a displacement.

00000080 <lea_for_cpuinfo>:
  80:   41f9 0000 0000  lea 0 <lea_for_cpuinfo_nintr_plus_4_and_idepth>,%a0
  86:   52a8 00e4       addql #1,%a0@(228)
  8a:   4280            clrl %d0
  8c:   2228 00e0       movel %a0@(224),%d1
  90:   d380            addxl %d0,%d1
  92:   2141 00e0       movel %d1,%a0@(224)
  96:   53a8 012c       subql #1,%a0@(300)
  9a:   

And if we don't use lea at all, it's 16 bytes more than the original:

000000c0 <nolea>:
  c0:   52b9 0000 0000  addql #1,0 <lea_for_cpuinfo_nintr_plus_4_and_idepth>
  c6:   4280            clrl %d0
  c8:   2239 0000 0000  movel 0 <lea_for_cpuinfo_nintr_plus_4_and_idepth>,%d1
  ce:   d380            addxl %d0,%d1
  d0:   23c1 0000 0000  movel %d1,0 <lea_for_cpuinfo_nintr_plus_4_and_idepth>
  d6:   53b9 0000 0000  subql #1,0 <lea_for_cpuinfo_nintr_plus_4_and_idepth>
  dc:   
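
To keep the numbers straight, here is a throwaway tally of the sizes.
Only the start and end offsets come from the disassembly above; the
table layout and helper names are mine.

```c
#include <stddef.h>
#include <stdio.h>

/* One row per listing: label plus first and one-past-last offsets. */
struct variant {
	const char *name;
	unsigned start;
	unsigned end;
};

static const struct variant variants[] = {
	{ "orig",                                    0x00, 0x0c },
	{ "lea_for_cpuinfo_nintr_plus_4_and_idepth", 0x00, 0x14 },
	{ "lea_for_cpuinfo_nintr_plus_4",            0x40, 0x56 },
	{ "lea_for_cpuinfo",                         0x80, 0x9a },
	{ "nolea",                                   0xc0, 0xdc },
};

/* Size in bytes of one code sequence. */
unsigned
variant_size(const struct variant *v)
{
	return v->end - v->start;
}

/* Print each size and its growth relative to the original sequence. */
void
print_sizes(void)
{
	unsigned base = variant_size(&variants[0]);

	for (size_t i = 0; i < sizeof(variants) / sizeof(variants[0]); i++) {
		unsigned size = variant_size(&variants[i]);
		printf("%-40s %2u bytes (+%u)\n",
		    variants[i].name, size, size - base);
	}
}
```

That gives 12, 20, 22, 26 and 28 bytes, i.e. +0, +8, +10, +14 and +16
relative to the original.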


