Port-amd64 archive


Re: svs_pmap_sync() svs_pdir_switch() overhead



On Sun, May 24, 2020 at 09:42:33PM +0000, Andrew Doran wrote:
> On Sat, May 23, 2020 at 09:04:26AM +0200, Maxime Villard wrote:
> > On 20/05/2020 at 11:12, Manuel Bouyer wrote:
> > > On Tue, May 19, 2020 at 10:51:52PM +0000, Andrew Doran wrote:
> > > > Both of these show up prominently in profiling for me.  This change largely
> > > > cures it:
> > > > 
> > > > 	http://www.netbsd.org/~ad/2020/svs.diff
> > > > 
> > > > Comments?
> > 
> > I didn't know kcpuset_isotherset() existed, that is indeed better.
> > 
> > Not sure the second part is correct though. Suppose:
> > 
> > cpu0 is executing svs_pdir_switch(), and cpu1 is modifying the PTEs at the
> > same time. cpu0 returns to userland before cpu1 finished. [XXX]. cpu1
> > finishes, and calls svs_pmap_sync().
> > 
> > In the [XXX] window, the PTEs could be used by userland. If you copied
> > them using memcpy(), some parts of the bytes could contain stale values.
> 
> You mean if memcpy() was moving single bytes at a time?  That won't happen,
> see:
> 
> https://nxr.netbsd.org/xref/src/common/lib/libc/arch/x86_64/string/bcopy.S
> 
> Cheers,
> Andrew

Ah, I see the compiler is up to its tricks here, so the assumption is not
quite so safe; despite usually being very clever, its memset is daft.
The removal of bcopy(), bcmp() and bzero() was a mistake.  Anyway, I'll do
it differently.

Andrew

Dump of assembler code for function svs_pdir_switch:
   0x000000000000074e <+0>:       push   %rbp
   0x000000000000074f <+1>:       mov    %rsp,%rbp
   0x0000000000000752 <+4>:       push   %r13
   0x0000000000000754 <+6>:       push   %r12
   0x0000000000000756 <+8>:       push   %rbx
   0x0000000000000757 <+9>:       sub    $0x18,%rsp
   0x000000000000075b <+13>:      mov    %rdi,%r13
   0x000000000000075e <+16>:      mov    %gs:0x388,%rbx
   0x0000000000000767 <+25>:      mov    0x908(%rbx),%rdx
   0x000000000000076e <+32>:      mov    0xa8(%rdi),%rax
   0x0000000000000775 <+39>:      or     0x0(%rip),%rax        # 0x77c <svs_pdir_switch+46>
   0x000000000000077c <+46>:      mov    %rax,(%rdx)
   0x000000000000077f <+49>:      lea    0x8e0(%rbx),%r12
   0x0000000000000786 <+56>:      mov    %r12,%rdi
   0x0000000000000789 <+59>:      callq  0x78e <svs_pdir_switch+64>
   0x000000000000078e <+64>:      mov    0x8c8(%rbx),%rdi
   0x0000000000000795 <+71>:      mov    0xa0(%r13),%rsi
   0x000000000000079c <+78>:      mov    $0x7f8,%eax
   0x00000000000007a1 <+83>:      test   $0x1,%dil
   0x00000000000007a5 <+87>:      jne    0x802 <svs_pdir_switch+180>
   0x00000000000007a7 <+89>:      test   $0x2,%dil
   0x00000000000007ab <+93>:      jne    0x81a <svs_pdir_switch+204>
   0x00000000000007ad <+95>:      test   $0x4,%dil
   0x00000000000007b1 <+99>:      jne    0x831 <svs_pdir_switch+227>
   0x00000000000007b3 <+101>:     mov    %eax,%ecx
   0x00000000000007b5 <+103>:     shr    $0x3,%ecx
   0x00000000000007b8 <+106>:     rep movsq %ds:(%rsi),%es:(%rdi)
   0x00000000000007bb <+109>:     test   $0x4,%al
   0x00000000000007bd <+111>:     je     0x7c0 <svs_pdir_switch+114>
   0x00000000000007bf <+113>:     movsl  %ds:(%rsi),%es:(%rdi)
   0x00000000000007c0 <+114>:     test   $0x2,%al
   0x00000000000007c2 <+116>:     je     0x7c6 <svs_pdir_switch+120>
   0x00000000000007c4 <+118>:     movsw  %ds:(%rsi),%es:(%rdi)
   0x00000000000007c6 <+120>:     test   $0x1,%al
   0x00000000000007c8 <+122>:     je     0x7cb <svs_pdir_switch+125>
   0x00000000000007ca <+124>:     movsb  %ds:(%rsi),%es:(%rdi)
   0x00000000000007cb <+125>:     mov    %r12,%rdi
   0x00000000000007ce <+128>:     callq  0x7d3 <svs_pdir_switch+133>
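
The listing above shows the copy expanded inline: the length 0x7f8 (2040
bytes, i.e. 255 eight-byte entries) is loaded, the destination alignment is
tested, and the tail can fall back to movsl/movsw/movsb, so the compiler is
not committing to whole 64-bit stores. One possible way to "do it
differently" is a loop of aligned 64-bit volatile stores, sketched below.
This is only an illustration, not the change that was committed: the names
pde_sketch_t and pde_copy_sketch are hypothetical, and it assumes the
compiler emits a single 64-bit move for each aligned volatile access, which
is the usual behaviour on amd64 but is not promised by the C standard.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's page directory entry type. */
typedef uint64_t pde_sketch_t;

/*
 * Copy n entries from src to dst, one entry per assignment.
 *
 * The volatile qualifiers stop the compiler from replacing the loop with
 * memcpy() or an inline string-move expansion, and from merging or
 * splitting the individual accesses.  With 8-byte alignment, each entry
 * is then normally written by a single 64-bit store on amd64 (assumption,
 * see the text above).
 */
static void
pde_copy_sketch(volatile pde_sketch_t *dst, const volatile pde_sketch_t *src,
    size_t n)
{
	/* Both arrays must be naturally aligned for whole-entry stores. */
	assert(((uintptr_t)dst % sizeof(pde_sketch_t)) == 0);
	assert(((uintptr_t)src % sizeof(pde_sketch_t)) == 0);

	for (size_t i = 0; i < n; i++)
		dst[i] = src[i];
}
```

Called as pde_copy_sketch(dst, src, 255), it covers the same 2040 bytes as
the inlined copy in the listing. A reader on another CPU then sees either
the old or the new value of each entry, never a mix of bytes from both,
provided the assumption above holds.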

