Re: CVS commit: src

To: maya%NetBSD.org@localhost, source-changes-d%netbsd.org@localhost
Subject: Re: CVS commit: src
From: Maxime Villard <max%m00nbsd.net@localhost>
Date: Tue, 5 Nov 2019 22:23:50 +0100

On 05/11/2019 at 22:01, maya%NetBSD.org@localhost wrote:

On Tue, Nov 05, 2019 at 08:19:18PM +0000, Maxime Villard wrote:

Module Name:	src
Committed By:	maxv
Date:		Tue Nov  5 20:19:18 UTC 2019

Modified Files:
	src/share/mk: bsd.sys.mk
	src/sys/arch/amd64/amd64: machdep.c mptramp.S
	src/sys/arch/amd64/conf: GENERIC Makefile.amd64
	src/sys/arch/x86/x86: cpu.c
	src/sys/conf: files
	src/sys/kern: files.kern
	src/sys/lib/libkern: libkern.h
	src/sys/sys: atomic.h bus_proto.h cdefs.h systm.h
Added Files:
	src/sys/arch/amd64/include: csan.h
	src/sys/kern: subr_csan.c
	src/sys/sys: csan.h

Log Message:
Add Kernel Concurrency Sanitizer (kCSan) support. This sanitizer allows us
to detect race conditions at runtime. It is a variation of TSan that is
easy to implement and more suited to kernel internals, albeit theoretically
less precise than TSan's happens-before.

We do basically two things:

  - On every KCSAN_NACCESSES (=2000) memory accesses, we create a cell
    describing the access, and delay the calling CPU (10ms).

  - On all memory accesses, we verify if the memory we're reading/writing
    is referenced in a cell already.

The combination of the two means that, if for example cpu0 does a read that
is selected and cpu1 does a write at the same address, kCSan will fire,
because cpu1's write collides with cpu0's read cell.
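
As a rough illustration of the first step, the sampling decision can be
pictured as a per-CPU counter. This is only a sketch under assumptions:
`struct cpu_state`, `should_sample()` and `NACCESSES` are hypothetical
names, and the real kCSan code may select accesses differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NACCESSES 2000	/* stand-in for KCSAN_NACCESSES (assumption) */

/* Hypothetical per-CPU state: accesses seen since the last sample. */
struct cpu_state {
	unsigned int count;
};

/*
 * Return true once every NACCESSES calls. The caller would then
 * record a cell for the access and delay itself.
 */
static bool
should_sample(struct cpu_state *cpu)
{
	assert(cpu != NULL);
	if (++cpu->count < NACCESSES)
		return false;
	cpu->count = 0;
	return true;
}
```

Every access that is not selected still goes through the second step,
the check against the cells of all CPUs.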

The coverage of the instrumentation is the same as that of kASan. Also, the
code is organized in a way similar to kASan, so it is easy to add support
for more architectures than amd64. kCSan is compatible with KCOV.

Reviewed by Kamil.


I don't understand how you can distinguish a race from this condition:

	CPU0			CPU1


				mutex_enter
				write (recorded to cell)
				mutex_exit

	read (checked against record)

Which is legitimate.


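When an access is recorded, its cell stays live for 10ms while the
recording CPU is delayed, and is cleared as soon as that CPU resumes
execution. I.e., the "longevity" of the write above is extended, and any
access to the same memory by another CPU during that window is detected.
In your example the cell is already gone by the time CPU1 reaches
mutex_exit, so CPU0's later read has nothing to collide with.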

This is a watchpoint-based algorithm. It differs from the happens-before
algorithm, which is more powerful but also significantly more complicated
to implement in kernels (a lot of implicit synchronization, etc.).
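
To make the lifetime of a watchpoint concrete, here is a small
single-threaded simulation. It is a sketch, not the kernel code: `struct
cell`, `watch_begin()`, `watch_end()` and `access_is_racy()` are
hypothetical names, the 10ms delay is represented only by the gap between
the two calls, and atomic accesses are ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NCPU 2

/* Hypothetical cell: at most one watched access per CPU. */
struct cell {
	uintptr_t addr;
	size_t size;
	bool write;
};

static struct cell cells[NCPU];

/* Start of the delay: publish the sampled access of 'cpu'. */
static void
watch_begin(int cpu, uintptr_t addr, size_t size, bool write)
{
	assert(cpu >= 0 && cpu < NCPU && size > 0);
	cells[cpu] = (struct cell){ .addr = addr, .size = size, .write = write };
}

/* End of the delay: the cell is zeroed before the CPU moves on. */
static void
watch_end(int cpu)
{
	assert(cpu >= 0 && cpu < NCPU);
	memset(&cells[cpu], 0, sizeof(cells[cpu]));
}

/* Would this access collide with a cell that is live right now? */
static bool
access_is_racy(uintptr_t addr, size_t size, bool write)
{
	for (int i = 0; i < NCPU; i++) {
		const struct cell *c = &cells[i];

		if (c->addr + c->size <= addr || addr + size <= c->addr)
			continue;	/* ranges do not overlap */
		if (!c->write && !write)
			continue;	/* two reads never race */
		return true;
	}
	return false;
}
```

With CPU1's write watched via `watch_begin(1, 0x1000, 8, true)`, a read
of `0x1004` during the window is flagged, while the same read after
`watch_end(1)` is not. The second case is the mutex example above.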

+       for (i = 0; i < ncpu; i++) {
+               __builtin_memcpy(&old, &kcsan_cpus[i].cell, sizeof(old));
+
+               if (old.addr + old.size <= new.addr)
+                       continue;
+               if (new.addr + new.size <= old.addr)
+                       continue;
+               if (__predict_true(!old.write && !new.write))
+                       continue;
+               if (__predict_true(kcsan_access_is_atomic(&new, &old)))
+                       continue;
+
+               kcsan_report(&new, cpu_number(), &old, i);
+               break;
+       }

It looks like you are checking the current CPU too?


Yes, but that doesn't matter, since on the current CPU the cell is
cleared.
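
A short sketch of why a cleared cell is harmless in that loop, assuming
that "cleared" means the cell is zeroed (addr 0, size 0). `struct span`
and `spans_overlap()` are hypothetical names that only mirror the two
range checks from the quoted diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced to the two fields used by the range checks. */
struct span {
	uintptr_t addr;
	size_t size;
};

/* The two range checks of the quoted loop, folded into one predicate. */
static bool
spans_overlap(const struct span *old, const struct span *cur)
{
	assert(old != NULL && cur != NULL);
	if (old->addr + old->size <= cur->addr)
		return false;
	if (cur->addr + cur->size <= old->addr)
		return false;
	return true;
}
```

For a zeroed cell, `old->addr + old->size` is 0, which is `<=` any
address, so the first check always skips it and no report can be
generated against the current CPU's own cell.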
