tech-kern archive


Re: Adding new feature - Kcov



On 04/01/2019 at 16:58, Christos Zoulas wrote:
In article <CAB5-aq6p6jAWKnJ=ZsjUwVuYg2Gg8k_jq-Ga9_XD_FOCpuzWuw%mail.gmail.com@localhost>,
Siddharth Muralee  <siddharth.muralee%gmail.com@localhost> wrote:
On Thu, 3 Jan 2019 at 22:39, Christos Zoulas <christos%astron.com@localhost> wrote:

In article
<CAB5-aq6SSL_rwoTHGOXh1hgYgOPp+hj4qyhQMwG=g+LuuZF+WQ%mail.gmail.com@localhost>,
Siddharth Muralee  <siddharth.muralee%gmail.com@localhost> wrote:
On Wed, 2 Jan 2019 at 23:12, Christos Zoulas <christos%astron.com@localhost> wrote:

In article
<CAB5-aq5KCpKOD1T9SFGQm2+NyhG4yFjQRit7F+4jFr=SMuFEmw%mail.gmail.com@localhost>,
Siddharth Muralee  <siddharth.muralee%gmail.com@localhost> wrote:

On Wed, 2 Jan 2019 at 09:26, Siddharth Muralee
<siddharth.muralee%gmail.com@localhost>
wrote:

Hello, I have attached a modified patch for kcov(4). I have modified
it to be a per-process lookup instead of the earlier per-unit lookup.

It seems to be working fine. I have tested it with a couple of system
calls. There is, however, a lot of unnecessary output even for a
simple system call. I have attached below the output of kcov(4) for
the system call `read(-1, NULL, 0)`. I would also like some input on
how to reduce the noise, if possible.

Seems like the attachments got missed somewhere along the way (Not sure
what happened!).
kcov diff -
https://github.com/R3x/scratch-files/blob/master/kcov/kcov.diff
output of kcov(4) for read() -
https://github.com/R3x/scratch-files/blob/master/kcov/sample_output.txt

What source version do those line numbers correspond to?

It is from NetBSD-current, probably a week old. To avoid any confusion,
I have produced a new one with the latest version (from GitHub).
latest output -
https://github.com/R3x/scratch-files/blob/master/kcov/latest_output.txt

Thanks! The way other profilers summarize the output is to identify common
code calling sequences, abstract them and refer to them by id. For example:

/src/sys/arch/x86/x86/pmap.c:3069
/src/sys/arch/x86/x86/pmap.c:3080
/repos/obj1/sys/arch/amd64/compile/GENERIC/./x86/pmap.h:411
/src/sys/arch/x86/x86/pmap.c:3028
/src/sys/arch/x86/x86/pmap.c:3029
/src/sys/arch/x86/x86/pmap.c:3031
/src/sys/arch/x86/x86/pmap.c:3028 (discriminator 1)
/src/sys/arch/x86/x86/pmap.c:3029
/src/sys/arch/x86/x86/pmap.c:3031
/src/sys/arch/x86/x86/pmap.c:3028 (discriminator 1)
/src/sys/arch/x86/x86/pmap.c:3029
/src/sys/arch/x86/x86/pmap.c:3031
/src/sys/arch/x86/x86/pmap.c:3034
/src/sys/arch/x86/x86/pmap.c:3035
/src/sys/arch/x86/x86/pmap.c:3037
/src/sys/arch/x86/x86/pmap.c:3088
/src/sys/arch/x86/x86/pmap.c:3092
/src/sys/arch/x86/x86/pmap.c:3097
/src/sys/arch/x86/x86/pmap.c:3100

is repeated; putting function names next to line numbers also helps
to identify stack traces. What tools are other people using to post-process
this output?
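
A minimal sketch of that kind of summarizing, not an existing tool: it
collapses back-to-back repetitions of the same block of addr2line lines
into a (block, count) pair. The function names, the max_block limit and
the choice to ignore the "(discriminator N)" suffix when comparing lines
are assumptions made for illustration.

```python
import re

# addr2line may append " (discriminator N)" to a line; ignore it when comparing.
_DISCRIMINATOR = re.compile(r"\s+\(discriminator \d+\)$")


def normalize(line):
    """Strip whitespace and any trailing discriminator annotation."""
    return _DISCRIMINATOR.sub("", line.strip())


def collapse_repeats(lines, max_block=8):
    """Collapse consecutive repeated blocks into (block, count) pairs.

    At each position, try block sizes 1..max_block and keep the size
    that covers the most lines through immediate repetition.
    """
    lines = [normalize(ln) for ln in lines]
    out = []
    i, n = 0, len(lines)
    while i < n:
        best_len, best_count = 1, 1
        for size in range(1, max_block + 1):
            if i + 2 * size > n:
                break  # not enough lines left for even one repetition
            block = lines[i:i + size]
            count = 1
            while lines[i + count * size:i + (count + 1) * size] == block:
                count += 1
            if count > 1 and count * size > best_len * best_count:
                best_len, best_count = size, count
        out.append((tuple(lines[i:i + best_len]), best_count))
        i += best_len * best_count
    return out
```

Fed the pmap.c listing above, this would report the 3028/3029/3031
block once with a count of 3 instead of printing it three times.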

christos


Kcov only returns the addresses of all the functions that it traced
along the way. I have passed the output to addr2line(1) to get the
above output. I can use its -f option to print the function names
as well, if necessary.
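
A rough sketch of that addr2line(1) step with -f, for illustration only.
With -f, addr2line prints the function name on one line and file:line on
the next, so the output is paired up two lines at a time. The helper
names and the kernel image path "netbsd.gdb" are assumptions, and the
sample names below are placeholders, not real trace data.

```python
import subprocess


def parse_addr2line_f(output):
    """Pair up `addr2line -f` output: a function line, then a file:line line."""
    lines = [ln for ln in output.splitlines() if ln.strip()]
    return [(lines[i], lines[i + 1]) for i in range(0, len(lines) - 1, 2)]


def symbolize(addresses, kernel="netbsd.gdb"):
    """Resolve hex addresses with addr2line(1).

    The default kernel image path is an assumption; pass the debug
    kernel that matches the running one.
    """
    cmd = ["addr2line", "-f", "-e", kernel] + list(addresses)
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_addr2line_f(result.stdout)


# Placeholder output in the two-lines-per-address shape that -f produces.
sample = (
    "example_func_a\n"
    "/src/example/a.c:10\n"
    "example_func_b\n"
    "/src/example/b.c:20 (discriminator 1)\n"
)
pairs = parse_addr2line_f(sample)
```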

What I meant by excess noise is all the pmap and uvm functions that
appear in the output trace. In this case I am trying to predict the
execution path of the read syscall with the arguments (-1, NULL, 0).
The ideal output should contain only the functions that get executed
during the system call (the uvm and pmap functions aren't really
relevant to the system call). The use case we have in mind for Kcov is
coverage-guided fuzzing with syzkaller, initially for the system call
layer. We only need the coverage for the paths that are executed as a
result of the input (the arguments of the syscall in this case).

You can take a look at the output for the same system call in linux -
https://github.com/R3x/scratch-files/blob/master/kcov/linux_output.txt

If you look at my trace with functions here -
https://github.com/R3x/scratch-files/blob/master/kcov/netbsd_output_with_functions.txt
You can see that the topmost functions are similar to what we saw in
Linux but there are a lot of uvm and pmap functions after that.
Especially the trap part.
In this case the pmap functions, mutex functions etc have no relevance
to the syscall input and hence we would like to avoid that part.
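
To get a rough idea of how much of a trace falls into pmap, uvm and
so on, the addr2line lines can be counted per source file. This is only
a sketch; count_by_file is a hypothetical helper, and it assumes each
line has the "path:line" shape shown earlier in the thread.

```python
from collections import Counter
import os


def count_by_file(trace_lines):
    """Count trace entries per source file name (directory part dropped)."""
    counts = Counter()
    for line in trace_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        path = line.split(":", 1)[0]
        counts[os.path.basename(path)] += 1
    return counts


# Three lines taken from the pmap listing quoted earlier in the thread.
trace = [
    "/src/sys/arch/x86/x86/pmap.c:3069",
    "/src/sys/arch/x86/x86/pmap.c:3080",
    "/repos/obj1/sys/arch/amd64/compile/GENERIC/./x86/pmap.h:411",
]
summary = count_by_file(trace)
```

Sorting the result with summary.most_common() shows which files
dominate the trace.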

__sanitizer_cov_trace_pc is the function that the compiler
instrumentation calls to add the trace coverage to the buffer. In this
function I disable tracing during boot and also during interrupts.
I currently check whether we are in an interrupt with
`curcpu()->ci_idepth >= 0`. This check not working properly could be
what is causing the noise.

That is what cpu_intr_p() is doing, so it should work... I would try
to debug why it does not. Perhaps the pmap functions are called from
allocators? Anyway finding a way to create a call graph out of the
profiled samples would be useful.

christos

Interrupt != exception. When a page fault comes in, there's no flag that is
set in proc/lwp/curcpu, so you can't know if you are in an exception context;
ci_idepth is unrelated.

Of course we could add such a flag under #ifdef KCOV and then check for this
flag in __sanitizer_cov_trace_pc.

But before that, it would be good to make sure that the extra output is
indeed noise (and not something the fuzzer expects), because a lot of things
we do in exception context may contain bugs, and we want to fuzz all of that.

Maybe check what Linux does?

