tech-userlevel archive


Re: NetBSD truss(1), coredumper(1) and performance bottlenecks



On 25.05.2019 04:19, Robert Elz wrote:
>     Date:        Sat, 25 May 2019 02:04:13 +0200
>     From:        Kamil Rytarowski <n54%gmx.com@localhost>
>     Message-ID:  <4fefdf41-44fa-12f9-705d-5187732d7c95%gmx.com@localhost>
> 
> 
>   | As far as I'm aware we can use read(2) and write(2) in pipes with longer
>   | transfers than 1 byte.
> 
> Of course.  But once read we cannot go back (which can be done reading
> a file).

That was just a starting point to look around. In my test I only ran
plain 'build.sh' without any arguments, so it did nothing interesting
beyond printing the usage text.

I picked build.sh because it is the only (?) heavy script in the native
base system, and one that could usefully be optimized.

I have provided a tool that could be of some use here; feel free to
experiment with it. I cannot focus on performance myself in the near
future.

Christos asked me for truss(1) some time ago. So here it is.

> Yes, gettimeofday() is very common - but we need to investigate how
> to speed it up, not just presume that a mapped page is the right answer.

One of the arguments for VDSO is that it's implemented in all other
major OSs (at least the UNIX-like ones): Linux, macOS, FreeBSD. Today
developers treat gettimeofday() as cheap enough to call in polling
loops, and there is not much we can do about that.

Running ./build.sh for release should be a good benchmark to check
whether this helps or not.

>   | At some point of time Joyent optimized bulk builds of pkgsrc from 2 days
>   | to 3 h. There are certainly low-hanging fruits in build.sh as well.
> 
> I am sure there are, but I very much doubt that build.sh is really something
> itself that ought to be a target of investigation.   All it is is a wrapper
> around make.

truss -f follows forks, so it tracks all child programs, including
make.

./build.sh is similar to pkgsrc (bulk) - a combination of make(1),
sh(1), awk(1), grep(1) and similar in a single chain.

I bet there is low-hanging fruit for single-threaded builds and room
for making it more parallel.

>   All the real work is done in make, and all that it calls.
> Speeding up build.sh itself is very unlikely to change anything, unless we
> can find entire runs of make that we can optimise away.
> 
>   | I'm not sure that this would be a real concern here to skip gettimeofday
>   | calls in strace-like programs.
> 
> One potential solution might be to find a way to make combined syscalls,
> where one user/kernel boundary crossing performs multiple syscalls.
> 

Personally I wouldn't bother with this. If we really need a simple
profiler for such calls, it can be done with LD_PRELOAD by overriding
certain libc calls. There are certainly other options based on
instrumentation (DynamoRIO? Valgrind?).
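
To illustrate the LD_PRELOAD idea, here is a minimal sketch that
overrides read(2) and counts calls and bytes per process, which would
show the 1-byte pipe reads mentioned earlier in the thread. The file
name, library name and build command in the comment are assumptions, not
an existing tool, and the counters are not thread-safe.

```c
/*
 * readprof.c (hypothetical name): count read(2) calls made through libc.
 * Assumed build:  cc -shared -fPIC -o libreadprof.so readprof.c
 *                 (add -ldl where dlsym() lives in libdl)
 * Assumed use:    LD_PRELOAD=./libreadprof.so sh ./build.sh
 */
#define _GNU_SOURCE             /* glibc exposes RTLD_NEXT only with this */
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

static ssize_t (*real_read)(int, void *, size_t);
static unsigned long read_calls;        /* number of read() calls seen */
static unsigned long long read_bytes;   /* bytes successfully read */

/* Print the per-process totals when the process exits normally. */
static void report(void)
{
    fprintf(stderr, "[readprof] pid %ld: %lu read() calls, %llu bytes\n",
        (long)getpid(), read_calls, read_bytes);
}

/* Our definition shadows libc's read(); forward to the next one found. */
ssize_t read(int fd, void *buf, size_t len)
{
    if (real_read == NULL) {
        real_read = (ssize_t (*)(int, void *, size_t))
            dlsym(RTLD_NEXT, "read");
        if (real_read == NULL)
            abort();
        atexit(report);
    }

    ssize_t n = real_read(fd, buf, len);
    read_calls++;
    if (n > 0)
        read_bytes += (unsigned long long)n;
    return n;
}
```

This only sees dynamically linked programs that call the read symbol,
and a process that calls exec or _exit() never runs the atexit handler,
so its counts are lost. A real tool would need to handle that, but a
high call count with a low byte total already points at byte-at-a-time
readers.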



