tech-userlevel archive


Re: NetBSD truss(1), coredumper(1) and performance bottlenecks



    Date:        Sat, 25 May 2019 08:47:46 +0200
    From:        Kamil Rytarowski <n54%gmx.com@localhost>
    Message-ID:  <9c91afa3-8fc5-1669-8b6b-035574137dcd%gmx.com@localhost>


  | >   | As far as I'm aware we can use read(2) and write(2) in pipes with
  | >   | longer transfers than 1 byte.
  | >
  | > Of course.  But once read we cannot go back (which can be done reading
  | > a file).
  |
  | That was just a starting inspiration to look around.

That's fine, but my comment (and Michael's similar one) were specifically
in response to your comment quoted above (the most indented one).  Not
about any of the rest of what you are doing.
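
The point in the quoted exchange (multi-byte reads work on pipes, but
what has been read cannot be pushed back) can be sketched as follows.
This is only an illustration in Python, not anything taken from sh or
build.sh; the names can_seek_back and demo are made up for the example,
and it assumes a POSIX system where lseek(2) on a pipe fails with ESPIPE.

```python
import errno
import os
import tempfile


def can_seek_back(fd):
    """Return True if we can step back one byte on fd after a read."""
    try:
        os.lseek(fd, -1, os.SEEK_CUR)
        return True
    except OSError as exc:
        if exc.errno == errno.ESPIPE:  # illegal seek: pipes, FIFOs, sockets
            return False
        raise


def demo():
    # Regular file: read a block, then rewind over whatever was over-read.
    with tempfile.TemporaryFile() as f:
        f.write(b"line one\nline two\n")
        f.flush()
        fd = f.fileno()
        os.lseek(fd, 0, os.SEEK_SET)
        os.read(fd, 16)
        file_ok = can_seek_back(fd)

    # Pipe: the same 16-byte read succeeds, but the data cannot be "unread".
    r, w = os.pipe()
    try:
        os.write(w, b"line one\nline two\n")
        chunk = os.read(r, 16)
        pipe_ok = can_seek_back(r)
    finally:
        os.close(r)
        os.close(w)
    return file_ok, len(chunk), pipe_ok
```

Calling demo() returns a tuple of (seek worked on the file, bytes read
from the pipe, seek worked on the pipe).  A reader that must leave the
rest of the input untouched for the next consumer can therefore read
ahead on a file and seek back, but has no such option on a pipe.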

  | In my test I only
  | used it against plain 'build.sh' without any arguments, so actually
  | doing nothing interesting except printing usage text.

That's interesting.  Used like that, build.sh does almost nothing
at all, and in particular, doesn't do any read commands (it doesn't even
reach the part that was discussed the other day).

I suspect that what you were measuring, more than anything, was your
$ENV script - does that have "read" commands in it?

  | I've picked build.sh as this is the only (?) script that is heavy in the
  | native basesystem and could be useful to be optimized.

Again, I doubt that anyone would gain much of anything by optimising that
script.   Optimising what happens when you "cd /usr/src; make ..."
(with all the right options to make, and the environment set correctly)
would make lots of sense.

But:

  | I gave a tool that could be of some use here, feel free to experiment
  | with it.

That's great, and for testing the tool, you can use anything you like
as an object - and as far as it benefits the tool itself, that's all
good.  Just don't expect the results from that kind of testing to be
useful for much other than tool improvement - to actually use the tool
to help improve the system it has to run against something that actually
could do with improvement.

  | Personally I wouldn't bother with this. A simple profiler for such calls
  | if we really need them can be done with LD_PRELOAD and overwriting
  | certain libc calls. Certainly there are other options with
  | instrumentation (dynamorio? valgrind?).

I have no idea what you are thinking of there, but I don't think it is
related to what I suggested, which certainly could not be implemented
anything like that.  My suggestion was not really all that serious: I
doubt there'd be a lot of benefit, or many applications that would be
rewritten to make use of it (which they'd need to be, as it would need a
whole new API).  Note that what I suggested had nothing to do with
measuring anything.  It was an (unlikely) method to allow applications
which need it to perform better by reducing the number of system calls
they make, by letting them combine a sequence of system calls into one
operation (one might imagine setting up a sequence of open, write, close
and doing all that in one step).  Kind of similar to the way posix_spawn()
allows fork/cd/open/dup/close/exec to all be rolled up into a single
syscall, but more general (if you can imagine that).
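
For readers unfamiliar with the posix_spawn() comparison, here is a
minimal sketch of the "describe the steps up front, then do them in one
call" style, using Python's os.posix_spawn wrapper (Python 3.8 or later).
The helper names spawn_with_redirect and demo are hypothetical, and
whether the underlying posix_spawn() is really one system call depends on
the operating system.  It illustrates only the existing posix_spawn
interface, not the more general mechanism suggested above, which does not
exist.

```python
import os
import sys
import tempfile


def spawn_with_redirect(argv, out_path):
    """Run argv with stdout redirected to out_path via one posix_spawn call."""
    file_actions = [
        # Performed in the child before exec: open out_path as descriptor 1.
        (os.POSIX_SPAWN_OPEN, 1, out_path,
         os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600),
    ]
    pid = os.posix_spawn(argv[0], argv, os.environ, file_actions=file_actions)
    _, status = os.waitpid(pid, 0)
    # Report the exit code, or -1 if the child did not exit normally.
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1


def demo():
    with tempfile.TemporaryDirectory() as d:
        out = os.path.join(d, "out.txt")
        # Use the running interpreter as a program known to exist.
        code = spawn_with_redirect(
            [sys.executable, "-c", "print('hello')"], out)
        with open(out) as f:
            return code, f.read()
```

The caller never performs the fork, open, dup or exec steps itself; it
hands over a list of file actions and gets back a process id.  The idea
described above would extend that pattern to arbitrary sequences such as
open, write, close.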

kre


