tech-kern archive


Re: 8.0 performance issue when running build.sh?



So after more investigation, here is an update:

The effect depends heavily on "too high" -j arguments to build.sh. As a
stopgap hack I reduced them on the build cluster, and now the netbsd-7
builds are slow, but not super slow (and all other builds are slightly
slower).

I did a lot of test builds of netbsd-7 on a netbsd-8 build cluster slave,
and the ideal -j value is 11 or 12 (only!) for a single build. Builds
of other branches apparently do better with higher -j values, but I have
not done extensive testing there (yet).

With a bit of help (thanks to everyone involved!) I got lockstat output,
and the single *really* hot lock on 8.0 is kernel_lock, while everything
else looks quite normal. So for kernel_lock we traced the callers, and
got this:

On a build cluster slave running 7.1_STABLE, building netbsd-7 with
"build.sh -m alpha -j 24" and source, obj, dest, release and tooldirs
all on tmpfs, tools already built, we get:


Elapsed time: 1475.74 seconds.

-- Kernel lock spin

Total%  Count   Time/ms          Lock                       Caller
------ ------- --------- ---------------------- ------------------------------
100.00  159874    366.69 kernel_lock            <all>
 41.76   65404    153.12 kernel_lock            VOP_POLL+43
 19.29   36948     70.74 kernel_lock            VOP_LOCK+43
 13.25   20066     48.59 kernel_lock            tcp_send_wrapper+22
  7.45    8042     27.30 kernel_lock            intr_biglock_wrapper+12
  4.94    6220     18.11 kernel_lock            sleepq_block+1c7
  4.88    5654     17.91 kernel_lock            fileassoc_file_delete+1c
  3.49    7681     12.80 kernel_lock            VOP_UNLOCK+40
  2.59    2558      9.50 kernel_lock            bdev_strategy+52
  1.05    5626      3.85 kernel_lock            VOP_WRITE+48
  0.38     904      1.38 kernel_lock            VOP_READ+48
  0.29     254      1.05 kernel_lock            biodone2+6f
  0.23     103      0.84 kernel_lock            frag6_fasttimo+1a
  0.15     217      0.57 kernel_lock            tcp_input+e27
  0.15     102      0.54 kernel_lock            tcp_input+1366
  0.02      17      0.08 kernel_lock            callout_softclock+38d
  0.02       6      0.06 kernel_lock            ip6flow_slowtimo+e
  0.02      13      0.06 kernel_lock            ip_slowtimo+1a
  0.02      15      0.06 kernel_lock            ip6flow_slowtimo_work+1f
  0.02      17      0.06 kernel_lock            frag6_slowtimo+1b
  0.01      16      0.04 kernel_lock            ipflow_slowtimo_work+1f
  0.01      10      0.02 kernel_lock            ipflow_slowtimo+e
  0.00       1      0.01 kernel_lock            udp_send_wrapper+2b


Same setup, on a build cluster slave running 8.0_RC2 we get:

Elapsed time: 3857.33 seconds.

-- Kernel lock spin

Total%  Count   Time/ms          Lock                       Caller
------ ------- --------- ---------------------- ------------------------------
100.00 838202502 25353589.88 kernel_lock            <all>
 86.70 662529722 21982378.01 kernel_lock            sleepq_block+1b5
  6.27 62286987 1590912.88 kernel_lock            VOP_LOCK+b2
  4.05 54443880 1027059.31 kernel_lock            VOP_LOOKUP+5e
  2.38 51325126 603535.23 kernel_lock            VOP_UNLOCK+70
  0.42 5623585 105948.06 kernel_lock            VOP_GETATTR+5e
  0.05  741051  13894.97 kernel_lock            VOP_POLL+93
  0.02  290699   5546.12 kernel_lock            tcp_send_wrapper+22
  0.02   91076   5167.14 kernel_lock            intr_biglock_wrapper+16
  0.02  206972   4178.13 kernel_lock            VOP_INACTIVE+5d
  0.01  133766   3472.19 kernel_lock            scsipi_adapter_request+63
  0.01  161192   3109.39 kernel_lock            fileassoc_file_delete+20
  0.01  101838   1833.00 kernel_lock            VOP_CREATE+62
  0.01   83442   1812.25 kernel_lock            VOP_WRITE+61
  0.01   64171   1342.68 kernel_lock            VOP_READ+61
  0.00   28336    846.55 kernel_lock            softint_dispatch+dc
  0.00   37851    772.09 kernel_lock            VOP_REMOVE+61
  0.00   10945    606.49 kernel_lock            ipintr+37
  0.00    7435    353.17 kernel_lock            biodone2+6d
  0.00   17025    284.61 kernel_lock            VOP_ACCESS+5d
  0.00    6108    172.84 kernel_lock            frag6_fasttimo+1a
  0.00    1200    103.23 kernel_lock            callout_softclock+2b1
  0.00    2789     75.00 kernel_lock            nd6_timer_work+49
  0.00    2426     70.32 kernel_lock            ip_slowtimo+1a
  0.00    2384     57.25 kernel_lock            frag6_slowtimo+1f
  0.00     818     16.57 kernel_lock            VOP_ABORTOP+94
  0.00     567     14.15 kernel_lock            VFS_SYNC+65
  0.00     352     10.43 kernel_lock            kevent1+692
  0.00     428      8.72 kernel_lock            VOP_LINK+61
  0.00     129      4.86 kernel_lock            udp_send_wrapper+2b
  0.00      58      1.73 kernel_lock            kqueue_register+419
  0.00      56      1.46 kernel_lock            knote_detach+135
  0.00       5      0.27 kernel_lock            arpintr+21
  0.00      12      0.25 kernel_lock            kqueue_register+25b
  0.00       8      0.21 kernel_lock            cdev_ioctl+8c
  0.00      23      0.14 kernel_lock            VOP_RMDIR+61
  0.00       2      0.08 kernel_lock            tcp_attach_wrapper+18
  0.00       2      0.06 kernel_lock            tcp_detach_wrapper+16
  0.00      26      0.02 kernel_lock            VOP_MKDIR+65
  0.00      10      0.01 kernel_lock            VOP_READDIR+6d


This is sleepq_block re-acquiring the kernel_lock when the caller held
it before going to sleep. I see no easy way to get a list of the callers
that do that, nor the time spent sleeping (we may be on a different CPU
when waking up, right?) - so any ideas on how to diagnose this further
are welcome.
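
For illustration, here is a minimal userland sketch of that pattern using
pthreads. It is a mock under stated assumptions, not the real kernel code:
mock_sleepq_block(), sq_mtx, sq_cv and sleeper() are invented names, and
the actual path through sleepq_enqueue()/mi_switch() differs in detail.
The point is that an LWP holding kernel_lock when it blocks drops the lock
before sleeping and must take it again on wakeup, so every such wakeup
contends on the one global lock:

/*
 * Userland sketch only - illustrates the drop/sleep/re-acquire pattern,
 * not NetBSD's actual sleepq implementation.
 */
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t kernel_lock = PTHREAD_MUTEX_INITIALIZER; /* the big lock */
static pthread_mutex_t sq_mtx = PTHREAD_MUTEX_INITIALIZER;      /* sleep queue interlock */
static pthread_cond_t  sq_cv = PTHREAD_COND_INITIALIZER;
static int wakeup_pending;

/* Mock of sleepq_block(): biglocks > 0 means the caller holds kernel_lock. */
static void
mock_sleepq_block(int biglocks)
{
	/* The big lock is dropped before the thread goes to sleep. */
	if (biglocks)
		pthread_mutex_unlock(&kernel_lock);

	pthread_mutex_lock(&sq_mtx);
	while (!wakeup_pending)
		pthread_cond_wait(&sq_cv, &sq_mtx);
	wakeup_pending = 0;
	pthread_mutex_unlock(&sq_mtx);

	/*
	 * On wakeup (possibly on a different CPU) the big lock is taken
	 * again before returning to the caller.  With many LWPs waking
	 * at once they all pile up here on the single global lock - the
	 * analogue of the sleepq_block+1b5 line above.
	 */
	if (biglocks)
		pthread_mutex_lock(&kernel_lock);
}

static void *
sleeper(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&kernel_lock);	/* enter holding the big lock */
	mock_sleepq_block(1);			/* sleep; drop and re-take it */
	printf("woke up, big lock held again\n");
	pthread_mutex_unlock(&kernel_lock);
	return NULL;
}

int
main(void)
{
	pthread_t t;

	pthread_create(&t, NULL, sleeper, NULL);
	sleep(1);				/* let the sleeper block */

	pthread_mutex_lock(&sq_mtx);		/* mock of a wakeup */
	wakeup_pending = 1;
	pthread_cond_signal(&sq_cv);
	pthread_mutex_unlock(&sq_mtx);

	pthread_join(t, NULL);
	return 0;
}

Note that the contention shows up at the re-acquire inside sleepq_block,
not at the place that originally took the lock, which is presumably why
the interesting callers do not appear in the lockstat output above.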

Martin

