tech-kern archive


very strange and severe raidframe performance issue



hi folks.


i have a system where raidframe accesses are severely restricted
and present a 2-20x performance loss.

after various tests i have confirmed the problem is in raidframe
itself, and it appears to be a wakeup() that does not cause the
related ltsleep() to wake up until the next tick.

initially i observed this system building raid1 parity on 2 disks
limited to about 6MiB/sec, on disks capable of doing well over
100MiB/sec transfers.  systat initially pinpointed raidframe as
the problem due to vastly different "busy" values for the raid
volume compared to the underlying component.

when setting up this system originally, i put 3 RAID1 devices on
wd0/wd1 for root, swap and home, and after observing the root
volume init being very slow, i fired off raidctl -iv for the
other 2 raid devices.  normally, i'd have expected an adverse
effect on the total disk io for wd0/wd1, but in fact, i saw the
total IO for these disks rise from ~6MiB/sec to ~17MiB/sec.  ie,
i got almost 3x the performance by having 3 raid devices active.

after a bunch of testing and asking other developers for input
i came to the conclusion that a wakeup() in rf is not actually
waking the sleeper until the next tick occurs.  i have tested
with HZ=100, 512 and 1024 and the total IO/s i see is limited
by the HZ value.  at HZ=1024 the system seems mostly reasonable
but measurements show it still being over 2x slower in some
cases compared to the raw disk.
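
as a rough sanity check of the tick-bound theory, the arithmetic can be
sketched as below.  it assumes each raid device completes at most one IO
per tick and that each IO is 64KiB; both figures are assumptions for
illustration, not measurements from this system, and the function name
is hypothetical.

```c
#include <stdint.h>

/*
 * Upper bound on throughput, in bytes/sec, if a completed I/O is only
 * noticed at the next clock tick.  "outstanding" is the number of I/Os
 * that can complete per tick (assumed here: one per active raid device).
 * All parameters are illustrative assumptions, not values from raidframe.
 */
uint64_t
tick_bound_bytes_per_sec(uint64_t hz, uint64_t io_bytes, uint64_t outstanding)
{
	return hz * io_bytes * outstanding;
}
```

with hz=100 and 64KiB per IO this gives 6553600 bytes/sec (6.25MiB/sec)
for one device and three times that (18.75MiB/sec) for three devices,
which would be in the same ballpark as the ~6MiB/sec and ~17MiB/sec seen
above if those assumptions held.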

i converted the main iodone to use mutex/condvar but it did
not change anything.  i added some instrumentation into most of
the iodone usage.  what i've found confirmed my theory that the
wakeups weren't occurring (now with cv_signal()).  it seems that
while there are a bunch of threads woken up and run to schedule
a raid IO, the main slowness comes from the KernelWakeupFunc(),
which is called from biodone2().
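
for readers unfamiliar with the pattern, the general shape of a
sleep/wakeup to mutex/condvar conversion can be sketched in userland
with POSIX threads.  this is only an analogue under stated assumptions:
it is not the raidframe code or the kernel condvar API, and the struct
and function names are hypothetical.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical wait object: a flag protected by a mutex, plus a condvar. */
struct io_wait {
	pthread_mutex_t	lock;
	pthread_cond_t	cv;
	bool		done;
};

void
io_wait_init(struct io_wait *w)
{
	pthread_mutex_init(&w->lock, NULL);
	pthread_cond_init(&w->cv, NULL);
	w->done = false;
}

/* Completion side: set the flag and signal while holding the lock. */
void
io_wait_signal(struct io_wait *w)
{
	pthread_mutex_lock(&w->lock);
	w->done = true;
	pthread_cond_signal(&w->cv);
	pthread_mutex_unlock(&w->lock);
}

/* Waiter side: re-check the flag in a loop to cope with spurious wakeups. */
void
io_wait_block(struct io_wait *w)
{
	pthread_mutex_lock(&w->lock);
	while (!w->done)
		pthread_cond_wait(&w->cv, &w->lock);
	pthread_mutex_unlock(&w->lock);
}
```

the point of the pattern is that the flag is only read and written
under the lock, so a signal sent before the waiter blocks is not lost.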

the other very interesting thing is that the problem entirely
goes away if i "boot -1" -- ie, avoid SMP.

anyone have any idea what is going on here?


thanks.


.mrg.

ps. i converted one more wakeup/tsleep to mutex/cv, but there's
still a fair bit to go:

        http://www.netbsd.org/~mrg/rf_node_mutex.diff

if anyone would like to give this a review.

