tech-kern archive


Possible buffer cache race?



Hi,

I'm testing nvme(4) with quite deep I/O queues, using a -current
kernel on an MP system. At the moment I'm basically just confirming
functionality, so the workload is just a bunch of parallel tar
processes extracting files within a filesystem into a subdirectory.

I have the filesystem mounted async, without logging at the moment,
and the machine has a huge amount of RAM. So it's mostly a buffer
cache exercise, with I/O spikes on sync.

I see an interesting thing: periodically, all of the tar processes
block, sleeping on either tstile, biolock or pager_map. If I just
wait, they stay blocked. When I call sync(8), all of them unblock and
continue running, until they all hit the same condition again later.
If I keep calling sync, eventually all processes finish.

Most often they block on biolock, somewhat less frequently on tstile;
pager_map is rarer. It's usually a mix of these: most processes block
on biolock, some on tstile, and zero, one or two on pager_map.

I understand "biolock" is used to wait for a busy buf. I looked into
the code and found nothing obviously wrong there. The code uses
cv_broadcast(&bp->b_busy) and cv_timedwait(), so there shouldn't be a
race.
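
For reference, the wait/wake pattern described above can be sketched
in user space. This is only an analogue built on POSIX threads, not
the kernel code: struct fake_buf, fake_buf_acquire(),
fake_buf_release() and fake_buf_demo() are hypothetical names, and
pthread_cond_timedwait()/pthread_cond_broadcast() stand in for
cv_timedwait()/cv_broadcast(). It assumes the busy flag is only ever
tested and changed with the lock held, which is the property that
should rule out a lost wakeup.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-in for a buffer with a busy flag (not struct buf). */
struct fake_buf {
	pthread_mutex_t lock;     /* protects busy */
	pthread_cond_t  busy_cv;  /* plays the role of bp->b_busy */
	bool            busy;
};

/*
 * Claim the buffer, waiting at most timeout_ms for it to become free.
 * Returns 0 on success or an error number such as ETIMEDOUT.
 */
static int
fake_buf_acquire(struct fake_buf *bp, long timeout_ms)
{
	struct timespec deadline;
	int error = 0;

	/* pthread_cond_timedwait() takes an absolute deadline. */
	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec += timeout_ms / 1000;
	deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec++;
		deadline.tv_nsec -= 1000000000L;
	}

	pthread_mutex_lock(&bp->lock);
	/* Re-check the flag after every wakeup, always under the lock. */
	while (bp->busy && error == 0)
		error = pthread_cond_timedwait(&bp->busy_cv, &bp->lock,
		    &deadline);
	if (!bp->busy) {
		bp->busy = true;
		error = 0;
	}
	pthread_mutex_unlock(&bp->lock);
	return error;
}

/* Release the buffer and wake every waiter. */
static void
fake_buf_release(struct fake_buf *bp)
{
	pthread_mutex_lock(&bp->lock);
	assert(bp->busy);
	bp->busy = false;
	pthread_cond_broadcast(&bp->busy_cv);
	pthread_mutex_unlock(&bp->lock);
}

/* Holder thread: keeps the buffer busy for 50 ms, then releases it. */
static void *
holder_thread(void *arg)
{
	struct timespec pause = { 0, 50 * 1000000L };

	nanosleep(&pause, NULL);
	fake_buf_release(arg);
	return NULL;
}

/* Returns 0 if the waiter obtained the buffer after the release. */
static int
fake_buf_demo(void)
{
	struct fake_buf buf;
	pthread_t holder;
	int error;

	pthread_mutex_init(&buf.lock, NULL);
	pthread_cond_init(&buf.busy_cv, NULL);
	buf.busy = true;	/* the holder thread owns it initially */

	if (pthread_create(&holder, NULL, holder_thread, &buf) != 0)
		return -1;
	error = fake_buf_acquire(&buf, 5000);
	pthread_join(holder, NULL);
	return error;
}
```

Calling fake_buf_demo() from a main() is enough to exercise the
sketch. In this pattern a waiter can only sleep past a release if the
flag is cleared without the broadcast or tested outside the lock;
whether anything like that happens in the kernel path is not
established here.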

Any idea where I should try to start poking?

Jaromir

