tech-kern archive


page busy vs. glock



Simon reported a problem where he's writing to a file (log of build -j4)
and repeatedly doing tail -5000.  He says that this causes hangs sooner
or later.

Upon examination, it seems that writing to the log goes through
ufs_balloc_range(), which busies the pages covering the range it is
extending the file into and then takes the genfs node lock to do the
actual block allocation.

Meanwhile, tail comes in through mmap + genfs_getpages and, while holding
the genfs lock, calls uvn_findpages, which waits for the busy page.
  ==> deadlock between PG_BUSY and glock
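
To make the cycle concrete, here is a rough user-space model. It is not
kernel code: the two pthread mutexes are only stand-ins I am assuming for
PG_BUSY on the page and for the genfs node lock, and
wait_for_cycle_exists() is a made-up name. Both "paths" are played by one
thread, using trylock so the model reports the cycle instead of hanging.

```c
#include <errno.h>
#include <pthread.h>

/* Stand-ins (assumptions): page_busy models PG_BUSY, glock the genfs lock. */
static pthread_mutex_t page_busy = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t glock = PTHREAD_MUTEX_INITIALIZER;

/* Returns 1 if each path holds one resource and finds the other one taken. */
static int
wait_for_cycle_exists(void)
{
	int writer_blocked, reader_blocked;

	pthread_mutex_lock(&page_busy);	/* writer: page has been busied */
	pthread_mutex_lock(&glock);	/* reader: getpages path holds glock */

	/* Writer now wants glock to do the block allocation. */
	writer_blocked = (pthread_mutex_trylock(&glock) == EBUSY);
	/* Reader now wants the busy page. */
	reader_blocked = (pthread_mutex_trylock(&page_busy) == EBUSY);

	pthread_mutex_unlock(&glock);
	pthread_mutex_unlock(&page_busy);
	return writer_blocked && reader_blocked;
}
```

With blocking acquires in two real threads instead of trylock, neither
side could ever make progress.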

This can happen because the file is already large enough that the "new"
page ufs_balloc_range() is trying to enlarge into contains partially
valid data, i.e. the old file size is not a multiple of the page size.
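
The condition is just a remainder check. A small sketch, with a
hypothetical helper name and page size passed in as a parameter (nothing
here is taken from the UFS code):

```c
#include <stddef.h>

/*
 * Bytes of existing file data in the page that contains the old EOF.
 * 0 means the old size is page-aligned, so the first page being added
 * holds no valid data yet.
 */
static size_t
valid_bytes_in_last_page(unsigned long long old_size, size_t page_size)
{
	return (size_t)(old_size % page_size);
}
```

For example, with a 4096-byte page and an old size of 10000 bytes, the
page containing EOF already holds 10000 - 2 * 4096 = 1808 valid bytes, and
that is the page a concurrent reader may legitimately want.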

The two ideas I've had to solve this are:
  1) break hold-and-wait
  2) do not busy already partially valid pages in ufs_balloc_range().
     Instead, mark them PG_HOLY (PG_PAGER1) or something like that to
     prevent flush and re-read while we are allocating.
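
For idea 1, the usual shape is "try for the second resource, and if that
fails, give up the first and start over". A minimal user-space sketch,
again with pthread mutexes as assumed stand-ins and a made-up function
name; which of the two paths would back off in the real code is left
open:

```c
#include <errno.h>
#include <pthread.h>

/*
 * Take 'first', then try 'second' without blocking.
 * Returns 0 with both held, or EBUSY with neither held, so the caller
 * never waits for 'second' while still holding 'first'.
 */
static int
try_acquire_both(pthread_mutex_t *first, pthread_mutex_t *second)
{
	pthread_mutex_lock(first);
	if (pthread_mutex_trylock(second) != 0) {
		/* Back off: drop what we hold and let the caller retry. */
		pthread_mutex_unlock(first);
		return EBUSY;
	}
	return 0;
}
```

The caller would loop on EBUSY, ideally yielding or sleeping between
attempts. The cost is that whatever was set up under the first resource
has to be redone on each retry.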

I don't particularly like either one.  Does anyone have better
suggestions?

