Subject: Re: 1sec+ delays using msync(2) with flags MS_ASYNC | MS_INVALIDATE
To: None <current-users@netbsd.org>
From: Christos Zoulas <christos@astron.com>
List: current-users
Date: 04/18/2007 18:29:47
In article <20070418162747.GA8606@monolith.usask.ca>,
Brian de Alwis  <bsd@cs.ubc.ca> wrote:
>Summary: a call to msync(2) with flags MS_ASYNC | MS_INVALIDATE
>appears to be done as a synchronous call, and can take >1sec.
>Should it? How can I avoid it?
>
>I'm packaging up crm114 for pkgsrc: crm114 is a powerful text
>classifier that does particularly well for spam filtering, amongst
>other uses.  (It's in wip/crm114; it's marked as broken for now as
>some of its more esoteric classifiers currently fail the tests,
>though it does compile and work with the usual classifiers.)
>
>I'm trying to figure out what causes a long sustained disk write of
>a second or more on each spam classification.  ktrace -R reports
>that almost a second is spent in a call to __msync13():
>
>    0.9941388845 CALL  __msync13(0xbb7df000,0x2dc714,3)
>
>corresponding to a call in the source:
>
>    msync (map->addr, map->actual_len, MS_ASYNC | MS_INVALIDATE);
>
>crm114 does most of its file manipulation by mmapping the files to
>memory and uses the contents as a hash table.  In my case there
>are two spam classification mmaps (sparse files) which are 3,000,084
>bytes each.  So although MS_ASYNC is specified, it appears the sync
>is actually treated as synchronous.  Should it be?  How can it be
>avoided? (I'm not sure why the author uses MS_INVALIDATE.)
>
>Compounding this problem is that the spam processing does this for
>each message processed.
>

Please file a PR.
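
A small test program along these lines might be worth attaching to the
PR.  It is only a sketch, not code taken from crm114: time_msync() is a
hypothetical helper, the scratch file under /tmp is an assumption, and
the 3,000,084-byte length is simply the mapping size quoted above.  It
maps a file MAP_SHARED, dirties every page, and measures the wall-clock
time spent inside the msync() call itself.

```c
#define _XOPEN_SOURCE 700

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of crm114): create a scratch file of
 * `len` bytes, map it shared, dirty every page, and return the number
 * of wall-clock seconds spent inside msync(addr, len, flags).
 * Returns -1.0 if any step fails.
 */
double time_msync(size_t len, int flags)
{
    char path[] = "/tmp/msync-test.XXXXXX";   /* assumed scratch location */
    int fd = mkstemp(path);
    if (fd == -1)
        return -1.0;
    unlink(path);                /* file goes away once fd is closed */

    if (ftruncate(fd, (off_t)len) == -1) {
        close(fd);
        return -1.0;
    }

    char *addr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (addr == MAP_FAILED) {
        close(fd);
        return -1.0;
    }

    memset(addr, 0xA5, len);     /* make every page of the mapping dirty */

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    int rc = msync(addr, len, flags);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    munmap(addr, len);
    close(fd);

    if (rc == -1)
        return -1.0;
    return (double)(t1.tv_sec - t0.tv_sec)
         + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
}

#ifdef MSYNC_DEMO_MAIN
int main(void)
{
    /* Same flags and mapping size as in the report above. */
    double secs = time_msync(3000084, MS_ASYNC | MS_INVALIDATE);
    if (secs < 0.0) {
        fprintf(stderr, "time_msync failed\n");
        return 1;
    }
    printf("msync(MS_ASYNC | MS_INVALIDATE) took %.6f s\n", secs);
    return 0;
}
#endif
```

Building with -DMSYNC_DEMO_MAIN gives a standalone program.  Comparing
its result against runs with MS_ASYNC alone and with MS_SYNC should
show whether MS_INVALIDATE is what makes the call block.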

christos