David Holland <dholland-tech%netbsd.org@localhost> writes:

> > > everything that process wrote is on disk,
> >
> > That is probably unattainable, since I've seen it plausibly asserted
> > that some disks lie, reporting that writes are on the media when this
> > is not actually true.
>
> Indeed. What I meant to say is that everything has been sent to disk,
> as opposed to being accidentally skipped in the cache because the
> buffer was busy, which will currently happen on some of the fsync
> paths.
>
> That's why flushing the disk-level caches was a separate point.

(ignoring errors as I have no objection to what you proposed and
clarified with mouse@)

Maybe I'm way off in space, but I'd like to see us be careful about
the difference between:

  1) operating system has a successful return from a write transaction
     to a disk controller (perhaps via a controller that has a
     write-back cache)

  2) operating system has been told by the controller that the write
     has actually completed to stable storage (guaranteed even if OS
     crashes or power fails, so actually written or perhaps in
     battery-backed cache)

  A) for stacked filesystems like raid, cgd, and for things like NFS,
     there's basically an e2e ack of the above condition.

POSIX is of course weaselly about this. But it seems obvious that if
you call fsync, you want the property that if there is a crash or power
failure (but not a disk media failure :-) that your bits are there,
which is case 2. Case 1 is only useful in that files could remain in
OS cache for a long time, and there is a pretty good but not guaranteed
notion that once in device writeback cache they will get to the actual
media in not that long. The old "sync;sync;sync;sleep 10" thing from
before there was shutdown(8)...

I thought NCQ was supposed to give acks for actual writing, but allow
them to be perhaps ordered and multiple in flight, so that one could
use that instead of the big-hammer inscrutable writeback cache.
If the controller doesn't support NCQ, then it seems one has to issue a
cache flush, which presumably is defined to get all data in cache as of
the flush onto disk before reporting that it's done.

Is that what you're thinking, or do you think this is all about case 1?
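To make the case 1 / case 2 distinction concrete from the application
side, here is a minimal sketch, not a proposal for the kernel change
itself. The helper name write_durably() is hypothetical. It assumes that
on NetBSD, fsync_range(2) with FDISKSYNC is the way to also request a
flush of the disk's own write cache (roughly asking for case 2), and
that elsewhere plain fsync(2) is all that is available, with whatever
guarantee the system happens to give. It does not sync the containing
directory, which a newly created file would also need.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write len bytes to path and do not report
 * success until the sync call has returned without error.
 * Returns 0 on success, -1 on failure with errno set.
 */
int
write_durably(const char *path, const void *buf, size_t len)
{
	const char *p = buf;
	int fd, rv, saved;

	assert(path != NULL && buf != NULL);

	fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (fd == -1)
		return -1;

	/* write(2) may be short or interrupted; loop until all is sent. */
	while (len > 0) {
		ssize_t n = write(fd, p, len);
		if (n == -1) {
			if (errno == EINTR)
				continue;
			saved = errno;
			(void)close(fd);
			errno = saved;
			return -1;
		}
		p += n;
		len -= (size_t)n;
	}

#ifdef FDISKSYNC
	/* NetBSD: sync the whole file and ask for a disk cache flush too. */
	rv = fsync_range(fd, FFILESYNC | FDISKSYNC, 0, 0);
#else
	/* Portable fallback: what this guarantees is system-dependent. */
	rv = fsync(fd);
#endif
	if (rv == -1) {
		/* Do not ignore this: the data may not be on stable storage. */
		saved = errno;
		(void)close(fd);
		errno = saved;
		return -1;
	}

	return close(fd);
}
```

A caller would treat a -1 return as "the data may not survive a crash"
and retry or report the failure, rather than assuming the write landed.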