Re: fsync error reporting

To: David Holland <dholland-tech%netbsd.org@localhost>
Subject: Re: fsync error reporting
From: Greg Troxel <gdt%lexort.com@localhost>
Date: Fri, 19 Feb 2021 08:33:03 -0500

David Holland <dholland-tech%netbsd.org@localhost> writes:

>  > > everything that process wrote is on disk,
>  > 
>  > That is probably unattainable, since I've seen it plausibly asserted
>  > that some disks lie, reporting that writes are on the media when this
>  > is not actually true.
>
> Indeed. What I meant to say is that everything has been sent to disk,
> as opposed to being accidentally skipped in the cache because the
> buffer was busy, which will currently happen on some of the fsync
> paths.
>
> That's why flushing the disk-level caches was a separate point.

(ignoring errors as I have no objection to what you proposed and
clarified with mouse@)

Maybe I'm way off in space, but I'd like to see us be careful about

  1) operating system has a successful return from a write transaction to
  a disk controller (perhaps via a controller that has a write-back
  cache)

  2) operating system has been told by the controller that the write has
  actually completed to stable storage (guaranteed even if OS crashes or
  power fails, so actually written or perhaps in battery-backed cache)

  A) for stacked filesystems like raid, cgd, and for things like NFS,
  there's basically an e2e ack of the above condition.
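
To make the distinction concrete, here is a minimal sketch of how an
end-to-end ack might be modelled for a stack of layers.  The enum and
stack_ack() are hypothetical names invented for illustration, not
existing kernel interfaces, and the "weakest layer wins" rule is my
assumption about what an e2e ack should mean.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical ack levels, matching the numbered cases above. */
enum ack_level {
	ACK_NONE = 0,		/* nothing reported yet */
	ACK_ACCEPTED = 1,	/* case 1: controller accepted the write */
	ACK_STABLE = 2		/* case 2: reported to be on stable storage */
};

/*
 * End-to-end ack for a stack (e.g. filesystem over cgd over raid):
 * the stack can claim no more than its weakest layer reports.
 */
enum ack_level
stack_ack(const enum ack_level *layers, size_t nlayers)
{
	enum ack_level weakest = ACK_STABLE;
	size_t i;

	if (nlayers == 0)
		return ACK_NONE;
	assert(layers != NULL);
	for (i = 0; i < nlayers; i++)
		if (layers[i] < weakest)
			weakest = layers[i];
	return weakest;
}
```

Under this model, a single layer that can only offer case 1 drags the
whole stack down to case 1, which is the property I would want the top
layer to be honest about.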

POSIX is of course weasely about this.  But it seems obvious that if you
call fsync, you want the property that if there is a crash or power
failure (but not a disk media failure :-) that your bits are there,
which is case 2.  Case 1 is useful only because files could otherwise
remain in the OS cache for a long time, and there is a pretty good but
not guaranteed expectation that once data is in the device write-back
cache it will reach the actual media fairly soon.  The old
"sync;sync;sync;sleep 10" thing from before there was shutdown(8)...
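
From the application side, the pattern that depends on this looks
roughly like the sketch below.  write_and_fsync() is a hypothetical
helper, not an existing API; it uses only standard POSIX calls.
Whether a successful return means case 1 or case 2 is exactly the open
question, and the code cannot tell the difference.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: write len bytes to path and fsync the file.
 * Returns 0 once fsync() has reported success, or an errno value.
 */
int
write_and_fsync(const char *path, const void *buf, size_t len)
{
	const char *p = buf;
	int fd, error = 0;

	assert(path != NULL && buf != NULL);

	fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (fd == -1)
		return errno;

	/* write() may be short or interrupted, so loop until done. */
	while (len > 0) {
		ssize_t n = write(fd, p, len);
		if (n == -1) {
			if (errno == EINTR)
				continue;
			error = errno;
			break;
		}
		p += n;
		len -= (size_t)n;
	}

	/* A successful write() only means the data reached the OS cache. */
	if (error == 0 && fsync(fd) == -1)
		error = errno;

	/* Report a close() failure too, unless an earlier error is pending. */
	if (close(fd) == -1 && error == 0)
		error = errno;

	return error;
}
```

A caller that gets 0 back has done everything POSIX lets it do; it
still has to trust the layers underneath.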

I thought NCQ was supposed to give acks for actual writing, but allow
them to be perhaps ordered and multiple in flight, so that one could use
that instead of the big-hammer inscrutable writeback cache.

If the controller doesn't support NCQ, then it seems one has to issue a
cache flush, which presumably is defined to get all data in cache as of
the flush onto disk before reporting that it's done.

Is that what you're thinking, or do you think this is all about case 1?


