Source-Changes-D archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: CVS commit: src/sys



> On 07 Nov 2016, at 21:53, Jaromír Doleček <jaromir.dolecek%gmail.com@localhost> wrote:
> 
> 2016-11-07 12:11 GMT+01:00 J. Hannken-Illjes <hannken%eis.cs.tu-bs.de@localhost>:
>> It gets used when a block of block pointers has been deallocated and
>> brelsed, that is NOT written to disk.  If we allow the deallocation of
>> this block to fail and ufs_truncate_retry() runs ffs_truncate again
>> it will read the block from disk and free already freed blocks.
> 
> Good point. The same is however true for ALL the calls in the chain,
> so then we should have to really use FORCE everywhere.

No.  If one of the other calls (clearing one pointer in a block) fails
we already bwrite() (if (error != 0 ...).  In the wapbl case writing
all blocks would result in transitions too big for the log.  I tried it ...

> One solution is to always bwrite() or bdwrite() back even fully zeroed
> block for wapbl case instead of brelse(BC_INVAL). I think that for
> fsck to reliably recover from crash within truncate, this might
> actually be needed also for !wapbl case.

No.  The !wapbl case is safe.  The block is either written after
the copy (before we change it) or from ffs_truncate() when it
sets "sync = 1".  For wapbl case see above.

> Another way how to solve this would be to try to register the
> deallocation and on error bail out, before the diving. It would
> require cancelling the registration if the diving call returns EAGAIN
> however.
> 
>> You mean this:
>> 
>>        error = ffs_indirtrunc(ip, nlbn, FFS_FSBTODB(fs, nb),
>>            (daddr_t)-1, level - 1, countp);
>> -       if (error)
>> -               goto out;
>> +       if (error == EAGAIN)
>> +               goto out;
>> +       else if (error && !allerror)
>> +               allerror = error;
> 
> No, I mean the copy logic and big blocks with condition on
> ip->i_ump->um_mountp->mnt_wapbl.

The copy logic is here to prevent corruption beyond fsck capabilities
and therefore we need in the !wapbl case.  There IS a reason it
was here since the beginning of time.

Not sure if it would make sense to split it into ffs_indir_trunc()
and ffs_indir_trunc_wapbl().  Possibly easier to understand but
prone to errors fixed in only one of them.

>> I don't understand ... do you want to split into two diffs?
> 
> I prefer to split commits into incremental changes if reasonably
> possible, so it's easier to review and revise. That's all. That's why
> I prefer the fix for immediate corruption to go separately.

My patch contains corruption issues only and it passes ATF and
it passes my stress test which is a bit more than just some fsx.

As -current currently corrupts file systems we should either fix it very
soon or revert your changes completely.

--
J. Hannken-Illjes - hannken%eis.cs.tu-bs.de@localhost - TU Braunschweig (Germany)



Home | Main Index | Thread Index | Old Index