Source-Changes-D archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: CVS commit: src/sys/kern



> On 27. Sep 2018, at 09:44, Manuel Bouyer <bouyer%antioche.eu.org@localhost> wrote:
> 
> On Thu, Sep 27, 2018 at 01:03:40AM +0000, Emmanuel Dreyfus wrote:
>> Module Name:	src
>> Committed By:	manu
>> Date:		Thu Sep 27 01:03:40 UTC 2018
>> 
>> Modified Files:
>> 	src/sys/kern: vfs_trans.c
>> 
>> Log Message:
>> Work around deadlock between fstchg and fstcnt
>> 
>> When suspending a filesystem in fstrans_setstate(), we wait on
>> fstcnt for threads to finish transactions. While we do this, any
>> thread trying to start a filesystem transaction will wait on fstchg
>> in fstrans_start(), a situation which can deadlock.
>> 
>> The wait for fstcnt in fstrans_setstate() can be interrupted by
>> a signal, but the wait for fstchg in fstrans_start() cannot. Once
>> most processes are stuck in fstchg, it is impossible to send a
>> signal to the thread that waits on fstcnt, because no process
>> respond anymore to user input.
>> 
>> We fix that by adding a timeout to the wait on fstcnt in
>> fstrans_setstate(). This means suspending a filesystem may fail,
>> but it was already the case when the sleep was interupted by
>> a signal, hence calling function must already handle a possible
>> failure.
> 
> Actually callers do not, they just forward the failure.
> This means that things like e.g. umount or snapshots will randomly
> fail (whenever fstrans_setstate() can actually take a long
> time without deadlock has to be determined, but I suspect it can).
> 
> I think fstrans_setstate() should pause and retry in this case.

As Manuel said ...

This change is wrong, there is no upper time limit for a successfull
file system suspension.

Please revert ASAP.

I will take a look into this PR next week.

--
J. Hannken-Illjes - hannken%eis.cs.tu-bs.de@localhost - TU Braunschweig

Attachment: signature.asc
Description: Message signed with OpenPGP



Home | Main Index | Thread Index | Old Index