Re: kern/53861: regular kernel panic (ffs_blkfree: bad size)

To: kern-bug-people%netbsd.org@localhost,gnats-admin%netbsd.org@localhost,netbsd-bugs%netbsd.org@localhost,Dima Veselov <kab00m%labma.ru@localhost>
Subject: Re: kern/53861: regular kernel panic (ffs_blkfree: bad size)
From: David Holland <dholland-bugs%netbsd.org@localhost>
Date: Mon, 14 Jan 2019 06:10:01 +0000 (UTC)

The following reply was made to PR kern/53861; it has been noted by GNATS.

From: David Holland <dholland-bugs%netbsd.org@localhost>
To: gnats-bugs%NetBSD.org@localhost
Cc: 
Subject: Re: kern/53861: regular kernel panic (ffs_blkfree: bad size)
Date: Mon, 14 Jan 2019 06:09:39 +0000

 On Sun, Jan 13, 2019 at 03:45:01PM +0000, Manuel Bouyer wrote:
  >>  I did, and fsck found some errors. More than that: I had to check
  >>  a big partition (more than 5 TB) twice, because the first run failed
  >>  with an overwhelming number of "Can't read sectors" errors showing
  >>  negative block numbers.
  >>  
  >>  It appears the bug should be closed; however, I am still very
  >>  concerned about using a log filesystem in production.  Isn't that a
  >>  situation which should be prevented by the log?
  > 
  > Not in all cases; only if the unclean shutdown is from an external
  > cause (e.g. power loss). I've seen this kind of issue with
  > different log filesystems, and Linux sets a flag on disk when it
  > runs into a filesystem inconsistency to force fsck on next reboot.

 Well, properly speaking, a journaled fs should survive any crash,
 including one it caused itself, because the whole point of the journal
 is to make the updates to the disk atomic with respect to file system
 operations.

 However, that only really works if the journaled fs has no journaling
 or recovery bugs.
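
 As a rough illustration of what "atomic" means here, the following is
 a toy write-ahead journal in Python. It is only a sketch under stated
 assumptions, not how wapbl is implemented: the class, method names and
 the crash_at parameter are all hypothetical, the "disk" is a dict, and
 the journal is assumed to survive the simulated crash intact.

```python
# Toy write-ahead journal. All names are hypothetical; this is not WAPBL.

class Crash(Exception):
    """Simulated crash partway through an operation."""


class ToyJournaledStore:
    def __init__(self):
        self.disk = {}      # "home" locations of metadata blocks
        self.journal = []   # append-only log, assumed to survive a crash

    def update(self, txid, writes, crash_at=None):
        """Apply `writes` (block -> value) as one transaction.

        crash_at may be None, "before_commit" or "mid_apply".
        """
        # 1. Log every intended write before touching the home locations.
        for block, value in writes.items():
            self.journal.append(("write", txid, block, value))
        if crash_at == "before_commit":
            raise Crash(txid)
        # 2. The commit record is what makes the transaction count.
        self.journal.append(("commit", txid))
        # 3. Only now update the home locations; a crash may tear this.
        for i, (block, value) in enumerate(writes.items()):
            if crash_at == "mid_apply" and i == 1:
                raise Crash(txid)
            self.disk[block] = value

    def recover(self):
        """Replay committed transactions in order; drop uncommitted ones."""
        committed = {rec[1] for rec in self.journal if rec[0] == "commit"}
        for rec in self.journal:
            if rec[0] == "write" and rec[1] in committed:
                _, _, block, value = rec
                self.disk[block] = value
        self.journal.clear()


def demo():
    store = ToyJournaledStore()
    store.update(1, {"inode": "v1", "bitmap": "v1"})

    # Crash after the commit record but halfway through the home writes.
    try:
        store.update(2, {"inode": "v2", "bitmap": "v2"}, crash_at="mid_apply")
    except Crash:
        pass
    torn = dict(store.disk)          # inconsistent: inode is v2, bitmap is v1
    store.recover()
    after_commit = dict(store.disk)  # replay completes transaction 2

    # Crash before the commit record: recovery must discard transaction 3.
    try:
        store.update(3, {"inode": "v3", "bitmap": "v3"}, crash_at="before_commit")
    except Crash:
        pass
    store.recover()
    return torn, after_commit, dict(store.disk)
```

 Calling demo() returns the torn state, the state after replaying a
 committed transaction, and the state after discarding an uncommitted
 one. In this sketch every transaction ends up either fully applied or
 not applied at all, but only because recover() is correct and the
 journal itself is assumed to be intact.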

 It also relies on assumptions about the disk working correctly that
 real hardware occasionally violates.

 So yes, something had to break somewhere for this to happen; it might
 or might not have been wapbl... unfortunately by the time you notice
 and think to run fsck it's effectively impossible to guess what
 happened originally, and all the more so on a huge volume.

 For production use, though, the first thing most people worry about is
 that wapbl doesn't recover file data (same as traditional ffs), and
 this can cause various problems after a crash.

 -- 
 David A. Holland
 dholland%netbsd.org@localhost
