pkgsrc-Users archive


Re: [PATCH] net/samba4: relocate Sysvol to persist between reboots & move variable data out of /usr/pkg/etc/...



On Thu, 30 Jul 2020 at 02:24, Chuck Silvers <chuq%chuq.com@localhost> wrote:
>
> On Wed, Jul 29, 2020 at 06:13:03PM +0100, Chavdar Ivanov wrote:
> > On Wed, 29 Jul 2020 at 08:33, Matthias Petermann <mp%petermann-it.de@localhost> wrote:
> > >
> > > Hello Chavdar,
> > >
> > > On 28.07.2020 at 18:48, Chavdar Ivanov wrote:
> > > > This being a place people are trying samba4 as a DC, I got a
> > > > repeatable panic on one of the systems I am trying it on, as follows:
> > > > ....
> > > > crash: _kvm_kvatop(0)
> > > > Crash version 9.99.69, image version 9.99.69.
> > > > Kernel compiled without options LOCKDEBUG.
> > > > System panicked: /: bad dir ino 657889 at offset 0: Bad dir (not
> > > > rounded), reclen=0x2e33, namlen=51, dirsiz=60 <= reclen=11827 <=
> > > > maxsize=512, flags=0x2005900, entryoffsetinblock=0, dirblksiz=512
> > > >
> > > > Backtrace from time of crash is available.
> > > > _KERNEL_OPT_NARCNET() at 0
> > > > _KERNEL_OPT_DDB_HISTORY_SIZE() at _KERNEL_OPT_DDB_HISTORY_SIZE
> > > > sys_reboot() at sys_reboot
> > > > vpanic() at vpanic+0x15b
> > > > snprintf() at snprintf
> > > > ufs_lookup() at ufs_lookup+0x518
> > > > VOP_LOOKUP() at VOP_LOOKUP+0x42
> > > > lookup_once() at lookup_once+0x1a1
> > > > namei_tryemulroot() at namei_tryemulroot+0xacf
> > > > namei() at namei+0x29
> > > > vn_open() at vn_open+0x9a
> > > > do_open() at do_open+0x112
> > > > do_sys_openat() at do_sys_openat+0x72
> > > > sys_open() at sys_open+0x24
> > > > syscall() at syscall+0x26e
> > > > --- syscall (number 5) ---
> > > > syscall+0x26e:
> > > > ....
> > >
> > >
> > > that still looks like a file system inconsistency. Before the patch from
> > > Chuck I also had the case several times that a filesystem that was
> > > apparently repaired with fsck could no longer be trusted. After
> > > importing the patched kernel, to be on the safe side, I recreated all
> > > the file systems previously mounted with posix1eacls with newfs.
> >
> > Hard that one, as it was the root file system... Anyway, a couple of
> > fsck's seem to have sorted out this one.
>
> how exactly did you run fsck to fix this?  the most reliable way is to boot
> the machine single-user, then run "fsck -fy ..." from the console shell,
> then run the same fsck command again to make sure that it says that
> everything is ok, then reboot.

Exactly that. Buried deep in my fingertips' memory is that one should
fsck / in single-user mode, twice, and reboot immediately without a
'sync'. Perhaps it comes from some old SunOS manual...

>
> if you have done that and are still crashing due to corruption in your
> root file system, then we still have another bug in the kernel somewhere.

So it seems to me; the peculiarities here are that in both cases / is
a GPT slice and that I have 'log' as a mount option; it was suggested
that 'posix1eacls' should be used on its own.

>
>
> > > Presumably fsck is not prepared for this kind of inconsistency, and only
> > > a newfs can restore a trustworthy initial state. What is the starting
> > > point for you? Has the file system been created after the patch, or has
> > > it only been treated with fsck so far?
> >
> > I think it may have been created before the patch to the filesystem
> > code, but before the second version of the samba4 package.
> >
> > >
> > > In any case, I would advise you - if you have not already done so - to
> > > use a separate partition or LVM volume for the sysvol with its own file
> > > system, and to mount only this with the posix1eacls option. It seems the
> > > ACL code still needs a lot of testing, so at least you can be sure that
> > > your root filesystem will not be affected.
> >
> > As this was running on an XCP-NG guest, I added a small 1GB disk to the
> > VM, created the filesystem (-O 2) and mounted it on /var/db/samba4.
> >
> > I removed the 'posix1eacls' option from the other existing
> > filesystems and left it only for the one mounted on /var/db/samba4.
> > In this case, the provisioning fails with a message that the
> > filesystem does not support ACLs - so perhaps it checks the root
> > filesystem after all. I then re-added this option to /, newfs'd
> > /var/db/samba4, rebooted and retried the provisioning. This resulted
> > in a panic similar to the one above, this time after perhaps 10 minutes
> > of python8 doing database conversion from v1 to v2 - the third
> > database in the list. As this was seen on the console of the XCP-NG
> > guest, I took screenshots of the panic, in case someone is interested.
>
> yes, I'd like to see the screenshots please.

I'll mail them off-list, to avoid large-ish uuencoded bits polluting
the archives.

>
> -Chuck

Chavdar
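
For anyone reading the panic message quoted earlier in this thread, the
numbers in it can be checked by hand. The following is only a rough
sketch of the kind of sanity checks the message implies; it is not the
kernel's code. The helper names dirsiz() and check_entry() are made up
for illustration, and the 8-byte entry header and 4-byte name padding
are assumptions about the on-disk UFS directory entry layout.

```python
# Sketch only: mirrors the checks suggested by the panic text.
# Not kernel code; dirsiz() and check_entry() are hypothetical helpers.

def dirsiz(namlen: int) -> int:
    """Minimum record size: assumed 8-byte header plus the
    NUL-terminated name, padded up to a multiple of 4 bytes."""
    return 8 + ((namlen + 1 + 3) & ~3)


def check_entry(reclen: int, namlen: int, maxsize: int) -> list:
    """Return the list of problems found with one directory entry."""
    problems = []
    if reclen & 3:
        problems.append("not rounded")   # reclen is not a multiple of 4
    if reclen > maxsize:
        problems.append("too big")       # record overruns the dir block
    if reclen < dirsiz(namlen):
        problems.append("too small")     # record cannot hold the name
    return problems


# Values taken from the panic message: reclen=0x2e33, namlen=51,
# maxsize=512.
reclen = 0x2e33
print(reclen, dirsiz(51), check_entry(reclen, 51, 512))
# prints: 11827 60 ['not rounded', 'too big']
```

This agrees with the message: 0x2e33 is 11827, dirsiz comes out as 60
for a 51-character name, and the record length is neither a multiple of
4 nor small enough to fit in a 512-byte directory block.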




