Current-Users archive


Re: amd64 -current crashs at boot



On Sun, Dec 21, 2008 at 05:55:54PM +0100, Christoph Egger wrote:
> Christoph Egger wrote:
> > Hi,
> > 
> > an amd64 -current kernel from today crashes at boot
> > when sshd starts:
> > 
> > uvm_fault(0xffffffff80d1e180, 0x0, 1) -> e
> > fatal page fault in supervisor mode
> > trap type 6 code 0 rip ffffffff802abbe4 cs 8 rflags 10282 cr2  60 cpl 0
> > rsp ffff80004d832b20
> > kernel: page fault trap, code=0
> > Stopped in pid 0x46 (system) at netbsd:ffs_update+0x24: testb $0x1,0x60(%rax)
> > db{0}> bt
> > ffs_update() at netbsd:ffs_update+0x24
> > ffs_full_fsync() at netbsd:ffs_full_fsync+0x54b
> > spec_fsync() at netbsd:spec_fsync+0x59
> > VOP_FSYNC() at netbsd:VOP_FSYNC+0x71
> > sched_sync() at netbsd:sched_sync+0x14f
> > db{0}> ps /l
> > [...]
> >  PID  LID S FLAGS     STRUCT LWP *         NAME WAIT
> >> 0    49 3   204  ffff80004e1e7400     physiod physiod
> >        48 3   204  ffff80004d7127c0 vmem_rehash vmem_rehash
> >        47 3   204  ffff80004d712ba0    aiodoned aiodoned
> >      > 46 7   204  ffff80004d700000     ioflush
> > [...]
> 
> I found the commit which causes this:
> 
> It is ffs_vnops.c, rev. 1.105. Going back to rev. 1.104 makes
> the machine boot again.
> 
> With rev. 1.105, when ffs_full_fsync() calls ffs_update in line 580,
> vp->v_mount is a NULL pointer. ffs_update() dereferences it w/o
> checking if the pointer is valid.

I seem to hit the other branch: a NULL v_specmountpoint rather than a
NULL v_mount.
With sources from Dec 21 18:40, on i386, I hit the assertion at
/sys/ufs/ffs/ffs_vnops.c:414:

   413          if ((flags & FSYNC_VFS) != 0) {
   414                  KASSERT(vp->v_specmountpoint != NULL);
   415                  mp = vp->v_specmountpoint;
   416                  ffsino = (mp->mnt_op == &ffs_vfsops);
   417                  KASSERT(vp->v_type == VBLK);
   418          } else {
   419                  mp = vp->v_mount;
   420                  ffsino = true;
   421                  KASSERT(vp->v_tag == VT_UFS);
   422          }

At that point, flags = FSYNC_VFS | FSYNC_RECLAIM | FSYNC_WAIT
(flags=517 in frame #5 of the backtrace below).
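
For anyone who wants to check the decoding of flags=517 by hand, here is a
minimal sketch. The FSYNC_* values are assumptions on my part (what I
believe <sys/vnode.h> defines in a tree of this vintage), so verify them
against your own sources. fsync_flags_str() is a hypothetical helper, not
something that exists in the kernel.

```c
#include <stdio.h>
#include <string.h>

/* Assumed values; check sys/sys/vnode.h in your own tree. */
#define FSYNC_WAIT	0x0001
#define FSYNC_DATAONLY	0x0002
#define FSYNC_RECLAIM	0x0004
#define FSYNC_LAZY	0x0008
#define FSYNC_NOLOG	0x0010
#define FSYNC_CACHE	0x0100
#define FSYNC_VFS	0x0200

static const struct {
	int bit;
	const char *name;
} fsync_names[] = {
	{ FSYNC_VFS, "FSYNC_VFS" },
	{ FSYNC_CACHE, "FSYNC_CACHE" },
	{ FSYNC_NOLOG, "FSYNC_NOLOG" },
	{ FSYNC_LAZY, "FSYNC_LAZY" },
	{ FSYNC_RECLAIM, "FSYNC_RECLAIM" },
	{ FSYNC_DATAONLY, "FSYNC_DATAONLY" },
	{ FSYNC_WAIT, "FSYNC_WAIT" },
};

/*
 * Hypothetical helper: write the names of the bits set in 'flags'
 * into buf, separated by " | ", highest bit first. Returns buf.
 */
static const char *
fsync_flags_str(int flags, char *buf, size_t len)
{
	size_t i, off = 0;

	if (len == 0)
		return buf;
	buf[0] = '\0';
	for (i = 0; i < sizeof(fsync_names) / sizeof(fsync_names[0]); i++) {
		int n;

		if ((flags & fsync_names[i].bit) == 0)
			continue;
		n = snprintf(buf + off, len - off, "%s%s",
		    off != 0 ? " | " : "", fsync_names[i].name);
		if (n < 0 || (size_t)n >= len - off)
			break;	/* out of space: stop at what fits */
		off += (size_t)n;
	}
	return buf;
}
```

Under those assumed values, 517 (0x205) decodes to the three flags listed
above, and the flags=5 seen in the VOP_FSYNC frame of the backtrace is the
same set without the 0x200 bit.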

This looks like a catch-22: we are still in the middle of mounting /var,
so its device vnode doesn't have a mountpoint yet...

#5  0xc01ec5f0 in ffs_full_fsync (vp=0xcb4c70c0, flags=517)
    at ../../../../ufs/ffs/ffs_vnops.c:414
#6  0xc01ec6a4 in ffs_fsync (v=0xcb4f69d8)
    at ../../../../ufs/ffs/ffs_vnops.c:301
#7  0xc049c2de in VOP_FSYNC (vp=0xcb4c70c0, cred=0xca3a6f00, flags=5, offlo=0, 
    offhi=0) at ../../../../kern/vnode_if.c:803
#8  0xc0481a14 in vinvalbuf (vp=0xcb4c70c0, flags=<value optimized out>, 
    cred=0xca3a6f00, l=0xcb4c95c0, catch=false, slptimeo=0)
    at ../../../../kern/vfs_subr.c:866
#9  0xc01e9521 in ffs_mountfs (devvp=0xcb4c70c0, mp=0xcb4bf000, l=0xcb4c95c0)
    at ../../../../ufs/ffs/ffs_vfsops.c:935
#10 0xc01eac7f in ffs_mount (mp=0xcb4bf000, path=0xbfbfea78 "/var", 
    data=0xc118ebe0, data_len=0xcb4f6cd0)
    at ../../../../ufs/ffs/ffs_vfsops.c:448
#11 0xc047f794 in VFS_MOUNT (mp=0xcb4bf000, a=0xbfbfea78 "/var", b=0xc118ebe0, 
    c=0xcb4f6cd0) at ../../../../kern/vfs_subr.c:2879
#12 0xc0489952 in do_sys_mount (l=0xcb4c95c0, vfsops=0x0, 
    type=0x8048e6d "ffs", path=0xbfbfea78 "/var", flags=0, data=0xbfbfee7c, 
    data_seg=UIO_USERSPACE, data_len=4, retval=0xcb4f6d28)
    at ../../../../kern/vfs_syscalls.c:364
#13 0xc0489c99 in sys___mount50 (l=0xcb4c95c0, uap=0xcb4f6d00, 
    retval=0xcb4f6d28) at ../../../../kern/vfs_syscalls.c:449

Sure enough, v_specmountpoint is NULL:

(gdb) print *(vp->v_un.vu_specnode->sn_dev)
$9 = {sd_mountpoint = 0x0, sd_lockf = 0x0, sd_bdevvp = 0xcb4c70c0, 
  sd_opencnt = 1, sd_refcnt = 1, sd_rdev = 4869}

where sd_rdev = 4869 is /dev/ld0f.
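
A rough sketch of how that device number breaks down, in case someone
wants to repeat it for another dev_t. The bit layout is what I believe the
major()/minor() macros in <sys/types.h> use, and the 8 partitions per unit
is my assumption for i386 (MAXPARTITIONS); both should be checked against
your tree. The nb_* and disk_* names are made up for this example.

```c
#include <stdint.h>

/* Assumed i386 value of MAXPARTITIONS; other ports differ. */
#define ASSUMED_MAXPARTITIONS	8

/* Assumed NetBSD dev_t layout: major in bits 8..19. */
static unsigned
nb_major(uint32_t dev)
{
	return (dev & 0x000fff00u) >> 8;
}

/* Assumed NetBSD dev_t layout: minor in bits 0..7 and 20..31. */
static unsigned
nb_minor(uint32_t dev)
{
	return ((dev & 0xfff00000u) >> 12) | (dev & 0x000000ffu);
}

/* Disk unit number, assuming minor = unit * MAXPARTITIONS + partition. */
static unsigned
disk_unit(uint32_t dev)
{
	return nb_minor(dev) / ASSUMED_MAXPARTITIONS;
}

/* Partition letter ('a' for partition 0, 'b' for 1, ...). */
static char
disk_part_letter(uint32_t dev)
{
	return (char)('a' + nb_minor(dev) % ASSUMED_MAXPARTITIONS);
}
```

With these assumptions, 4869 (0x1305) is major 19, unit 0, partition 'f',
and the f_fsid = 4864 (0x1300) in the mount dump further down is the same
major, unit 0, partition 'a', which agrees with its f_mntfromname of
"/dev/ld0a".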

(beginning of /etc/fstab:
/dev/ld0a / ffs rw 1 1
/dev/ld0b none swap sw 0 0
/dev/ld0e /usr ffs rw,log 1 2
/dev/ld0f /var ffs rw 1 2
...
)


(gdb) print *vp
$5 = {v_uobj = {vmobjlock = {u = {mtxa_owner = 0}}, pgops = 0xc05212e0, 
    memq = {tqh_first = 0x0, tqh_last = 0xcb4c70c8}, uo_npages = 0, 
    uo_refs = 1, rb_tree = {rbt_root = 0x0, rbt_ops = 0xc0521204, 
      rbt_minmax = {0x0, 0x0}}}, v_cv = {cv_opaque = {0x0, 0xcb4c70e8, 
      0xc05599f7}}, v_size = 0, v_writesize = 0, v_iflag = 0, v_vflag = 48, 
  v_uflag = 0, v_numoutput = 0, v_writecount = 0, v_holdcnt = 0, 
  v_synclist_slot = 0, v_mount = 0xcb3eda04, v_op = 0xc111b500, v_freelist = {
    tqe_next = 0x0, tqe_prev = 0xcb421574}, v_freelisthd = 0x0, v_mntvnodes = {
    tqe_next = 0xcb4c7008, tqe_prev = 0xcb4c71ec}, v_cleanblkhd = {
    lh_first = 0x0}, v_dirtyblkhd = {lh_first = 0x0}, v_synclist = {
    tqe_next = 0x0, tqe_prev = 0x0}, v_dnclist = {lh_first = 0x0}, v_nclist = {
    lh_first = 0xcb4cbb40}, v_un = {vu_mountedhere = 0xcb41baa0, 
    vu_socket = 0xcb41baa0, vu_specnode = 0xcb41baa0, 
    vu_fifoinfo = 0xcb41baa0, vu_ractx = 0xcb41baa0}, v_type = VBLK, 
  v_tag = VT_UFS, v_lock = {vl_lock = {rw_owner = 3410793924}, 
    vl_canrecurse = 0, vl_recursecnt = 0}, v_vnlock = 0xcb4c7160, 
  v_data = 0xcb4ccafc, v_klist = {slh_first = 0x0}}
(gdb) print *(vp->v_mount)
$13 = {mnt_list = {cqe_next = 0xc0593008, cqe_prev = 0xc0593008}, 
  mnt_vnodelist = {tqh_first = 0xcb421f1c, tqh_last = 0xcb4d07b0}, 
  mnt_op = 0xc058af00, mnt_vnodecovered = 0x0, mnt_syncer = 0xcb4d07f4, 
  mnt_transinfo = 0xcb41b9c8, mnt_data = 0xc120cb00, mnt_unmounting = {
    rw_owner = 0}, mnt_renamelock = {u = {mtxa_owner = 0}}, mnt_refcnt = 165, 
  mnt_recursecnt = 0, mnt_flag = 20480, mnt_iflag = 448, mnt_fs_bshift = 13, 
  mnt_dev_bshift = 9, mnt_stat = {f_flag = 0, f_bsize = 8192, f_frsize = 1024, 
    f_iosize = 8192, f_blocks = 508239, f_bfree = 457434, f_bavail = 432023, 
    f_bresvd = 25411, f_files = 126718, f_ffree = 123208, f_favail = 123208, 
    f_fresvd = 0, f_syncreads = 390, f_syncwrites = 2, f_asyncreads = 0, 
    f_asyncwrites = 0, f_fsidx = {__fsid_val = {4864, 1931}}, f_fsid = 4864, 
    f_namemax = 255, f_owner = 0, f_spare = {0, 0, 0, 0}, 
    f_fstypename = "ffs", '\0' <repeats 28 times>, 
    f_mntonname = "/", '\0' <repeats 1022 times>, 
    f_mntfromname = "/dev/ld0a", '\0' <repeats 1014 times>}, 
  mnt_specdataref = {specdataref_container = 0x0, specdataref_lock = {u = {
        mtxa_owner = 0}}}, mnt_updating = {u = {mtxa_owner = 0}}, 
  mnt_wapbl_op = 0x0, mnt_wapbl = 0x0, mnt_wapbl_replay = 0x0}

The failing kernel has

     $NetBSD: ffs_alloc.c,v 1.119 2008/12/06 20:05:55 joerg Exp $
     $NetBSD: ffs_balloc.c,v 1.51 2008/07/31 05:38:06 simonb Exp $
     $NetBSD: ffs_inode.c,v 1.100 2008/12/17 20:51:38 cegger Exp $
     $NetBSD: ffs_snapshot.c,v 1.89 2008/12/19 11:36:10 hannken Exp $
     $NetBSD: ffs_softdep.stub.c,v 1.23 2008/05/31 21:37:08 ad Exp $
     $NetBSD: ffs_subr.c,v 1.45 2008/06/03 09:47:49 hannken Exp $
     $NetBSD: ffs_tables.c,v 1.9 2005/12/11 12:25:25 christos Exp $
     $NetBSD: ffs_vfsops.c,v 1.241 2008/11/13 11:09:45 ad Exp $
     $NetBSD: ffs_vnops.c,v 1.105 2008/12/21 10:44:32 ad Exp $

I can boot with

     $NetBSD: ffs_alloc.c,v 1.114 2008/11/06 22:31:08 joerg Exp $
     $NetBSD: ffs_balloc.c,v 1.51 2008/07/31 05:38:06 simonb Exp $
     $NetBSD: ffs_inode.c,v 1.99 2008/08/30 08:25:53 hannken Exp $
     $NetBSD: ffs_snapshot.c,v 1.82 2008/10/23 17:16:24 hannken Exp $
     $NetBSD: ffs_softdep.c,v 1.115 2008/06/03 09:47:49 hannken Exp $
     $NetBSD: ffs_subr.c,v 1.45 2008/06/03 09:47:49 hannken Exp $
     $NetBSD: ffs_tables.c,v 1.9 2005/12/11 12:25:25 christos Exp $
     $NetBSD: ffs_vfsops.c,v 1.241 2008/11/13 11:09:45 ad Exp $
     $NetBSD: ffs_vnops.c,v 1.104 2008/10/10 09:21:58 hannken Exp $

(That kernel happens to have SOFTDEP and not WAPBL, but as neither / nor
/var is mounted with either, that shouldn't matter. Maybe ld0e gets mounted
before ld0f? ld0e does use logging.)
(So ffs_vnops.c 1.104 being OK seems right.)

Thoughts?

Cheers,

Patrick

