NetBSD-Bugs archive


Re: kern/45708: Unable to read big files from large FFSv2 (12TB), ls out of swap



The following reply was made to PR kern/45708; it has been noted by GNATS.

From: David Holland <dholland-bugs%netbsd.org@localhost>
To: gnats-bugs%netbsd.org@localhost
Cc: 
Subject: Re: kern/45708: Unable to read big files from large FFSv2 (12TB), ls
 out of swap
Date: Sun, 22 Apr 2012 17:52:43 +0000

 (not filed in gnats; this tends to happen if you reply to your own
 gnats mail)
 
    ------
 
 From: Bartosz Kuźma <bartosz.kuzma%gmail.com@localhost>
 To: kern-bug-people%netbsd.org@localhost, gnats-admin%netbsd.org@localhost,
         netbsd-bugs%netbsd.org@localhost,
         bartosz.kuzma%gmail.com@localhost
 Subject: Re: kern/45708: Unable to read big files from large FFSv2 (12TB), ls
         out of swap
 Date: Wed, 14 Dec 2011 14:35:02 +0100
 
 On Wed, Dec 14, 2011 at 13:45, Bartosz Kuźma <bartosz.kuzma%gmail.com@localhost> wrote:
 > The following reply was made to PR kern/45708; it has been noted by GNATS.
 >
 > From: Bartosz Kuźma <bartosz.kuzma%gmail.com@localhost>
 > To: gnats-bugs%netbsd.org@localhost
 > Cc: kern-bug-people%netbsd.org@localhost, gnats-admin%netbsd.org@localhost,
 >     netbsd-bugs%netbsd.org@localhost
 > Subject: Re: kern/45708: Unable to read big files from large FFSv2 (12TB), ls
 >  out of swap
 > Date: Wed, 14 Dec 2011 13:44:32 +0100
 >
 >  On Wed, Dec 14, 2011 at 13:40, David Holland <dholland-bugs%netbsd.org@localhost> wrote:
 >  > The following reply was made to PR kern/45708; it has been noted by GNATS.
 >  >
 >  > From: David Holland <dholland-bugs%netbsd.org@localhost>
 >  > To: gnats-bugs%NetBSD.org@localhost
 >  > Cc:
 >  > Subject: Re: kern/45708: Unable to read big files from large FFSv2 (12TB), ls
 >  >  out of swap
 >  > Date: Wed, 14 Dec 2011 12:35:56 +0000
 >  >
 >  >  On Tue, Dec 13, 2011 at 09:10:01AM +0000, bartosz.kuzma%gmail.com@localhost wrote:
 >  >  > On large filesystem (12TB) when I try to create big files I'm
 >  >  > unable to ls directory.
 >  >  >
 >  >  > When I try to do:
 >  >  >
 >  >  > # ls -1 /mnt
 >  >  >
 >  >  > Kernel panic with the following message:
 >  >  >
 >  >  > UVM: pid 977 (ls), uid 0 killed: out of swap
 >  >  > ubc_uiomove: error=12
 >  >  > dev = 0xa800, block = 1305922608, fs = /mnt
 >  >
 >  >  That is weird...
 >  >
 >  >  > panic: blkfree: freeing free block
 >  >
 >  >  ...but this makes me think the real problem is that the filesystem is
 >  >  corrupted. Have you run fsck on it recently? Does this really happen
 >  >  on a freshly newfs'd volume as described?
 >  >
 >  >  --
 >  >  David A. Holland
 >  >  dholland%netbsd.org@localhost
 >  >
 >
 >  Yes, it is easily reproducible on a freshly newfs'd volume.
 >
 >  When I tested by creating several large files (about 256GB each),
 >  then ran the sync command and did an unclean reboot (e.g. poweroff),
 >  the filesystem could not be mounted again. The mount hangs at
 >  "replaying log to disk". However, it is possible to mount it in
 >  read-only mode: it simply prints "replaying log to memory" and works.
 >
 >  If you need more info or even access to this machine, ask me.
 >
 >  --
 >  Regards, Bartosz Kuźma.
 >
 
 There is a simpler way to reproduce the error:
 
  # newfs -O 2 /dev/dk0
  # mount -o log /dev/dk0 /mnt
 
  Then run the following script:
 
  #!/bin/sh
 
  for i in `jot 256 1 256`
  do
         echo mkdir /mnt/dir-${i}
         mkdir /mnt/dir-${i}
 
         for j in `jot 256 1 256`
         do
                 echo touch /mnt/dir-${i}/file-${j}
                 touch /mnt/dir-${i}/file-${j}
         done
  done
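 
 The script above can also be written with the target directory and the
 counts as parameters, which makes it easier to try smaller or larger
 runs. This is only a sketch: the function name populate and the scratch
 directory are hypothetical, plain while loops replace jot so it runs in
 any POSIX sh, and the small dry run at the end does not touch /mnt.

```shell
#!/bin/sh
# Sketch: same create-many-files workload as the reproducer, with the
# target directory and counts as parameters (all names are hypothetical).

# populate TARGET NDIRS NFILES
# Creates TARGET/dir-1..dir-NDIRS, each holding file-1..file-NFILES.
# Stops at the first failing mkdir or touch and reports where it happened.
populate() {
    target=$1
    ndirs=$2
    nfiles=$3
    i=1
    while [ "$i" -le "$ndirs" ]; do
        mkdir "$target/dir-$i" || {
            echo "mkdir failed at dir-$i" >&2
            return 1
        }
        j=1
        while [ "$j" -le "$nfiles" ]; do
            touch "$target/dir-$i/file-$j" || {
                echo "touch failed at dir-$i/file-$j" >&2
                return 1
            }
            j=$((j + 1))
        done
        i=$((i + 1))
    done
}

# Small dry run in a scratch directory: 3 directories of 4 files each.
scratch=$(mktemp -d)
populate "$scratch" 3 4
count=$(find "$scratch" -type f | wc -l | tr -d ' ')
echo "created $count files"
rm -rf "$scratch"
```

 To match the run described in this report, the call would be
 "populate /mnt 256 256" on the freshly created and mounted filesystem.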
 
 
  Around the "touch /mnt/dir-28/file-122" step, the kernel panics:
 
  dev = 0xa800, block = 625305256, fs = /mnt
  panic: blkfree: freeing free frag
  fatal breakpoint trap in supervisor mode
  trap type 1 code 0 rip ffffffff8052ace5 cs 8 rflags 246 cr2  0 cpl 0
  rsp ffff80005175f850
  Stopped in pid 0.58 (system) at netbsd:breakpoint+0x5:  leave
  db{1}> trace
  breakpoint() at netbsd:breakpoint+0x5
  panic() at netbsd:panic+0x24d
  ffs_blkfree() at netbsd:ffs_blkfree+0x6d7
  ffs_wapbl_sync_metadata() at netbsd:ffs_wapbl_sync_metadata+0x66
  wapbl_flush() at netbsd:wapbl_flush+0x7c
  ffs_sync() at netbsd:ffs_sync+0x36c
  VFS_SYNC() at netbsd:VFS_SYNC+0x33
  sync_fsync() at netbsd:sync_fsync+0x85
  VOP_FSYNC() at netbsd:VOP_FSYNC+0x71
  sched_sync() at netbsd:sched_sync+0x15d
 
  db{1}> ps
  PID    LID S CPU     FLAGS       STRUCT LWP *               NAME WAIT
  16157>   1 7   3         4   ffff8000520b5020              touch
  1049     1 3   3        84   ffff8000524b2000                 sh wait
  911      1 3   0        84   ffff800052682800                ksh ttyraw
  403      1 3   3        84   ffff8000524b23e0                ksh pause
  375      1 3   3        84   ffff8000524be7e0                 su wait
  300      1 3   3        84   ffff800052682be0                ksh pause
  405      1 3   0        84   ffff8000524be020               sshd select
  398      1 3   0        84   ffff80004ca9b000               sshd netio
  393      1 3   0        84   ffff80004ca9b3e0              login wait
  383      1 3   0        84   ffff8000524bebc0               cron nanoslp
  380      1 3   3        84   ffff8000524be400              inetd kqueue
  379      1 3   2        84   ffff8000524b27c0               qmgr kqueue
  388      1 3   0        84   ffff8000520e7800             pickup kqueue
  365      1 3   0        84   ffff8000520b57e0             master kqueue
  263      1 3   0        84   ffff8000520e7420               sshd select
  126      1 3   0        84   ffff8000520b5bc0            syslogd kqueue
  1        1 3   0        84   ffff80004ca8a420               init wait
  0       60 3   0       204   ffff8000520b5400            physiod physiod
               59 3   1       204   ffff80004ca9b7c0           aiodoned aiodoned
            >  58 7   1       204   ffff80004ca9bba0            ioflush
               57 3   1       204   ffff80004ca857c0           pgdaemon pgdaemon
                56 3   3       204   ffff80004ca84800          cryptoret crypto_wa
 
  db{1}> trace/t 0x3f1d
  trace: pid 16157 lid 1 at 0xffff8000520d2b50
  0:
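 
 The value passed to trace/t is the pid of the touch process from the ps
 listing, written in hexadecimal; the "trace: pid 16157" line confirms
 how it was read. A quick way to check the conversion from a shell, as a
 small sketch (the variable names are only illustrative):

```shell
#!/bin/sh
# Convert the hexadecimal value given to trace/t into a decimal pid,
# and convert it back again as a cross-check.
pid_hex=0x3f1d
pid_dec=$(printf '%d' "$pid_hex")
back=$(printf '0x%x' "$pid_dec")
echo "$pid_hex = $pid_dec (round trip: $back)"
```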
 
 -- 
 Regards, Bartosz Kuźma.
 

