Subject: bin/27179: dump(8) goes into loop, never finishing dump
To: gnats-bugs@gnats.NetBSD.org
From: mjl@netbsd.org
List: netbsd-bugs
Date: 10/07/2004 10:25:42
>Number:         27179
>Category:       bin
>Synopsis:       dump(8) goes into loop, never finishing dump
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    bin-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Oct 07 08:26:00 UTC 2004
>Closed-Date:
>Last-Modified:
>Originator:     Martin J. Laubach
>Release:        NetBSD 2.0_RC2
>Organization:
>Environment:
System: NetBSD asparagus.emsi.priv.at 2.0_RC2 NetBSD 2.0_RC2 (ASPARAGUS) #0: Sat Oct 2 00:43:42 CEST 2004 mjl@asparagus.emsi.priv.at:/storage/netbsd/cvs/src/sys20/arch/i386/compile/ASPARAGUS i386
Architecture: i386
Machine: i386
>Description:

  It looks as if dump gets into some kind of loop and never
finishes (that disk is usually dumped in about an hour;
I interrupted the dump after 10 hours).

|   DUMP: Found /dev/rld0h on /home in /etc/fstab
|   DUMP: Date of this level 0 dump: Wed Oct  6 03:49:10 2004
|   DUMP: Date of last level 0 dump: the epoch
|   DUMP: Dumping /dev/rld0h (/home) to standard output
|   DUMP: Label: none
|   DUMP: mapping (Pass I) [regular files]
|   DUMP: mapping (Pass II) [directories]
|   DUMP: estimated 14143153 tape blocks.
|   DUMP: Volume 1 started at: Wed Oct  6 03:49:31 2004
|   DUMP: dumping (Pass III) [directories]
|   DUMP: dumping (Pass IV) [regular files]
|   DUMP: 3.68% done, finished in 2:11
|   DUMP: 8.65% done, finished in 1:45
|   DUMP: 12.81% done, finished in 1:42
|   DUMP: 17.47% done, finished in 1:34
|   DUMP: 22.47% done, finished in 1:26
|   DUMP: 27.63% done, finished in 1:18
|   DUMP: 32.90% done, finished in 1:11
|   DUMP: 37.99% done, finished in 1:05
|   DUMP: 43.33% done, finished in 0:58
|   DUMP: 48.59% done, finished in 0:52
|   DUMP: 53.98% done, finished in 0:46
|   DUMP: 59.30% done, finished in 0:41
|   DUMP: 64.35% done, finished in 0:36
|   DUMP: 69.60% done, finished in 0:30
|   DUMP: 75.26% done, finished in 0:24
|   DUMP: 80.26% done, finished in 0:19
|   DUMP: 81.72% done, finished in 0:19
|   DUMP: 81.96% done, finished in 0:19
|   DUMP: 81.96% done, finished in 0:19
|   DUMP: 82.01% done, finished in 0:20
|   DUMP: 82.06% done, finished in 0:21
| ...
|   DUMP: 90.78% done, finished in 1:28
|   DUMP: 90.83% done, finished in 1:28
|   DUMP: 90.87% done, finished in 1:28
|   DUMP: 90.92% done, finished in 1:28
|   DUMP: 90.97% done, finished in 1:28
|   DUMP: 91.01% done, finished in 1:28
|   DUMP: 91.06% done, finished in 1:28
|   DUMP: 91.10% done, finished in 1:28
|   DUMP: 91.15% done, finished in 1:28
|   DUMP: 91.19% done, finished in 1:28
|   DUMP: 91.27% done, finished in 1:28
|   DUMP: 91.31% done, finished in 1:28
|   DUMP: 91.36% done, finished in 1:28
|   DUMP: 91.40% done, finished in 1:28
|   DUMP: 91.45% done, finished in 1:28
|   DUMP: 91.49% done, finished in 1:28
|   DUMP: 91.54% done, finished in 1:28
|   DUMP: 91.58% done, finished in 1:28
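
  To locate the point where the dump starts to crawl in a long log,
the progress lines can be scanned mechanically. The following is only
a sketch, not part of dump or amanda: the names parse_progress and
first_slowdown are made up for illustration, and it assumes the
"finished in H:MM" field is hours and minutes, as the log above
suggests. It reports the first progress line whose estimate is larger
than the one before it.

```python
import re

# Matches progress lines such as "DUMP: 81.96% done, finished in 0:19".
# Assumption: the trailing field is hours:minutes.
PROGRESS_RE = re.compile(
    r"DUMP:\s+(\d+(?:\.\d+)?)% done, finished in (\d+):(\d+)")


def parse_progress(lines):
    """Return a list of (percent_done, eta_minutes) tuples."""
    samples = []
    for line in lines:
        match = PROGRESS_RE.search(line)
        if match:
            percent = float(match.group(1))
            eta = int(match.group(2)) * 60 + int(match.group(3))
            samples.append((percent, eta))
    return samples


def first_slowdown(samples):
    """Index of the first sample whose ETA exceeds the previous one."""
    for i in range(1, len(samples)):
        if samples[i][1] > samples[i - 1][1]:
            return i
    return None


# Excerpt copied from the log in this report.
log = """\
|   DUMP: 80.26% done, finished in 0:19
|   DUMP: 81.72% done, finished in 0:19
|   DUMP: 81.96% done, finished in 0:19
|   DUMP: 81.96% done, finished in 0:19
|   DUMP: 82.01% done, finished in 0:20
|   DUMP: 82.06% done, finished in 0:21
""".splitlines()

samples = parse_progress(log)
index = first_slowdown(samples)
if index is not None:
    print("ETA first grows at %.2f%% done" % samples[index][0])
```

  On the excerpt above this points at the 82.01% line, which is where
the estimate stops shrinking and begins to climb.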

  This seems to happen only in dumps initiated by amanda;
manual dumps work fine. It also seems directly related
to the size of the file system (or perhaps the time it
takes to dump): it never happens on small file systems, but is
pretty consistent on large ones like the one above.

  This problem was present in 1.6 too, but something in
my setup made it disappear at some point. Now it seems
to be back.

  Several people have commented that they have experienced
the same problem:

---
From: Luke Mewburn <lukem@NetBSD.org>
Subject: Re: dump(8) behaviour

[..]

I see it in 2.0G, from amanda dumps.
Manual dumps work fine.

amanda did work for a while in 2.0, but something changed and I haven't
been able to track down what it is.  It's not a PIPE_SOCKETPAIR issue.

---
From: Hauke Fath <hauke@Espresso.Rhein-Neckar.DE>

[..]

Yep. I used to see this pretty often in amanda backup logs on 1.6.x and
1.6A..Z. It has become less frequent with 2.0, but still occurs
occasionally.

        hauke
---
From: Tom Ivar Helbekkmo <tih@eunetnorge.no>

[..]

That's my experience, too.  Furthermore, it never happens on small
file systems; the chance of dump hanging increases with fs size.

-tih

>How-To-Repeat:

  Install amanda and try to dump a large file system?

>Fix:
>Release-Note:
>Audit-Trail:
>Unformatted: