Subject: kern/8195: amanda hangs in 'D' state
To: None <gnats-bugs@gnats.netbsd.org>
From: Manuel Bouyer <bouyer@antioche.lip6.fr>
List: netbsd-bugs
Date: 08/12/1999 06:50:51
>Number:         8195
>Category:       kern
>Synopsis:       amanda hangs in 'D' state
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    kern-bug-people (Kernel Bug People)
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Aug 12 06:50:00 1999
>Last-Modified:
>Originator:     bouyer@antioche.lip6.fr (Manuel Bouyer)
>Organization:

LIP6, Universite Paris VI

>Release:        NetBSD 1.4.1 'final'
>Environment:
	
1.4.1 (ROBOT) #1: Tue Aug 10 18:15:53 MEST 1999 
bouyer@vlaminck:/usr/src/sys/arch/i386/compile/ROBOT

>Description:
	
	The amanda backup system works the following way:
	several 'dumper' processes collect dumps from clients in parallel.
	These dumps are written to disk (in /var/amanda/place in my setup).
	Once a dump is done, the taper process transfers it from disk to
	the tape drive.
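
	A minimal sketch of this I/O pattern, for anyone who wants a small
	stand-in workload without a full amanda setup. It is not amanda code
	and is not confirmed to trigger the hang; the names dumper(), taper()
	and run(), the file names and the sizes are all hypothetical, and an
	ordinary file stands in for the tape drive.

```python
import os
import queue
import threading


def dumper(holding_dir, name, size, done):
    """Write one simulated dump image to the holding disk, then announce it."""
    path = os.path.join(holding_dir, name)
    with open(path, "wb") as f:
        f.write(b"\0" * size)
    done.put(path)


def taper(tape_path, done, expected):
    """Append each finished dump to the 'tape' file and remove it from disk."""
    written = 0
    with open(tape_path, "ab") as tape:
        for _ in range(expected):
            path = done.get()          # blocks until some dumper has finished
            with open(path, "rb") as f:
                data = f.read()
            tape.write(data)
            written += len(data)
            os.remove(path)
    return written


def run(holding_dir, tape_path, n_dumps=4, size=1 << 16):
    """Start n_dumps parallel dumpers and drain them with a single taper."""
    done = queue.Queue()
    workers = [
        threading.Thread(target=dumper,
                         args=(holding_dir, "dump.%d" % i, size, done))
        for i in range(n_dumps)
    ]
    for w in workers:
        w.start()
    total = taper(tape_path, done, n_dumps)
    for w in workers:
        w.join()
    return total
```

	run() returns the number of bytes moved to the 'tape' file; the
	holding directory should be empty afterwards.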

	The problem I have is that since I upgraded my amanda server from 1.3.3
	to 1.4.1, after several dumps have been written to disk and then
	transferred to tape, the taper process hangs in 'D' state:
   98  1316  1315   1  -5  0   168 1172 getblk D    p0    0:16.33 taper 
	
	Any attempt to access the directory where dumps are written
	(a subdir of /var/amanda/place) hangs in 'D' state as well:
    0  1533  1531   0  -2  0   524  892 vnlock D+   p2    0:00.09 -csh (tcsh)
    0  1437   235   0  -2  0   216  128 vnlock D+   p0    0:00.00 ls -l 
	(these are 'ls -l' in /var/amanda/place/19990812, and a
	'ls /var/amanda/place/1<tab>' in tcsh).
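
	A small sketch for picking the blocked processes and their wait
	channels out of listings like the ones above. It assumes the column
	order of 'ps -l' output (UID PID PPID CPU PRI NI VSZ RSS WCHAN STAT TT
	TIME COMMAND); the function name blocked_processes() is hypothetical.

```python
def blocked_processes(ps_lines):
    """Return (pid, wchan, command) for every process whose STAT starts with 'D'."""
    blocked = []
    for line in ps_lines:
        # Split into 12 fixed columns plus the command, which may contain spaces.
        fields = line.split(None, 12)
        if len(fields) < 13 or not fields[1].isdigit():
            continue  # header line or malformed input
        if fields[9].startswith("D"):
            blocked.append((int(fields[1]), fields[8], fields[12].strip()))
    return blocked
```

	Applied to the three lines quoted in this report, it yields pid 1316
	waiting on getblk and pids 1533 and 1437 waiting on vnlock.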

	The problem also occurs with 1.4 and -current kernels.
	2 core dumps (one of 1.4.1, one of 1.4) of a 'reboot -d -q'
	once the system is in this state are available from
	ftp://antioche.lip6.fr/pub/tmp/bouyer/1.4

	For now I'm going to revert to 1.3.3, as I'm leaving for
	a week and I need backups to be done.
	I can try patches (preferably for 1.4.1) when I'm back.

>How-To-Repeat:
	Run amanda with a holding disk on a 1.4, 1.4.1 or -current system.
>Fix:
	Unknown, unfortunately.
>Audit-Trail:
>Unformatted: