Subject: Re: Very slow pipe/TCP connection in 1.6_BETA4
To: Kevin Lahey <kml@mobiquitous.net>
From: Andreas Wrede <andreas@planix.com>
List: current-users
Date: 07/09/2002 17:03:04
On Tue, 9 Jul 2002, Kevin Lahey wrote:

> On Mon, 8 Jul 2002 11:02:41 -0400 (EDT)
> Andreas Wrede <andreas@planix.com> wrote:
>
> > I am running an Amanda amrecover for a single file on an i386/1.6_BETA4
> > system and the throughput through /dev/nrst0 >> amrestore >pipe> gzip
> > -d >TCP> /sbin/restore manages less than 10Kbytes/sec.
> >
> > A tcpdump shows that the ACK is sent 0.2 seconds after the packet it
> > acknowledges:
> >
> > 10:08:06.805509 localhost.amidxtape > localhost.748: P 81920:98304(16384) ack 1 win 16384 <nop,nop,timestamp 99618 1>
> > 10:08:07.005142 localhost.748 > localhost.amidxtape: . ack 98304 win 65535 <nop,nop,timestamp 99619 99618>
> > 10:08:12.005763 localhost.amidxtape > localhost.748: P 98304:114688(16384) ack
> > [...]
> > Does anyone have an idea what is holding things up here?
>
> Yup, this is a problem with delayed ACKs.  TCP tries to avoid sending
> an ACK for every packet, so it waits 0.2 seconds to see if another packet
> will be coming in.  Usually this is a win, 'cause lots of packets are
> flowing.  In this case, the problem is that the window size is (probably)
> only 16KB, and after the one packet is sent, nothing more can be done
> until the packet is ACKed, freeing up the window.
>
> You could probably work around this by setting the window size to
> greater than 16KB -- I'd go for 128KB at least:
>
> 	# sysctl -w net.inet.tcp.sendspace=131072
> 	# sysctl -w net.inet.tcp.recvspace=131072

I tried increasing the window size - no change. I still see the 0.2
second delay of the ack.
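
For reference, the delay can be read straight off the tcpdump timestamps
quoted above. A minimal Python sketch (the helper name seconds_between is
purely illustrative, and the third timestamp is taken at face value from
the excerpt):

```python
from datetime import datetime

# Timestamps copied from the tcpdump excerpt quoted above.
DATA_SENT = "10:08:06.805509"   # 16384-byte segment 81920:98304
ACK_SENT = "10:08:07.005142"    # ack 98304
NEXT_DATA = "10:08:12.005763"   # next segment 98304:114688

def seconds_between(start, end):
    """Elapsed seconds between two HH:MM:SS.ffffff stamps on the same day."""
    fmt = "%H:%M:%S.%f"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return delta.total_seconds()

ack_delay = seconds_between(DATA_SENT, ACK_SENT)
segment_gap = seconds_between(DATA_SENT, NEXT_DATA)

print(f"ACK delay: {ack_delay:.6f} s")
print(f"gap between data segments: {segment_gap:.6f} s")
print(f"one 16384-byte segment per gap: {16384 / segment_gap:.0f} bytes/s")
```

On these numbers the ACK goes out 0.199633 s after the data, and the next
data segment follows 5.200254 s after the first, which works out to
roughly 3 Kbytes/sec and is in line with the "less than 10Kbytes/sec"
seen during the restore.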
>
> I have to admit I was a little curious about why you were doing TCP
> between processes on the same host -- you probably wouldn't have had
> this problem if you were sending to another host over the net.
> (I don't know anything about amrestore (or restore, for that matter)
> so I apologize if I'm missing something obvious...)

Amanda is a distributed backup system. For a restore, the client, the
index server and the tape server can all be different machines. It
just so happens that in this case, the machine I am restoring to also
has the tape drive.

> Another way to fix it would be to come up with a way to ensure that
> gzip writes in blocks of about 8KB.

The TCP connection is written to by gzip, which according to ktrace
does write in 32 KB blocks. However, ktrace shows each 32 KB write
broken up into a number of 4088-byte 'GIO' records, and I am not sure
what that means:

  6380 gzip     0.198951 GIO   fd 1 wrote 4088 bytes
  6380 gzip     0.000000 GIO   fd 1 wrote 4088 bytes
  6380 gzip     0.000000 GIO   fd 1 wrote 4088 bytes
  6380 gzip     0.000000 GIO   fd 1 wrote 4088 bytes
  6380 gzip     0.000000 GIO   fd 1 wrote 4088 bytes
  6380 gzip     0.000000 GIO   fd 1 wrote 4088 bytes
  6380 gzip     0.000000 GIO   fd 1 wrote 4088 bytes
  6380 gzip     0.000000 GIO   fd 1 wrote 4088 bytes
  6380 gzip     0.000000 GIO   fd 1 wrote 64 bytes
  6380 gzip     0.001002 RET   write 32768/0x8000
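
One thing the trace does show: the GIO sizes add up to exactly what
write() returned. The Python sketch below just does that arithmetic on
the lines above (the summarize helper is hypothetical). Reading the GIO
records as ktrace's chunked dump of the data from a single write, rather
than as separate writes, is an assumption on my part.

```python
import re

# ktrace/kdump lines copied from the trace above.
KTRACE = """\
  6380 gzip     0.198951 GIO   fd 1 wrote 4088 bytes
  6380 gzip     0.000000 GIO   fd 1 wrote 4088 bytes
  6380 gzip     0.000000 GIO   fd 1 wrote 4088 bytes
  6380 gzip     0.000000 GIO   fd 1 wrote 4088 bytes
  6380 gzip     0.000000 GIO   fd 1 wrote 4088 bytes
  6380 gzip     0.000000 GIO   fd 1 wrote 4088 bytes
  6380 gzip     0.000000 GIO   fd 1 wrote 4088 bytes
  6380 gzip     0.000000 GIO   fd 1 wrote 4088 bytes
  6380 gzip     0.000000 GIO   fd 1 wrote 64 bytes
  6380 gzip     0.001002 RET   write 32768/0x8000
"""

def summarize(trace):
    """Return (list of GIO byte counts, byte count returned by write)."""
    gio = [int(n) for n in re.findall(r"GIO\s+fd \d+ wrote (\d+) bytes", trace)]
    ret = int(re.search(r"RET\s+write (\d+)/", trace).group(1))
    return gio, ret

gio_sizes, returned = summarize(KTRACE)
print(len(gio_sizes), "GIO records,", sum(gio_sizes), "bytes; write returned", returned)
```

Eight records of 4088 bytes plus one of 64 come to 32768, so the nine GIO
records account for one 32768-byte write and nothing more.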


> Hope this helps, and I'm sorry I didn't answer in time to save you
> the trouble of redoing the job,

No problem. I would like to resolve this - the next time I might not
have a big enough partition to dump the tape to...

> Kevin
> kml@mobiquitous.net

-- 
    - aew