tech-kern archive


Re: rsync very slow with current kernel (select issue?)



On Wed, Jul 27, 2011 at 06:32:33PM +0200, Manuel Bouyer wrote:
> Hello,
> I'm testing a current amd64 kernel:
> NetBSD borneo 5.99.55 NetBSD 5.99.55 (GENERIC) #0: Tue Jul 26 23:38:21 UTC 2011 builds%b7.netbsd.org@localhost:/home/builds/ab/HEAD/amd64/201107262140Z-obj/home/builds/ab/HEAD/src/sys/arch/amd64/compile/GENERIC amd64
> 
> with a 5.0_STABLE userland, and I noticed that an rsync client is really slow
> (rsync -avH --delete --delete-excluded --delete-after --delay-updates --force 
> --stats --partial rsync://rsync.fr.netbsd.org/NetBSD/NetBSD-release-4-0/src .
> or
> rsync -avH --delete --delete-excluded --delete-after --delay-updates --force 
> --stats --partial rsync://rsync.fr.netbsd.org/NetBSD/NetBSD-release-4-0/src).
> 
> Some investigation makes me suspect that select(2) is not working
> properly: in particular, it doesn't wake up when there's data ready in the
> socket buffer. When the rsync process is idle it is waiting in select,
> and at that point netstat shows that the receive socket queue is full (I
> tried with net.inet.tcp.recvbuf_auto set to both 1 and 0), and
> ktrace shows:
>    4102      1 rsync    1311780519.324063908 CALL  select(4,0x7f7fffff83b0,0x7f7fffff8390,0,0x7f7fffff83d0)
>    4102      1 rsync    1311780579.483436279 RET   select 0
>    4102      1 rsync    1311780579.483440327 CALL  select(4,0x7f7fffff83b0,0x7f7fffff8390,0,0x7f7fffff83d0)
>    4102      1 rsync    1311780579.483442445 RET   select 1
>    4102      1 rsync    1311780579.483443326 CALL  read(3,0x7f7ff7a36de2,0x21a)
>    4102      1 rsync    1311780579.483451341 GIO   fd 3 read 538 bytes
> 
> So select blocks (maybe because there really is nothing to read at that
> time), but instead of waking up when data becomes ready it wakes up only
> when the timeout expires (about 60 seconds later in the trace above).
> The next select call returns immediately.
> Does this ring a bell for anyone? Has there been any recent change in
> this area (either in select(2) or in TCP)?

Disabling DIRECT_SELECT (with #define NO_DIRECT_SELECT in sys_select.c)
"fixes" the problem. I opened PR kern/45187 for this.
I don't know what's wrong with DIRECT_SELECT at this time, or even whether
disabling it is just a timing change that makes select behave as expected.

-- 
Manuel Bouyer <bouyer%antioche.eu.org@localhost>
     NetBSD: 26 years of experience will always make the difference