Re: rsync very slow with current kernel (select issue?)

On 07/27/11 18:32, Manuel Bouyer wrote:
> Hello,
> I'm testing a current amd64 kernel:
> NetBSD borneo 5.99.55 NetBSD 5.99.55 (GENERIC) #0: Tue Jul 26 23:38:21 UTC 2011 amd64
> with a 5.0_STABLE userland, and I noticed a rsync client is really slow
> (rsync -avH --delete --delete-excluded --delete-after --delay-updates --force 
> --stats --partial rsync:// .
> or
> rsync -avH --delete --delete-excluded --delete-after --delay-updates --force 
> --stats --partial rsync://
> Some investigation makes me suspect that select(2) is not working
> properly; in particular, it doesn't wake up when there's data ready in the
> socket buffer. While the rsync process is idle it is waiting in select, and
> at the same time netstat shows that the receive socket queue is full (I
> tried with net.inet.tcp.recvbuf_auto set to both 1 and 0), and
> ktrace shows:
>    4102      1 rsync    1311780519.324063908 CALL  select(4,0x7f7fffff83b0,0x7f7fffff8390,0,0x7f7fffff83d0)
>    4102      1 rsync    1311780579.483436279 RET   select 0
>    4102      1 rsync    1311780579.483440327 CALL  select(4,0x7f7fffff83b0,0x7f7fffff8390,0,0x7f7fffff83d0)
>    4102      1 rsync    1311780579.483442445 RET   select 1
>    4102      1 rsync    1311780579.483443326 CALL  read(3,0x7f7ff7a36de2,0x21a)
>    4102      1 rsync    1311780579.483451341 GIO   fd 3 read 538 bytes
> So select blocks (maybe because there's effectively nothing to read at this
> time), but instead of waking up when data becomes ready it wakes up only
> when the timeout expires. The next select call returns immediately.
> Does this ring a bell for anyone? Any recent change in this area
> (either in select(2) or in TCP)?
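In the trace above the first select returns 0 roughly 60 seconds after it was called, and the second one returns 1 at once. A quick way to look for the same pattern outside rsync is a small probe like the one below. It is only a sketch under stated assumptions: it uses a local socket pair rather than a TCP connection, so it exercises the select wakeup path but not the TCP receive path, and the function name and the delay/timeout values are made up for illustration.

```python
import select
import socket
import threading
import time


def measure_select_wakeup(delay=0.2, timeout=5.0):
    """Return (seconds select() blocked, number of readable sockets).

    A helper thread writes one byte after `delay` seconds. On a healthy
    system select() should return shortly after that write, well before
    `timeout` expires.
    """
    reader, writer = socket.socketpair()
    try:
        # Hypothetical test traffic: a single byte sent from another thread.
        timer = threading.Timer(delay, writer.send, args=(b"x",))
        timer.start()

        start = time.monotonic()
        ready, _, _ = select.select([reader], [], [], timeout)
        elapsed = time.monotonic() - start

        timer.join()
        return elapsed, len(ready)
    finally:
        reader.close()
        writer.close()


if __name__ == "__main__":
    elapsed, nready = measure_select_wakeup()
    print(f"select returned {nready} after {elapsed:.3f}s")
```

If select only wakes up at the timeout, the elapsed time will be close to `timeout` and the ready count will be 0, which matches the first select in the trace.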


Does this happen only on SMP and not on UP?
I have noticed poor X11 responsiveness when the Xorg server is not
bound to cpu0, where device interrupts occur.
This is due to different resched behaviour depending on whether the
rescheduling happens local to the interrupted CPU or across CPUs.
In the local case we get a resched on return to user space; in the
cross-CPU case this only happens if the woken-up thread has a priority
higher than sched_upreempt_pri (normally not the case).
The result is a delay until the next interrupt on those CPUs.
I'm currently testing a fix; with it, X11 is responsive regardless of the chosen CPU.
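
The local vs. cross-CPU distinction described above can be written down as a tiny model. This is not kernel code, only an illustration of the rule as stated in this mail: the function name, the CPU numbers and the priority values are hypothetical, and whether the real comparison is strict is an assumption.

```python
def resched_is_immediate(waking_cpu, target_cpu, thread_pri, upreempt_pri):
    """Model of the behaviour described above (illustrative only).

    Returns True if the woken-up thread gets rescheduled promptly,
    False if it has to wait for the next interrupt on its CPU.
    """
    if waking_cpu == target_cpu:
        # Local case: resched happens on return to user space.
        return True
    # Cross-CPU case: only threads above sched_upreempt_pri trigger it.
    # (Strict ">" is an assumption taken from the wording "higher than".)
    return thread_pri > upreempt_pri


if __name__ == "__main__":
    UPREEMPT = 64  # hypothetical threshold value
    print(resched_is_immediate(0, 0, 10, UPREEMPT))   # local wakeup
    print(resched_is_immediate(0, 1, 10, UPREEMPT))   # cross-CPU, low priority
    print(resched_is_immediate(0, 1, 100, UPREEMPT))  # cross-CPU, high priority
```

In this model an ordinary user thread woken from an interrupt on cpu0 while it runs on another CPU falls into the second case, which is the delayed one.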


