Port-arm archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: Very slow transfers to/from micro SD card on a RPi B+



There is another issue with the Pi which I recognized when /etc/daily
runs. The "find / -<blah blah>" jobs causes certain threads to hang
for a long time. That´s easily reproducable by issuing "find / >
/dev/null &". In the test I just made, I issued "ps" while the job
ran.

rpi/root> time ps
 PID TTY   STAT    TIME COMMAND
1742 pts/0 Ss   0:00.18 -ksh
7045 pts/0 D    0:05.68 find /
7195 pts/0 O+   0:00.01 ps
4734 pts/1 Is   0:00.04 -ksh
6804 pts/1 S+   0:00.02 iostat 1
2105 ttyE0 Is+  0:00.01 /usr/libexec/getty default console
    0.01s real     0.00s user     0.01s system

rpi/root> time ps
 PID TTY   STAT    TIME COMMAND
1742 pts/0 Is   0:00.18 -ksh
7045 pts/0 D    0:05.68 find /
7213 pts/0 O+   0:00.01 ps
4734 pts/1 Is   0:00.04 -ksh
6804 pts/1 S+   0:00.02 iostat 1
2105 ttyE0 Is+  0:00.01 /usr/libexec/getty default console
   43.89s real     0.00s user     0.01s system

This is strange because everything required should be in the caches
already. In the bad case, iostat indicates that there are almost no
xfers left to the SD card.

   6  400 3.130 1279 3.910   1  0  8  2 89
   4  374 2.553 1389 3.463   0  0  8  0 91
   6  400 4.238 1139 4.712   1  0 12  1 86
   5  387 32.92  192 6.175   0  0  1  1 97   ====>>> Beginning of the issue
   7  413 24.85  164 3.989   0  0  1  0 99
   4  374 16.00   97 1.516   0  0  1  1 98
   8  128 16.00   28 0.433   0  0  0  0 100
   1  355 16.00    1 0.015   0  0  0  1 99
   7  394 15.81  224 3.456   0  0  1  0 99
   3  380 16.00    1 0.015   0  0  0  1 99
   6  400 16.00    1 0.015   0  0  0  0 100
   5  388 16.00    1 0.015   0  0  0  1 99
   8  426 16.00    2 0.031   0  0  0  0 99
   4  374 16.00    6 0.093   0  0  0  1 99
   6  102 16.00   21 0.325   0  0  0  1 99
   1   47 16.00    2 0.031   0  0  0  0 100
   0   45 16.00    1 0.015   0  0  0  1 99
   0   44 16.00    1 0.015   0  0  0  0 100
   0   45 16.00   60 0.944   0  0  0  1 99
   0   44 16.00    2 0.031   0  0  0  0 100
   0   45 16.00    1 0.015   0  0  0  1 99
   0   44 16.00   23 0.356   0  0  0  0 100
      tty              ld0             CPU
 tin tout  KB/t  t/s  MB/s  us ni sy in id
   0   45 16.00    1 0.015   0  0  0  1 99
   0  131 16.00    6 0.093   0  0  0  0 100
   0   45 16.00   12 0.186   0  0  0  1 99
   0   44 16.00    1 0.015   0  0  0  1 99
   0   44 16.00    2 0.031   0  0  0  0 100
   0   45 16.00    2 0.031   0  0  0  1 99
   0   44 16.00    2 0.031   0  0  0  0 100
   0   45 16.00    3 0.046   0  0  0  1 99
   0   44 16.00    7 0.108   0  0  0  0 100
   0   45 16.00    2 0.031   0  0  0  1 99
   0   44 16.00    2 0.031   0  0  0  0 100
   0   45 0.000    0 0.000   0  0  0  1 99
   0   44 16.00    7 0.108   0  0  0  1 99
   0   44 16.00    2 0.031   0  0  0  0 100
   0   45 0.000    0 0.000   0  0  0  1 99
   0   44 16.00    7 0.108   0  0  0  0 100
   0   45 16.00    8 0.124   0  0  0  1 99
   0   44 0.000    0 0.000   0  0  0  0 100
   0   45 16.00    1 0.015   0  0  0  1 99
   0   44 16.00    4 0.062   0  0  0  0 100
   0   45 0.000    0 0.000   0  0  0  1 99
   0   44 16.00    3 0.046   0  0  0  0 100
   0   45 16.00    2 0.031   0  0  0  1 99
   0   44 0.000    0 0.000   0  0  0  1 99
   0   44 16.00    5 0.077   0  0  0  0 100
   0   45 16.00    2 0.031   0  0  0  1 99
   0   44 16.00    1 0.015   0  0  0  0 100
   0   45 16.00   11 0.170   0  0  0  1 99
   0   44 16.00    2 0.031   0  0  0  0 100
   0   45 0.000    0 0.000   0  0  0  1 99
   0   44 16.00    4 0.062   0  0  0  0 100
   0   45 16.00    3 0.046   0  0  0  1 99
   0   44 0.000    0 0.000   0  0  0  1 99
   0   44 16.00   52 0.820   0  0  0  0 100
   0   45 13.20    5 0.064   0  0  0  1 99
   0   44 0.000    0 0.000   0  0  0  0 100
   0  352 12.25   54 0.651   2  0  6  2 90

I can´t enter ddb(4) just now because of a bug in the USB driver which
I haven´t patched yet. I can tell is that the "ps" process is in
bioloc, while find is in biowai.

ps´s stack looks like that from crash:

crash> t/a b9c0cc00
trace: pid 7618 lid 1 at 0xb855994c
0xb855994c: mi_switch+0xc
0xb8559974: sleepq_block+0xac
0xb855999c: cv_timedwait+0xa0
0xb85599c4: bbusy+0x94
0xb85599f4: getblk+0x54
0xb8559a24: bio_doread.isra.4+0x30
0xb8559a4c: bread+0x24
0xb8559acc: ffs_loadvnode+0x194
0xb8559b34: vcache_get+0x34c
0xb8559bec: ufs_lookup+0x7b4
0xb8559c14: VOP_LOOKUP+0x44
0xb8559c64: lookup_once+0x174
0xb8559d34: namei_tryemulroot+0x4d8
0xb8559d64: namei+0x50
0xb8559e3c: vn_open+0x94
0xb8559ebc: do_open+0xb0
0xb8559eec: do_sys_openat+0x7c
0xb8559f14: sys_open+0x34
0xb8559f84: syscall+0x88
0xb8559fac: swi_handler+0x8c
crash>

This is find´s stack:

crash> t/a ad270ee0
trace: pid 6247 lid 1 at 0xad86dc24
0xad86dc24: mi_switch+0xc
0xad86dc4c: sleepq_block+0xac
0xad86dc74: cv_wait+0x9c
0xad86dc8c: biowait+0x34
0xad86dcb4: bread+0x3c
0xad86dd64: ffs_read+0x398
0xad86dd94: VOP_READ+0x48
0xad86de24: ufs_readdir+0xe0
0xad86de64: VOP_READDIR+0x58
0xad86dedc: vn_readdir+0xcc
0xad86df14: sys___getdents30+0x80
0xad86df84: syscall+0x88
0xad86dfac: swi_handler+0x8c


Any ideas?

2016-01-04 16:59 GMT+01:00 Stephan <stephanwib%googlemail.com@localhost>:
> While I was already sure that changing the I/O scheduler from disksort
> to fcfs on the Pi increases sequential write speed from nearly 10MB/s
> to 15 MB/s on average, it turned out that´s not true.
>
> In fact, the writes speed varies for no recognizable reason. An
> example of a "good" case:
>
> rpi/root> dd if=/dev/zero of=myfile bs=4k count=10000
> 10000+0 records in
> 10000+0 records out
> 40960000 bytes transferred in 2.750 secs (14894545 bytes/sec)
>
>       tty              ld0             CPU
>  tin tout  KB/t  t/s  MB/s  us ni sy in id
>
>    5   49 18.97   18 0.330   0  0  0  0 100
>    4   49 0.000    0 0.000   0  0  0  1 99
>    2   46 0.000    0 0.000   0  0  0  0 100
>    1   48 55.83   12 0.648   0  0  2  1 97
>    0   44 63.55  212 13.15   0  0  8  0 92
>    0   44 64.00  194 12.13   0  0  6  1 93
>    0  156 63.54  206 12.78   0  0  5  0 95
>    0   44 0.000    0 0.000   0  0  0  0 100
>    0   45 0.000    0 0.000   0  0  0  1 99
>
> This is an example of a "bad" case:
>
> rpi/root> dd if=/dev/zero of=myfile bs=4k count=10000
> 10000+0 records in
> 10000+0 records out
> 40960000 bytes transferred in 4.718 secs (8681644 bytes/sec)
>
>   1   46 0.000    0 0.000   0  0  0  1 99
>    6  266 0.000    0 0.000   0  0  1  0 99
>    8  148 0.000    0 0.000   0  0  2  1 97
>    1   47 63.54   69 4.301   0  0  5  0 95
>    0   44 64.00  234 14.60   0  0  7  2 90
>    0   44 62.60  142 8.656   0  0  3  0 97
>    0   44 0.000    0 0.000   0  0  0  0 100
>    0  156 58.02  185 10.49   0  0  2 11 86
>    0   44 64.00   18 1.114   0  0  0  0 100
>    0   45 0.000    0 0.000   0  0  0  1 99
>    0   44 0.000    0 0.000   0  0  0  0 100
>
>
> There is even one second in the process without a single xfer. Can
> this be caused by the SD card itself, or may there be something
> improvable in the kernel?
>
>
>
> 2015-08-25 15:05 GMT+02:00 Stephan <stephanwib%googlemail.com@localhost>:
>> Nobody having an idea about DIRHASH? I´ve also seen that tmpfs has the
>> same regression.
>>
>>
>> I corrected 2 minor flaws in my tool, if it´s useful to somebody:
>>
>>
>> ------8<---------------------------------
>>
>> #include <fcntl.h>
>> #include <stdlib.h>
>> #include <stdio.h>
>> #include <sys/time.h>
>>
>> int main(int argc, char **argv) {
>>   struct timeval tv_start, tv_finish;
>>   unsigned long duration;
>>   int i;
>>   char filename[16];
>>
>>   if (argc < 2) { printf("Usage: %s <count>\n", argv[0]); exit(1); }
>>   int howmany = atoi(argv[1]);
>>
>>   for (i = 0; i < howmany; i++)
>>   {
>>     snprintf(filename, sizeof(filename), "%d", i);
>>     gettimeofday(&tv_start, NULL);
>>     int fd = open(filename, O_CREAT);
>>     gettimeofday(&tv_finish, NULL);
>>     close(fd);
>>     duration = (tv_finish.tv_sec * 1000000 + tv_finish.tv_usec) -
>> (tv_start.tv_sec * 1000000 + tv_start.tv_usec);
>>     printf("Duration: %lu\n", duration);
>>   }
>>
>>   return 0;
>> }
>>
>> 2015-08-20 18:16 GMT+02:00 Stephan <stephanwib%googlemail.com@localhost>:
>>> 2015-08-19 15:51 GMT+02:00 Thor Lancelot Simon <tls%panix.com@localhost>:
>>>> We should be shipping with DIRHASH turned on.
>>>
>>>
>>> That´s it - top speed!
>>>
>>> /root/test/files> time seq 1 30000|xargs touch
>>>     1.05s real     0.03s user     0.84s system
>>>
>>> Is DIRHASH still considered experimental? It fixes one of NetBSD´s
>>> most annoying shortcomings.
>>>
>>> FreeBSD received dynamic memory allocation for DIRHASH which might be
>>> even better.
>>>
>>>
>>>
>>> (re-sending this as my editor intermittently distributes HTML mails)
>>>
>>>
>>>>
>>>> On Tue, Aug 18, 2015 at 04:19:57PM +0000, Michael van Elst wrote:
>>>>> stephanwib%googlemail.com@localhost (Stephan) writes:
>>>>>
>>>>> >This is from an amd64 server box with an aac raid controller.
>>>>>
>>>>> >/root/test/files> time seq 1 10000|xargs touch
>>>>> >    3.10s real     0.01s user     3.07s system
>>>>> >/root/test/files> rm *
>>>>> >/root/test/files> time seq 1 20000|xargs touch
>>>>> >    9.88s real     0.01s user     8.51s system
>>>>> >/root/test/files> rm *
>>>>> >/root/test/files> time seq 1 30000|xargs touch
>>>>> >   23.45s real     0.04s user    20.41s system
>>>>> >/root/test/files> time seq 1 40000|xargs touch
>>>>> >   43.35s real     0.05s user    38.32s system
>>>>>
>>>>> >That is clearly not linear.
>>>>>
>>>>> Each lookup is linear (unless you use DIRHASH), so the total
>>>>> time goes with n^2.
>>>>>
>>>>> --
>>>>> --
>>>>>                                 Michael van Elst
>>>>> Internet: mlelstv%serpens.de@localhost
>>>>>                                 "A potential Snark may lurk in every tree."
>>>>
>>>> --
>>>>   Thor Lancelot Simon                                        tls%panix.com@localhost
>>>>
>>>>   "We cannot usually in social life pursue a single value or a single moral
>>>>    aim, untroubled by the need to compromise with others."      - H.L.A. Hart


Home | Main Index | Thread Index | Old Index