kern/43199: read(2) returns bad size in multithreaded programs
>Synopsis: read(2) returns bad size in multithreaded programs
>Arrival-Date: Fri Apr 23 14:55:00 +0000 2010
>Originator: W. Stukenbrock
>Release: NetBSD 4.0
Dr. Nagler & Company GmbH
System: NetBSD s013 4.0 NetBSD 4.0 (NSW-S013) #26: Thu Apr 1 15:06:23 CEST 2010
We have some strange crashes (process crashes and NetBSD hangs) while
running a parallel backup.
A trace of the storage daemon shows the following (excerpt only):
26576 5 bacula-sd 1272016811.942872799 read(0xe, 0x6a57fc, 0x487)
26576 4 bacula-sd 1272016811.944446138 read(0xe, 0x7f7ff93ff914, 0x4) = 4
26576 4 bacula-sd 1272016811.944594476 read(0xe, 0x69d040, 0x7ae0)
26576 4 bacula-sd 1272016811.945277784 read(0xe, 0x69e255, 0x68cb)
26576 1 bacula-sd 1272016811.946073113 read(0xe, 0x69f34d, 0x57d3)
26576 5 bacula-sd 1272016811.946129543 read(0xe, 0x69fe9d, 0x4c83)
26576 1 bacula-sd 1272016811.947484145 read(0xe, 0x6a1ae5, 0x303b)
26576 5 bacula-sd 1272016811.948707170 read(0xe, 0x6a372d, 0x13f3)
26576 4 bacula-sd 1272016811.949453053 read(0xe, 0x6a4825, 0x2fb)
26576 4 bacula-sd 1272016811.952093536 read(0xe, 0x7f7ff93ff914, 0x4) = 4
26576 1 bacula-sd 1272016811.952118119 read(0xe, 0x69d040, 0x7b3f)
26576 1 bacula-sd 1272016811.953248398 read(0xe, 0x69e6dc, 0x64a3)
26576 4 bacula-sd 1272016811.954791566 read(0xe, 0x69f7d4, 0x53ab)
The problem shows in the call "read(0xe, 0x6a372d, 0x13f3)":
5107 bytes (0x13f3) were requested, but NetBSD returned 8688 bytes!
Remark: at this location the bad return is not detected by the program;
at a different location it detects it and calls exit(2).
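
The kind of check described in the remark can be sketched as follows.
This is only a minimal illustration, not bacula's code: the names
validate_read_result and checked_read are hypothetical, and mapping the
bad size to EIO is an assumption about how a caller might want to
handle it.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: reject a read(2) result that is larger than the
 * requested count.  Returns n unchanged when it is plausible, otherwise
 * -1 with errno set to EIO (an assumed choice of error code).
 */
ssize_t
validate_read_result(ssize_t n, size_t count)
{
	if (n > 0 && (size_t)n > count) {
		errno = EIO;
		return -1;
	}
	return n;
}

/* Hypothetical wrapper: read(2) followed by the plausibility check. */
ssize_t
checked_read(int fd, void *buf, size_t count)
{
	return validate_read_result(read(fd, buf, count), count);
}
```

With the values from the trace, validate_read_result(8688, 0x13f3)
rejects the result, because 8688 is larger than the 5107 bytes asked
for. A caller that uses checked_read() at every read site would notice
the bad size at the first occurrence instead of at a later location.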
The problem seems to happen if multiple threads are reading from the
same fd at the same time. This does not make much sense of course, but
if an application does such a thing, the kernel must not run wild.
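
A possible starting point for a smaller test case is sketched below. It
is untested against NetBSD 4.0 and rests on assumptions: it uses a pipe
(the type of fd 0xe in the trace is not known), the thread count of 4
is arbitrary, and reader_main and count_oversized_reads are
hypothetical names. Only the request size 0x13f3 is taken from the
trace. It may well not trigger the problem at all.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define NREADERS 4        /* assumed thread count, not taken from bacula */
#define REQUEST  0x13f3   /* 5107 bytes, the request size in the trace */

struct reader_state {
	int  fd;          /* descriptor shared by all readers */
	long oversized;   /* reads that returned more than requested */
};

/* Each reader drains the shared fd and counts impossible return values. */
void *
reader_main(void *arg)
{
	struct reader_state *st = arg;
	char buf[REQUEST];
	ssize_t n;

	while ((n = read(st->fd, buf, sizeof(buf))) > 0) {
		if ((size_t)n > sizeof(buf))
			st->oversized++;
	}
	return NULL;
}

/*
 * Push nbytes through a pipe while NREADERS threads read from it.
 * Returns the number of oversized reads seen, or -1 on setup failure.
 */
long
count_oversized_reads(size_t nbytes)
{
	struct reader_state st[NREADERS];
	pthread_t tid[NREADERS];
	char chunk[4096];
	long bad = 0;
	int p[2], started = 0, failed = 0, i;

	if (pipe(p) != 0)
		return -1;
	for (i = 0; i < NREADERS; i++) {
		st[i].fd = p[0];
		st[i].oversized = 0;
		if (pthread_create(&tid[i], NULL, reader_main, &st[i]) != 0) {
			failed = 1;
			break;
		}
		started++;
	}
	memset(chunk, 'x', sizeof(chunk));
	while (!failed && nbytes > 0) {
		size_t want = nbytes < sizeof(chunk) ? nbytes : sizeof(chunk);
		ssize_t w = write(p[1], chunk, want);

		if (w <= 0) {
			failed = 1;
			break;
		}
		nbytes -= (size_t)w;
	}
	close(p[1]);  /* end of file lets every reader leave its loop */
	for (i = 0; i < started; i++) {
		pthread_join(tid[i], NULL);
		bad += st[i].oversized;
	}
	close(p[0]);
	return failed ? -1 : bad;
}
```

On a kernel that honours the read(2) contract,
count_oversized_reads(1 << 20) should return 0; any positive value
would mean a read returned more than the 5107 bytes requested.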
I have no possibility to check if this problem is still in 5.x - sorry.
Currently the only process known to us that triggers the problem is the
bacula storage daemon.
The behaviour of this process looks like misbehaviour to me, but
nevertheless the kernel must never return more bytes than requested in
the read(2) call!
not known up to now - still no time to go deeper into this