NetBSD-Bugs archive
kern/43199: read(2) returns bad size in multithreaded programs
>Number: 43199
>Category: kern
>Synopsis: read(2) returns bad size in multithreaded programs
>Confidential: no
>Severity: serious
>Priority: high
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Fri Apr 23 14:55:00 +0000 2010
>Originator: W. Stukenbrock
>Release: NetBSD 4.0
>Organization:
Dr. Nagler & Company GmbH
>Environment:
System: NetBSD s013 4.0 NetBSD 4.0 (NSW-S013) #26: Thu Apr 1 15:06:23 CEST 2010
segrassl@s012:/export/NetBSD-4.0/N+C-build/.OBJDIR_amd64/export/NetBSD-4.0/src/sys/arch/amd64/compile/NSW-S013
amd64
Architecture: x86_64
Machine: amd64
>Description:
We are seeing strange crashes (of the process, and hangs of NetBSD
itself) while running a parallel backup with bacula.
A trace of the storage daemon shows the following (this is only a
small extract):
26576 5 bacula-sd 1272016811.942872799 read(0xe, 0x6a57fc, 0x487) = 1159
26576 4 bacula-sd 1272016811.944446138 read(0xe, 0x7f7ff93ff914, 0x4) = 4
26576 4 bacula-sd 1272016811.944594476 read(0xe, 0x69d040, 0x7ae0) = 4629
26576 4 bacula-sd 1272016811.945277784 read(0xe, 0x69e255, 0x68cb) = 4344
26576 1 bacula-sd 1272016811.946073113 read(0xe, 0x69f34d, 0x57d3) = 4
26576 5 bacula-sd 1272016811.946129543 read(0xe, 0x69fe9d, 0x4c83) = 5788
26576 1 bacula-sd 1272016811.947484145 read(0xe, 0x6a1ae5, 0x303b) = 7240
26576 5 bacula-sd 1272016811.948707170 read(0xe, 0x6a372d, 0x13f3) = 8688
26576 4 bacula-sd 1272016811.949453053 read(0xe, 0x6a4825, 0x2fb) = 1448
26576 4 bacula-sd 1272016811.952093536 read(0xe, 0x7f7ff93ff914, 0x4) = 4
26576 1 bacula-sd 1272016811.952118119 read(0xe, 0x69d040, 0x7b3f) = 5788
26576 1 bacula-sd 1272016811.953248398 read(0xe, 0x69e6dc, 0x64a3) = 4344
26576 4 bacula-sd 1272016811.954791566 read(0xe, 0x69f7d4, 0x53ab) = 11584
The problem shows in the call "read(0xe, 0x6a372d, 0x13f3) = 8688":
5107 bytes (0x13f3) were requested, but 8688 bytes were returned by
NetBSD!
Remark: at this location the bad return value is not detected by the
program; at other locations it is detected and the program calls
exit(2).
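
Checking a long trace for this by hand is tedious, so the comparison
can be scripted. The following is only a sketch: it assumes the trace
uses the column layout shown in the extract (pid, a second column taken
to be the thread/LWP id, command, timestamp, then the call), and the
names READ_RE, find_overreads and SAMPLE are made up for illustration.
It reports every read(2) whose return value exceeds the requested
length.

```python
import re

# One read(2) record: pid, LWP id (assumed), command, timestamp, the call
# and its return value. \s also matches the line breaks that mail
# wrapping may have inserted into a record.
READ_RE = re.compile(
    r"(\d+)\s+(\d+)\s+\S+\s+[\d.]+\s+"
    r"read\((0x[0-9a-f]+),\s*(0x[0-9a-f]+),\s*(0x[0-9a-f]+)\)\s*=\s*(-?\d+)"
)


def find_overreads(trace_text):
    """Return (lwp, requested, returned) for reads that returned too much."""
    hits = []
    for m in READ_RE.finditer(trace_text):
        lwp = int(m.group(2))
        requested = int(m.group(5), 16)   # third argument: nbytes, in hex
        returned = int(m.group(6))        # return value, in decimal
        if returned > requested:
            hits.append((lwp, requested, returned))
    return hits


# Two records copied from the extract above.
SAMPLE = """\
26576 1 bacula-sd 1272016811.947484145 read(0xe, 0x6a1ae5, 0x303b)
= 7240
26576 5 bacula-sd 1272016811.948707170 read(0xe, 0x6a372d, 0x13f3)
= 8688
"""

if __name__ == "__main__":
    for lwp, requested, returned in find_overreads(SAMPLE):
        print(f"LWP {lwp}: requested {requested}, returned {returned}")
```

For the two sample records this prints
"LWP 5: requested 5107, returned 8688".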
The problem seems to happen when multiple threads read from the same fd
at the same time. Of course this does not make much sense, but if an
application decides to do such a thing, the kernel must not run wild.
I have no possibility to check whether this problem is still present in
5.x - sorry.
>How-To-Repeat:
Currently the only process known to us that triggers the problem is the
bacula storage daemon.
The behaviour of this process looks like misbehaviour to me, but
nevertheless the kernel must never return more bytes than requested in
the read(2) call!
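
A stand-alone test might look like the sketch below. It is not the
bacula code and it is untested against NetBSD 4.0, so whether it
triggers the problem there is an open question. It assumes a
socketpair is a good enough stand-in for the daemon's network
connection. The function name hammer_shared_fd and its parameters are
hypothetical; the request sizes are borrowed from the trace (0x4,
0x2fb, 0x13f3, 0x7ae0). Several threads read the same descriptor
without any locking around read(2), and every return value is compared
with the requested length.

```python
import os
import socket
import threading


def hammer_shared_fd(total=1 << 20, readers=3, sizes=(4, 763, 5107, 31456)):
    """Push `total` bytes through a socketpair while several threads read
    the same descriptor. Returns (bytes_read, violations)."""
    rx, tx = socket.socketpair()
    fd = rx.fileno()
    lock = threading.Lock()          # protects the bookkeeping only
    stats = {"read": 0, "bad": []}

    def writer():
        tx.sendall(b"x" * total)
        tx.close()                   # readers then see end-of-file

    def reader(idx):
        i = idx
        while True:
            want = sizes[i % len(sizes)]
            i += 1
            data = os.read(fd, want)     # deliberately unsynchronised
            if not data:                 # EOF
                return
            with lock:
                stats["read"] += len(data)
                if len(data) > want:     # more than requested
                    stats["bad"].append((want, len(data)))

    threads = [threading.Thread(target=writer)]
    threads += [threading.Thread(target=reader, args=(n,))
                for n in range(readers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    rx.close()
    return stats["read"], stats["bad"]


if __name__ == "__main__":
    got, bad = hammer_shared_fd()
    print("bytes read:", got, "over-long reads:", bad)
```

On a correctly behaving kernel the byte count should equal `total` and
the list of over-long reads should stay empty; anything else would be
worth attaching to this PR.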
>Fix:
Not known so far - there has been no time yet to look deeper into this.
>Unformatted: