Subject: pthreads woes with i386 -current
To: None <current-users@netbsd.org>
From: Sean Doran <smd@ebone.net>
List: current-users
Date: 01/17/2001 16:46:34
Oops.  -current and preemptive pthreads don't seem to get
along at all.  I have the impression people don't really
play with threaded userland processes much...

I'll try to explain one of the clearer symptoms using a
simple example.

1.5Q Tue Jan 16, -i386.

First, I got the code from:

http://www.mit.edu/people/proven/IAP_2000/basic_example.html

and compiled it against pkgsrc/devel/mit-pthreads thusly:

: sean ; env PATH=/usr/pkg/pthreads/bin:$PATH pgcc -o x x.c 

So far, so good.

: sean ; ./x
abababab^?
: sean ; stty -a
... intr = ^? ... status = ^T ...
: sean ; ./x
ababababload: 2.35  cmd: z 26692 [runnable] 0.01u 0.00s 0% 700k

And there it sits, eating enormous CPU time, immune to
all catchable signals.   A ktrace/ktruss reveals only
what's below - nothing further happens in the process
until it gets an uncatchable signal.
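
For anyone reading along without the page above to hand, here is a
rough sketch of the kind of program that produces this output.  It is
NOT the code from that URL: the names writer_args, writer and
run_two_writers are made up for illustration, and the "write one
character, then sleep" loop is only inferred from the one-byte write()
calls in the trace and the [nanosleep] wait channel seen further down.

```c
/* Sketch only: a stand-in for the example program, not the original. */
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical per-thread parameters. */
struct writer_args {
    char ch;        /* character this thread writes */
    int count;      /* number of writes to attempt */
    long delay_ms;  /* pause between writes, in milliseconds */
    int written;    /* successful writes, filled in by the thread */
};

/* Thread body: write one character, sleep, repeat. */
static void *writer(void *p)
{
    struct writer_args *a = p;
    struct timespec ts;
    int i;

    ts.tv_sec = a->delay_ms / 1000;
    ts.tv_nsec = (a->delay_ms % 1000) * 1000000L;
    for (i = 0; i < a->count; i++) {
        if (write(STDOUT_FILENO, &a->ch, 1) == 1)
            a->written++;
        nanosleep(&ts, NULL);
    }
    return NULL;
}

/*
 * Start one "a" writer and one "b" writer and wait for both.
 * Returns the total number of characters written, or -1 if a
 * thread could not be created.
 */
int run_two_writers(int count, long delay_ms)
{
    struct writer_args a = { 'a', count, delay_ms, 0 };
    struct writer_args b = { 'b', count, delay_ms, 0 };
    pthread_t ta, tb;

    assert(count >= 0 && delay_ms >= 0);
    if (pthread_create(&ta, NULL, writer, &a) != 0)
        return -1;
    if (pthread_create(&tb, NULL, writer, &b) != 0) {
        pthread_join(ta, NULL);
        return -1;
    }
    pthread_join(ta, NULL);
    pthread_join(tb, NULL);
    return a.written + b.written;
}
```

Calling run_two_writers(60, 1000) from main() should give roughly a
minute of interleaved output when the two threads really do run
concurrently.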

So I try again with pkgsrc/devel/unproven-pthreads,
following the same build approach.  Same result: a hung
program, with the last thing seen being the ioctls
at the end.

There is another interesting issue:

sean% ./z
ababababababababab^Z
Suspended
sean% fg
./z
a
Suspended
sean% fg
./z
bababababa^?
sean%

Again, this happens the same whether it's mit-pthreads or unproven-pthreads.

If I compile against pkgsrc/devel/pth thusly:

: sean ; cc -I/usr/pkg/include -o z1 ./z.c -Wl,-R/usr/pkg/lib -L/usr/pkg/lib  -lpthread
: sean ; ./z1
aaload: 2.21  cmd: z1 27555 [nanosleep] 0.01u 0.00s 0% 696k
aaaload: 2.21  cmd: z1 27555 [nanosleep] 0.01u 0.00s 0% 696k
aaa^Ca^?
smd% ./z1
aaaload: 2.34  cmd: z1 27556 [nanosleep] 0.00u 0.00s 0% 696k
aaaaaload: 2.31  cmd: z1 27556 [nanosleep] 0.00u 0.00s 0% 696k
aa^?

I also don't get the problem with SIGTSTP, but consider this:

[z is compiled and linked against mit-pthreads-1.60b6, 
z1 is compiled and linked with pth-1.3.7]


Note 60 seconds' worth of "ab" versus 2 minutes (60s of "a"
followed by 60s of "b"), and the funny free() error, reported
because /etc/malloc.conf contains "A".

smd% time ./z; time ./z1
abababababababababababababababababababababababababababababababababababababababababababababababababababababababababababab0.0u 0.0s 1:00.59 0.0% 0+0k 0+2io 0pf+0w
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbz1 in free(): error: page is already free.
Abort trap (core dumped)
0.0u 0.0s 2:01.19 0.0% 0+0k 0+2io 0pf+0w

#0  0x48113633 in kill ()
#1  0x48112577 in abort ()
#2  0x481106b3 in __divdi3 ()
#3  0x48111cf3 in __divdi3 ()
#4  0x4811233b in free ()
#5  0x4807575c in __pth_tcb_free ()

The sequential behaviour of pth isn't very astonishing,
since the homepage named in the package Makefile says in
several places that pth does not do preemptive multithreading.

The mit-pthreads package's homepage link is stale.

I don't suppose this is something at all fixable, such
that one or the other preemptive pthreads implementation
is a little less fragile?

        Sean.
- --
[mit-pthreads, dealing with SIGINFO via ^T]

       "a"
 26710 z        gettimeofday(0x80624e4, 0)         = 0
 26710 z        setitimer(0, 0x8062464, 0)         = 0
 26710 z        setitimer(0x1, 0x804c9a0, 0)       = 0
 26710 z        write(0x1, 0x8048ea8, 0x1)         = 1
       "b"
 26710 z        gettimeofday(0x80734e4, 0)         = 0
 26710 z        __sigprocmask14(0x1, 0x8073414, 0x8073404) = 0
 26710 z        __sigprocmask14(0x2, 0x8073414, 0x8073404) = 0
 26710 z        __sigprocmask14(0x1, 0x8073304, 0x80732e4) = 0
SIGALRM caught handler=0x4808c3ec mask=0xfffefeff code=0x0
ab 26710 z                                           Err#4 EINTR
 26710 z        __sigreturn14(0x8073258)           JUSTRETURN
 26710 z        __sigprocmask14(0x2, 0x8073304, 0x80732e4) = 0
 26710 z        __sigprocmask14(0x1, 0x8073414, 0x8073404) = 0
 26710 z        gettimeofday(0x8073374, 0)         = 0
 26710 z        __sigprocmask14(0x2, 0x8073414, 0x8073404) = 0
 26710 z        setitimer(0x1, 0x804c7a0, 0)       = 0
 26710 z        write(0x1, 0x8048ea7, 0x1)         = 1
       "a"
 26710 z        gettimeofday(0x80624e4, 0)         = 0
 26710 z        setitimer(0, 0x8062464, 0)         = 0
 26710 z        setitimer(0x1, 0x804c9a0, 0)       = 0
 26710 z        write(0x1, 0x8048ea8, 0x1)         = 1
       "b"
 26710 z        gettimeofday(0x80734e4, 0)         = 0
 26710 z        __sigprocmask14(0x1, 0x8073414, 0x8073404) = 0
 26710 z        __sigprocmask14(0x2, 0x8073414, 0x8073404) = 0
 26710 z        __sigprocmask14(0x1, 0x8073304, 0x80732e4) = 0
load: 2.58  cmd: z 26710 [runnable] 0.00u 0.01s 0% 700k
SIGINFO caught handler=0x4808c3ec mask=0xfffefeff code=0x0
 26710 z                                           Err#4 EINTR
 26710 z        __sigreturn14(0x8073258)           JUSTRETURN
 26710 z        __sigprocmask14(0x2, 0x8073304, 0x80732e4) = 0
 26710 z        __sigprocmask14(0x1, 0x8073414, 0x8073404) = 0
 26710 z        fcntl(0x2, 0x3, 0)                 = 2
 26710 z        ioctl(0x2, TIOCGETA, 0x8073278)    = 0