Subject: Re: gaim 1.1.0 and MSN
To: Gavan Fantom <gavan@coolfactor.org>
From: Matthew Luckie <mjl@luckie.org.nz>
List: tech-pkg
Date: 01/04/2005 22:02:24
> >I'm the maintainer for gaim on NetBSD.  I'd like some help in fixing
> >a problem that is being reported when using gaim with MSN on NetBSD 1.6.
> >The bug is reported in PR 28690.
> 
> [...]
> 
> >I'm not sure how to address this problem.  Can someone spare me a clue
> >and help me with this problem?
> 
> Just a datapoint here - gaim 1.1.0 works fine with MSN on NetBSD 2.0_BETA.
> 
> Does ktrace show anything useful?

yes, it does.

ktrace reveals some kind of problem in libgcrypt, which is in
pkgsrc/security/libgcrypt (without a maintainer).

the relevant lines of the ktrace (provided by ben@) are as follows:

   446 gaim     1103731001.028883 NAMI  "/usr/bin/vmstat"
   446 gaim     1103731001.028892 RET   access 0
   446 gaim     1103731001.028899 CALL  pipe
   446 gaim     1103731001.028908 RET   pipe 4
   446 gaim     1103731001.028915 CALL  fork

   ....

   446 gaim     1103731001.054542 CALL  fcntl(0x4,0x3,0x4000)
   446 gaim     1103731001.054549 RET   fcntl 4
   446 gaim     1103731001.054556 CALL  fcntl(0x4,0x3,0)
   446 gaim     1103731001.054561 RET   fcntl 4
   446 gaim     1103731001.054568 CALL  read(0x4,0x82a0000,0x4000)
   446 gaim     1103731001.054579 RET   read 0
   446 gaim     1103731001.054587 CALL  close(0x4)
   446 gaim     1103731001.054594 RET   close 0

   ....

   446 gaim     1103731001.056254 CALL  read(0x4,0x811e628,0x80)
   446 gaim     1103731001.056262 RET   read -1 errno 9 Bad file descriptor
   446 gaim     1103731001.056270 CALL  __sigprocmask14(0x3,0x4859bcf8,0x811ecd8)
   446 gaim     1103731001.056278 RET   __sigprocmask14 0
   446 gaim     1103731001.056285 CALL  select(0x16,0x811ede8,0x811ed68,0x811ece8,0x811e490)
   446 gaim     1103731001.056293 RET   select -1 errno 9 Bad file descriptor
   446 gaim     1103731001.056300 CALL  __sigprocmask14(0x3,0x811ecd8,0)
   446 gaim     1103731001.056307 RET   __sigprocmask14 0
   446 gaim     1103731001.056318 CALL  select(0x17,0x811e5a8,0,0,0x811e490)
   446 gaim     1103731001.056326 RET   select 0
   446 gaim     1103731001.056418 CALL  __sigprocmask14(0x3,0x811ee58,0)
   446 gaim     1103731001.056426 RET   __sigprocmask14 0
   446 gaim     1103731001.056455 CALL  getpid
   446 gaim     1103731001.056462 RET   getpid 446/0x1be
   446 gaim     1103731001.056483 CALL  kill(0x1be, SIGABRT)
   446 gaim     1103731001.056494 PSIG  SIGABRT SIG_DFL
   446 gaim     1103731001.056507 NAMI  "gaim.core"

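the problem is that, for some reason, fd 4 remains in the select set even
though the read returned 0 (eof) and the fd was then closed.  fd 4 is no
longer a valid descriptor at that point, so the later read and select both
fail with errno 9 (Bad file descriptor), and gaim aborts shortly afterwards.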
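
to make the failure mode above concrete, here is a minimal sketch in C.  it
is not the libgcrypt code; the function names read_after_close_errno and
select_after_close are made up for illustration, and the only assumption is
a POSIX system.  it shows that a read on a closed fd, and a select on a set
that still contains a closed fd, both fail with EBADF, and that clearing
the fd from the set first avoids the select failure.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: read to EOF, close the fd, then read it again.
 * Returns the errno of the second read (expected: EBADF), or -1 if the
 * setup itself failed. */
int read_after_close_errno(void)
{
    int p[2];
    char buf[16];

    if (pipe(p) == -1)
        return -1;
    close(p[1]);                        /* no writer left: reader sees EOF */
    if (read(p[0], buf, sizeof buf) != 0) {
        close(p[0]);
        return -1;
    }
    close(p[0]);                        /* the fd is now invalid */

    errno = 0;
    if (read(p[0], buf, sizeof buf) != -1)
        return 0;
    return errno;
}

/* Hypothetical helper: select() on a set that contains one closed fd and
 * one live fd.  If clear_first is non-zero, the closed fd is removed with
 * FD_CLR before the call.  Returns 0 if select() succeeded, the errno it
 * failed with otherwise, or -1 if the setup itself failed. */
int select_after_close(int clear_first)
{
    int gone[2], live[2];
    fd_set rfds;
    struct timeval tv = { 0, 0 };       /* poll, do not block */
    int maxfd, rc, saved;

    if (pipe(gone) == -1)
        return -1;
    if (pipe(live) == -1) {
        close(gone[0]);
        close(gone[1]);
        return -1;
    }

    FD_ZERO(&rfds);
    FD_SET(gone[0], &rfds);
    FD_SET(live[0], &rfds);
    maxfd = gone[0] > live[0] ? gone[0] : live[0];

    close(gone[1]);
    close(gone[0]);                     /* closed, but still in rfds */
    if (clear_first)
        FD_CLR(gone[0], &rfds);         /* what should happen on EOF/close */

    errno = 0;
    rc = select(maxfd + 1, &rfds, NULL, NULL, &tv);
    saved = errno;

    close(live[0]);
    close(live[1]);
    return rc == -1 ? saved : 0;
}
```

calling select_after_close(0) reproduces the EBADF seen in the trace, while
select_after_close(1) does not, which is why a stale entry in the fd set is
the first thing to look for.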

this corresponds to the 4th RI entry in libgcrypt-1.2.0/cipher/rndunix.c,
which is the entry for /usr/bin/vmstat -c

that command, incidentally, will not return successfully, as -c requires an
argument.  but i'm not sure that's the underlying problem.
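
one way to check whether a command exits unsuccessfully is to run it and
look at its exit status.  the following is only a sketch under the
assumption of a POSIX system; run_exit_status is a hypothetical helper, not
something taken from rndunix.c, and it does not capture the command's
output.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run path with argv and return its exit status.
 * Returns -1 if fork/waitpid failed or the child died from a signal,
 * and 127 if the exec itself failed. */
int run_exit_status(const char *path, char *const argv[])
{
    pid_t pid;
    int status;

    pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        execv(path, argv);
        _exit(127);                     /* only reached if execv failed */
    }

    /* retry if the wait is interrupted by a signal */
    while (waitpid(pid, &status, 0) == -1) {
        if (errno != EINTR)
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

with argv set to { "vmstat", "-c", NULL } and path "/usr/bin/vmstat", a
non-zero return value would confirm that the command fails when -c is given
without an argument.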

i'm aware of http://www.netbsd.org/cgi-bin/query-pr-single.pl?number=26079
but even if i remove that patch, the bug still remains.

i'm going to keep hunting.  but i'm not sure why all of a sudden gaim is
exposing what appears to be a problem in libgcrypt.  any clues?