pkgsrc-Users archive


Re: firefox-112



> From: Izumi Tsutsui <tsutsui%ceres.dti.ne.jp@localhost>
> Date: Sat, 22 Apr 2023 23:02:35 +0900
> 
> (gdb) bt
> [...]
> #3  0xb523ab2c in WasmTrapHandler(int, siginfo*, void*) () from /usr/pkg/lib/firefox/libxul.so
> #4  <signal handler called>
> #5  0xa202de3c in _mesa_GetError () from /usr/X11R7/lib/modules/dri/i965_dri.so

On amd64, this corresponds to the following instruction:

0000000000000697 <_mesa_GetError>:
_mesa_GetError():
/home/riastradh/netbsd/current/src/../xsrc/external/mit/MesaLib.old/dist/src/mesa/main/getstring.c:327
 697:   64 48 8b 14 25 00 00    mov    %fs:0x0,%rdx
 69e:   00 00
 6a0:   48 8b 05 00 00 00 00    mov    0x0(%rip),%rax        # 6a7 <_mesa_GetError+0x10>                
                        6a3: R_X86_64_GOTTPOFF  _glapi_tls_Context-0x4
 6a7:   48 8b 3c 02             mov    (%rdx,%rax,1),%rdi
/home/riastradh/netbsd/current/src/../xsrc/external/mit/MesaLib.old/dist/src/mesa/main/getstring.c:329
 6ab:   83 bf 78 05 00 00 0f    cmpl   $0xf,0x578(%rdi)

It's the  cmpl $0xf,0x578(%rdi)  instruction that's crashing (and the
SIGSEGV is then caught by Firefox's signal handler).  That load comes
from the ASSERT_OUTSIDE_BEGIN_END_WITH_RETVAL macro, which
_mesa_GetError invokes right after fetching the current context:

_mesa_GetError( void )
{
   GET_CURRENT_CONTEXT(ctx);
   GLenum e = ctx->ErrorValue;
   ASSERT_OUTSIDE_BEGIN_END_WITH_RETVAL(ctx, 0);
...

#define ASSERT_OUTSIDE_BEGIN_END_WITH_RETVAL(ctx, retval)               \
do {                                                                    \
   if (_mesa_inside_begin_end(ctx)) {                                   \
...

_mesa_inside_begin_end(const struct gl_context *ctx)
{
   return ctx->Driver.CurrentExecPrimitive != PRIM_OUTSIDE_BEGIN_END;

(gdb) print &((struct gl_context *)0)->Driver.CurrentExecPrimitive
$3 = (GLuint *) 0x578

This matches:

 6ab:   83 bf 78 05 00 00 0f    cmpl   $0xf,0x578(%rdi)

My guess is that %rdi, i.e., ctx, holds a null pointer, meaning that
the current GL context is null and the load faults at address 0x578.
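
To make the offset argument concrete, here is a minimal sketch in C.
The structs are hypothetical stand-ins, not Mesa's real struct
gl_context: everything before the field is collapsed into a 0x578-byte
pad so that only the offset reported by gdb above is reproduced.  It
computes the address a load of ctx->Driver.CurrentExecPrimitive would
touch, without dereferencing anything.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; only the 0x578 offset is taken from gdb. */
struct fake_driver {
	unsigned int CurrentExecPrimitive;
};

struct fake_gl_context {
	unsigned char pad[0x578];	/* everything that precedes the field */
	struct fake_driver Driver;
};

/*
 * Address that a load of ctx->Driver.CurrentExecPrimitive would touch,
 * computed arithmetically so that nothing is dereferenced.
 */
uintptr_t
field_address(uintptr_t ctx)
{
	return ctx + offsetof(struct fake_gl_context, Driver) +
	    offsetof(struct fake_driver, CurrentExecPrimitive);
}

/* With a null ctx, the address touched is just the field offset. */
uintptr_t
address_for_null_ctx(void)
{
	uintptr_t addr = field_address(0);

	assert(addr == 0x578);
	return addr;
}
```

If the fault address reported in siginfo (si_addr) were 0x578, that
would be consistent with a null ctx; I have not checked that here.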

The value in %rax, -6232, matches the TLS offset that ld.elf_so
(compiled with -DDEBUG and run with LD_DEBUG=1) prints for libGL.so,
6232.  The sign is opposite because rtld subtracts the offset from the
tcb allocation in _rtld_tls_allocate_locked (ld.elf_so/tls.c).
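
As a rough illustration of what the three mov instructions compute,
here is a sketch in C that simulates the lookup with an ordinary
buffer.  The only number taken from this message is the offset 6232;
the area size, the function names, and the buffer layout are
assumptions for illustration and do not reflect rtld's actual data
structures.  The thread pointer is modelled as the end of the area,
and the context slot lives at a negative offset from it.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TLS_OFFSET	6232	/* offset quoted in this message */
#define TLS_AREA_SIZE	8192	/* arbitrary, just large enough */

/*
 * Mimics  mov (%rdx,%rax,1),%rdi : read a pointer at tp + off,
 * where off is negative for static TLS below the thread pointer.
 */
void *
load_via_thread_pointer(const unsigned char *tp, intptr_t off)
{
	void *ctx;

	memcpy(&ctx, tp + off, sizeof(ctx));
	return ctx;
}

/*
 * Build a simulated static TLS area, store stored_ctx in the slot at
 * tp - TLS_OFFSET, and read it back the way the disassembly does.
 */
void *
simulate_get_current_context(void *stored_ctx)
{
	static unsigned char area[TLS_AREA_SIZE];
	unsigned char *tp = area + TLS_AREA_SIZE;	/* simulated %fs:0 */

	assert(TLS_OFFSET <= TLS_AREA_SIZE);
	memset(area, 0, sizeof(area));
	memcpy(tp - TLS_OFFSET, &stored_ctx, sizeof(stored_ctx));
	return load_via_thread_pointer(tp, -(intptr_t)TLS_OFFSET);
}
```

The point of the sketch is that the address arithmetic can be entirely
correct while the slot itself still holds a null pointer, e.g. calling
simulate_get_current_context(NULL) returns NULL.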

Based on that, my guess is that something changed between Firefox 111
and Firefox 112 about setting and/or clearing the current GL context.

A diff between 111.0.1 and 112.0.1 reveals a lot of changes involving
MakeCurrent under gfx/angle -- perhaps somewhere in those changes
there is a bug leaving the context null where it wasn't in 111, but
somehow the change only affects NetBSD?


Another guess under discussion last night was that Mesa overflowed the
static TLS reservation slop, but I summed the sizes of all the .tbss
sections in all *.so* files under /usr/X11R7 and /usr/pkg with static
TLS, and it sums to 64, exactly what rtld hard-codes as the limit it
preallocates for static TLS in dlopen, RTLD_STATIC_TLS_RESERVATION.
(None of the objects with static TLS, ascertained by `readelf -d ... |
grep STATIC_TLS', had .tdata sections or other sections with T flags.)

