pkgsrc-Users archive


Re: seg fault building npm

Turns out a simple example that crashes isn't too tricky:

const tls = require('tls');

var socket = tls.connect(443, '', {}, () => {
    console.log("connected");
});

This triggers a floating point exception:
Thread 1 received signal SIGFPE, Arithmetic exception.
0x0000000000a451d8 in bn_div_words ()
(gdb) bt
#0  0x0000000000a451d8 in bn_div_words ()
#1  0x0000000000a46f85 in BN_div ()
#2  0x0000000000959687 in BN_MONT_CTX_set ()
#3  0x0000000000959818 in BN_MONT_CTX_set_locked ()
#4  0x00000000009a6c2d in rsa_ossl_public_decrypt ()
#5  0x00000000009a9fef in int_rsa_verify ()
#6  0x00000000009aa309 in RSA_verify ()
#7  0x00000000009a8e89 in pkey_rsa_verify ()
#8  0x0000000000988b16 in EVP_DigestVerifyFinal ()
#9  0x0000000000a3f86b in ASN1_item_verify ()
#10 0x00000000009b89cb in internal_verify ()
#11 0x00000000009ba3da in verify_chain ()
#12 0x00000000009bab2f in X509_verify_cert ()
#13 0x0000000000927249 in ssl_verify_cert_chain ()
#14 0x0000000000935c2e in tls_process_server_certificate ()
#15 0x0000000000933d9a in state_machine ()
#16 0x000000000091fc34 in ssl3_read_bytes ()
#17 0x00000000009243f5 in ssl3_read_internal ()
#18 0x000000000092c026 in SSL_read ()
#19 0x000000000090a80d in node::TLSWrap::ClearOut() ()
#20 0x000000000090ae76 in node::TLSWrap::OnStreamRead(long, uv_buf_t const&) ()
#21 0x00000000008c0502 in node::LibuvStreamWrap::OnUvRead(long, uv_buf_t const*) ()
#22 0x00007e7317016a77 in uv__read (stream=stream@entry=0x7e7316159258)
    at src/unix/stream.c:1257
#23 0x00007e73170172b1 in uv__stream_io (loop=<optimized out>,
    w=0x7e73161592e0, events=1) at src/unix/stream.c:1324
#24 0x00007e731701b218 in uv__io_poll (
    loop=loop@entry=0x7e7317224b20 <default_loop_struct>, timeout=-1)
    at src/unix/kqueue.c:343
#25 0x00007e731700e8e3 in uv_run (loop=0x7e7317224b20 <default_loop_struct>,
    mode=UV_RUN_DEFAULT) at src/unix/core.c:370
#26 0x0000000000a9286e in node::Start(v8::Isolate*, node::IsolateData*, std::vector<std::string, std::allocator<std::string> > const&, std::vector<std::string, std::allocator<std::string> > const&) ()
#27 0x000000000083936b in node::Start(int, char**) ()
#28 0x00000000008030fb in ___start ()
#29 0x00007f7e55003382 in _rtld () from /usr/libexec/ld.elf_so
#30 0x00007f7fffad700e in ?? ()
#31 0x00007f7fffad7019 in ?? ()
#32 0x00007f7fffad7021 in ?? ()
#33 0x00007f7fffad7032 in ?? ()
#34 0x00007f7fffad7058 in ?? ()
#35 0x0000000000000000 in ?? ()

It does fault on a div instruction:
   0x0000000000a451cb <+0>:     push   %rbp
   0x0000000000a451cc <+1>:     mov    %rsp,%rbp
   0x0000000000a451cf <+4>:     mov    %rdx,%rcx
   0x0000000000a451d2 <+7>:     mov    %rsi,%rax
   0x0000000000a451d5 <+10>:    mov    %rdi,%rdx
=> 0x0000000000a451d8 <+13>:    div    %rcx
   0x0000000000a451db <+16>:    pop    %rbp
   0x0000000000a451dc <+17>:    retq

I'm not sure why that div instruction would fault, though, given that the registers at this point are:
(gdb) info registers
rax            0x33b15e9e18bf20c6       3724862399925133510
rbx            0xdfafe99750088357       -2328385646235188393
rcx            0xdfafe99750088357       -2328385646235188393
rdx            0xe5018414db825d94       -1945128338930377324
rsi            0x33b15e9e18bf20c6       3724862399925133510
rdi            0xe5018414db825d94       -1945128338930377324
rbp            0x7f7fffac7160   0x7f7fffac7160
rsp            0x7f7fffac7160   0x7f7fffac7160
r8             0x22     34
r9             0xb62e7d35b66fdd1d       -5319176440230453987
r10            0x1      1
r11            0x6ad62cc0       1792421056
r12            0x7e73161409a8   139032756750760
r13            0xb4cc6265f69082ec       -5418848061565664532
r14            0xd07facfb6eecdc51       -3422826995880502191
r15            0x7e7316fd6418   139032772043800
rip            0xa451d8 0xa451d8 <bn_div_words+13>

I'm not an x64 assembly expert, but the documentation I can find doesn't suggest anything out of the ordinary should occur with these register values.
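
One possible reading of the register dump, offered as a sketch rather than a diagnosis: on x86-64, `div %rcx` divides the 128-bit value rdx:rax by rcx and raises a divide error (delivered as SIGFPE) not only when the divisor is zero but also when the quotient does not fit in 64 bits, which is the case whenever rdx >= rcx. The snippet below redoes that arithmetic with BigInt, using the values copied from the dump. It assumes a Node.js version with BigInt support; the variable names are only for illustration.

```javascript
// Register values copied from the `info registers` output above.
const rax = 0x33b15e9e18bf20c6n; // low 64 bits of the dividend
const rdx = 0xe5018414db825d94n; // high 64 bits of the dividend
const rcx = 0xdfafe99750088357n; // divisor

// `div %rcx` divides the 128-bit value rdx:rax by rcx.
const dividend = (rdx << 64n) | rax;
const quotient = dividend / rcx;

// The quotient has to fit in rax (64 bits); if it does not, the CPU
// raises a divide error instead of producing a result.
const fitsIn64Bits = quotient < (1n << 64n);

console.log('quotient fits in 64 bits:', fitsIn64Bits);
console.log('rdx < rcx:', rdx < rcx);
```

With these values both checks come out false, so the fault is consistent with a quotient overflow rather than a divide by zero. Why bn_div_words would be called with a high word larger than the divisor is a separate question that this arithmetic does not answer.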


On 09/11/2018 12:42, Mike Pumford wrote:
On 09/11/2018 12:00, Chavdar Ivanov wrote:
Any more ideas about this nodejs fault? Latest firefox builds now
require nodejs to be installed.

Does it need npm? nodejs itself seems to work okay at least for basic operation.
node panics in ecp_nistz256_points_mul, part of the built-in
openssl dependency. The one used is 1.1.0, whereas ours is 1.1.1, I
would presume patched appropriately and tested as part of the main
system (this is under -current as of a couple of days ago).

Sadly no idea as I've not had much chance to look. I'd agree that it is something to do with how nodejs interacts with the openssl code though.

I tried using yarn instead of npm, and whilst yarn was able to install itself, it blew up doing an https download, albeit with a more detailed call stack. Gut instinct (but no actual evidence) suggests that something isn't being set up right when the interpreter transitions from JavaScript code to native code; how messed up the stack is in my original debug trace suggests as much. The yarn failure actually had a full stack trace and was in SSL_write, if I remember correctly.

I'm on 8-stable, so the native openssl is 1.0.2k. There is a configure argument that tells it to use an external SSL (and it seemed to be happy with the system-provided one at configure time). However, nodejs then didn't compile, as it tried to set up a pointer to a function declared as inline in an openssl header file (although I did think that was legal in C++).

I've got some free time today so I might see if I can come up with a simpler bit of code that can cause breakage.

