pkgsrc-Bugs archive


pkg/53576: lang/erlang 21.0nb1 freezes with rebar3



>Number:         53576
>Category:       pkg
>Synopsis:       lang/erlang 21.0nb1 freezes with rebar3
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    pkg-manager
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Sep 06 06:25:00 +0000 2018
>Originator:     Michael Taylor
>Release:        NetBSD 8.0
>Organization:
>Environment:
System: NetBSD dagonet.omniscient.local 8.0 NetBSD 8.0 (GENERIC) #0: Tue Jul 17 14:59:51 UTC 2018 mkrepro%mkrepro.NetBSD.org@localhost:/usr/src/sys/arch/amd64/compile/GENERIC amd64
Architecture: x86_64
Machine: amd64

>Description:

The Erlang build tool rebar3 (http://www.rebar3.org/) freezes under Erlang 21,
where it did not under Erlang 20.3.

After investigation I have found that this happens because Erlang 21's
implementation of ports on top of NetBSD's kqueue can fail to receive the
exit_status of external processes spawned/forked with erlang:open_port/2.

Since Erlang 21 already seems to have a solution to this problem for the
implementation of kqueue on OpenBSD, I have submitted a report to Erlang/OTP:
  https://bugs.erlang.org/browse/ERL-725
  (ports fail to send exit_status on NetBSD)

I am also posting here for two purposes:

1. Since this issue relates to NetBSD's implementation of kqueue, perhaps
   someone with more knowledge can confirm the proposed solution is the most
   appropriate for Erlang on NetBSD.

2. My proposed patch may be added to pkgsrc whilst the Erlang team works on the
   issue from their end.

My understanding of Erlang's logic condensed to the bare essentials is:
- a child process is fork()ed and joined with pipe()s
- the child is marked as alive = 1
- the child process is monitored by SIGCHLD to obtain the exit status
- the output pipe of the child process is added to a kqueue()
- if the exit status arrives via SIGCHLD
  - the exit status is recorded
  - set alive = 0
- if read() initiated by an EVFILT_READ event returns 0 (EOF)
  - if alive == 0 then the eof/status pair are returned
  - if alive == 1 then re-add/re-enable output pipe to kqueue()

This logic allows the SIGCHLD and EOF to arrive in any order whilst having only
one completion path (returning the eof/status pair).
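
As a rough illustration of the steps above (not the actual erts code), the
state handling can be sketched in C as below. The struct and function names
are hypothetical and exist only for this sketch; the kqueue calls themselves
are left out so that only the alive/EOF bookkeeping is shown.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for illustration; not taken from the Erlang sources. */
struct child_state {
    bool alive;        /* set after fork(), cleared when SIGCHLD arrives */
    bool done;         /* true once the eof/status pair has been returned */
    int  exit_status;  /* only meaningful once alive is false */
    int  rearm_count;  /* how often the output pipe was re-added/re-enabled */
};

static void child_init(struct child_state *c)
{
    c->alive = true;
    c->done = false;
    c->exit_status = -1;
    c->rearm_count = 0;
}

/* SIGCHLD path: record the exit status and mark the child as dead. */
static void on_sigchld(struct child_state *c, int status)
{
    c->exit_status = status;
    c->alive = false;
}

/* EVFILT_READ path for the case where read() returned 0 (EOF).
 * Returns true if the eof/status pair was returned (the single completion
 * path), false if the pipe had to be re-added/re-enabled in the kqueue. */
static bool on_read_eof(struct child_state *c)
{
    assert(!c->done);          /* completion must happen only once */
    if (!c->alive) {
        c->done = true;        /* status already recorded: complete */
        return true;
    }
    c->rearm_count++;          /* child still alive: wait for another EOF event */
    return false;
}
```

If SIGCHLD arrives first, the first on_read_eof() call completes. If EOF
arrives first, completion depends on a second EOF event being delivered after
the re-add/re-enable.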

The above has two implementations: EV_DISPATCH and EV_ONESHOT. EV_DISPATCH is
used where available, except on OpenBSD. This PR suggests that EV_DISPATCH not
be used on NetBSD either. The two implementations differ as follows:

EV_ONESHOT:
- add fd to kqueue
  EV_SET(&ev, fd, EVFILT_READ, EV_ADD|EV_ONESHOT, 0, 0, 0);
- re-add fd to kqueue
  EV_SET(&ev, fd, EVFILT_READ, EV_ADD|EV_ONESHOT, 0, 0, 0);

EV_DISPATCH:
- add fd to kqueue
  EV_SET(&ev, fd, EVFILT_READ, EV_ADD|EV_ENABLE|EV_DISPATCH, 0, 0, 0);
- re-enable fd in kqueue
  EV_SET(&ev, fd, EVFILT_READ, EV_ENABLE|EV_DISPATCH, 0, 0, 0);

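In the EV_DISPATCH case on NetBSD, an EOF event is not returned a second time
after re-enabling the EVFILT_READ filter. If the EOF is read before SIGCHLD
arrives (alive == 1), no further event is delivered and the eof/status pair is
never returned.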
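
To show why this matters, here is a small model in C of the two delivery
behaviours. It does not call kqueue at all; it only simulates "EOF is
delivered again after re-enabling" (level triggered) against "EOF is delivered
once" (edge triggered, as observed here). The enum and function names are
made up for this sketch.

```c
#include <stdbool.h>

/* Hypothetical model of how an EOF event behaves after the filter is re-enabled. */
enum eof_semantics { LEVEL_TRIGGERED, EDGE_TRIGGERED };

/* The pipe is at EOF from step 0; SIGCHLD arrives at step sigchld_step.
 * Returns the step at which the eof/status pair is returned, or -1 if that
 * never happens within max_steps (the freeze described in this PR). */
static int steps_until_completion(enum eof_semantics sem,
                                  int sigchld_step, int max_steps)
{
    bool alive = true;
    bool eof_pending = true;   /* one EOF event is queued initially */

    for (int step = 0; step < max_steps; step++) {
        if (step == sigchld_step)
            alive = false;     /* exit status recorded */

        if (!eof_pending)
            continue;          /* no event delivered: the poller just waits */

        /* An EVFILT_READ event was delivered and read() returned 0. */
        if (!alive)
            return step;       /* single completion path */

        /* Child still alive: re-enable the filter. Only level-triggered
         * semantics report the EOF condition again. */
        eof_pending = (sem == LEVEL_TRIGGERED);
    }
    return -1;
}
```

With sigchld_step = 2, the level-triggered model completes at step 2 while the
edge-triggered model never completes. With sigchld_step = 0 both complete
immediately, which fits the freeze depending on the order of arrival.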

>How-To-Repeat:

Build and install lang/erlang 21.0nb1. You may need to apply the solutions
from pkg/53567 (toolchain/53567) to do this.

Create the following file: erl-725.escript
NOTE: This escript is a reduction of the logic in rebar3
----
#!escript

main(_) ->
    Opts = [exit_status, {line, 16384}, use_stdio, stderr_to_stdout, hide, eof],
    Exec = {spawn, "/bin/echo hello"},
    Port = erlang:open_port(Exec, Opts),
    data(Port),
    erlang:port_close(Port).

data(Port) ->
    receive
        {Port, {data, Data}} ->
            io:format("data: ~p~n", [Data]),
            data(Port);
        {Port, eof} ->
            exit_status(Port)
    end.

exit_status(Port) ->
    receive
        {Port, {exit_status, ExitStatus}} ->
            io:format("exit status: ~p~n", [ExitStatus])
    end.
----

Execute the escript:
----
$ escript erl-725.escript
data: {eol,"hello"}
----
The escript freezes at this point: the exit_status message never arrives.

This compares with a working system (Erlang 20.3 from pkgsrc) that executes
and completes almost immediately:
----
$ escript erl-725.escript
data: {eol,"hello"}
exit status: 0
$
----

>Fix:

The following patch was suggested in the ERL-725 bug report:
----
--- erts/emulator/sys/common/erl_poll.c.orig    2018-09-04 19:31:46.151738848 +1000
+++ erts/emulator/sys/common/erl_poll.c 2018-09-04 19:32:37.383828393 +1000
@@ -803,8 +803,8 @@
     struct kevent evts[2];
     struct timespec ts = {0, 0};

-#if defined(EV_DISPATCH) && !defined(__OpenBSD__)
-    /* If we have EV_DISPATCH we use it, unless we are on OpenBSD as the
+#if defined(EV_DISPATCH) && !(defined(__OpenBSD__) || defined(__NetBSD__))
+    /* If we have EV_DISPATCH we use it, unless we are on Open/NetBSD as the
        behavior of EV_EOF seems to be edge triggered there and we need it
        to be level triggered.

----
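
For anyone wanting to check which branch a given platform takes with the
patched condition, a stand-alone sketch follows. The function name
kqueue_rearm_mode() is hypothetical, and the list of platforms for which
<sys/event.h> is included is my assumption; only the #if condition itself is
copied from the patch. On systems without kqueue the result is not meaningful.

```c
/* Assumption: these platforms provide <sys/event.h>. */
#if defined(__NetBSD__) || defined(__OpenBSD__) || defined(__FreeBSD__) || \
    defined(__DragonFly__) || defined(__APPLE__)
#include <sys/types.h>
#include <sys/event.h>
#endif

/* Hypothetical helper: reports which re-arm strategy the patched
 * condition would select at compile time. */
static const char *kqueue_rearm_mode(void)
{
#if defined(EV_DISPATCH) && !(defined(__OpenBSD__) || defined(__NetBSD__))
    return "EV_DISPATCH";   /* re-enable with EV_ENABLE|EV_DISPATCH */
#else
    return "EV_ONESHOT";    /* re-add with EV_ADD|EV_ONESHOT */
#endif
}
```

With the patch applied, this should report EV_ONESHOT on both NetBSD and
OpenBSD.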

