pkgsrc-Bugs archive


Re: pkg/57770: pkgtools/pbulk-base pbulk-build segfault



spz%NetBSD.org@localhost writes:

>[1]   Segmentation fault (core dumped) ${pbuild} -r ${loc}/pbuild -I ${pbuild_start_s...

>#11 0x000000004820372f in kill_peer (arg=arg@entry=0x754057a56250)
>    at master.c:87
>#12 0x0000000048203638 in assign_job (arg=<optimized out>) at master.c:211
>#13 0x0000000048203741 in kill_peer (arg=arg@entry=0x754057a56250)
>    at master.c:91


Looks like memory corruption, maybe from pbuild/master.c:

static void
child_handler(struct signal_event *ev)
{
        struct build_peer *peer;
        int status;

        if (waitpid(child_pid, &status, WNOHANG) == -1) {
                if (errno == ECHILD)
                        return;
                err(1, "Could not wait for child");
        }
        if (status != 0)
                err(1, "Start script failed");

        clients_started = 1;
        signal_del(ev);

        if ((peer = LIST_FIRST(&inactive_peers)) != NULL) {
                LIST_REMOVE(peer, peer_link);
                assign_job(peer);
        }
}


The assign_job function is already responsible for removing the
peer from its list, so removing it here as well may trigger a
second LIST_REMOVE on the same entry.

This would also explain why the bug happens shortly after start,
and only sometimes: it requires the start script to have finished
while there are still inactive peers.
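
To illustrate why a second LIST_REMOVE is harmful, here is a minimal
standalone sketch. It is not pbulk code: struct peer, list_length and
double_remove_demo are hypothetical names, and it assumes the classic
<sys/queue.h> macro definitions without any debug checks. A removed
entry keeps its stale link pointers, so removing it again writes
through them and can relink an unrelated entry:

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical stand-in for pbulk's struct build_peer. */
struct peer {
        int id;
        LIST_ENTRY(peer) peer_link;
};
LIST_HEAD(peer_list, peer);

/* Count the entries reachable from a list head. */
static int
list_length(struct peer_list *head)
{
        struct peer *p;
        int n = 0;

        LIST_FOREACH(p, head, peer_link)
                n++;
        return n;
}

/*
 * Remove the same entry twice and return the resulting length of
 * the inactive list.  Without the second removal it would be 0.
 */
int
double_remove_demo(void)
{
        struct peer_list inactive = LIST_HEAD_INITIALIZER(inactive);
        struct peer_list active = LIST_HEAD_INITIALIZER(active);
        struct peer a = { .id = 1 }, b = { .id = 2 };

        LIST_INSERT_HEAD(&inactive, &b, peer_link);
        LIST_INSERT_HEAD(&inactive, &a, peer_link); /* inactive: a, b */

        /* First removal: a still remembers the head and b. */
        LIST_REMOVE(&a, peer_link);                 /* inactive: b */

        /* Meanwhile b moves to the active list. */
        LIST_REMOVE(&b, peer_link);
        LIST_INSERT_HEAD(&active, &b, peer_link);
        assert(list_length(&inactive) == 0);

        /*
         * Second removal of a: its stale pointers put b back at the
         * head of the inactive list while b is also on the active one.
         */
        LIST_REMOVE(&a, peer_link);

        return list_length(&inactive);
}
```

With those assumptions, double_remove_demo() leaves b reachable from
both list heads. In a long-running process the stale pointers may
refer to freed or reused memory instead, which would fit a crash that
only shows up sometimes.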


