tech-userlevel archive


Re: pwait(1) added

Joerg Sonnenberger <> writes:

> On Sat, Mar 07, 2015 at 07:08:53PM -0500, James K. Lowden wrote:
>> On Fri, 6 Mar 2015 13:16:18 +0100
>> Joerg Sonnenberger <> wrote:
>> > > Taking the second problem first, ISTM that doesn't require anything
>> > > fancy but requires information of what's "expected".  If you build a
>> > > database of successful build-times, then cancelling stalled builds
>> > > could surely be accomplished by enregistering the start of each
>> > > package's build process, and periodically patrolling the tree for
>> > > cases when ".done" or whatever wasn't produced in the expected
>> > > time.  
>> > 
>> > Problem with such databases is that they need maintenance; explosions
>> > in build time are not uncommon, even more on transitions from failure
>> > to success. That's what makes the "doesn't make progress for a while"
>> > metric so interesting -- it can work reliably without knowing anything
>> > about the build in advance.
>> So, IIUC, what you're saying is that you'd like to monitor the build
>> process and take note of ... what?  "Doesn't make progress" isn't
>> interesting; it's impossible because too vague.  The process is doing
>> something.  Are you going to assume that because there's no I/O after N
>> minutes that the process is stalled?  
> We already have a measure to terminate processes that "do something",
> ulimit -t. So if the process is actually using CPU time, it can be
> killed without manual intervention. I'm also not really concerned about
> fork bombs, I haven't seen such a problem yet. What I have seen is
> processes stuck waiting for something to happen. That can be a kind of
> zombie with the wrong PID or a deadlock in a multi-threaded program
> (I'm looking at you, mono!). Forking is a signal of life from a process,
> so monitoring it seems to be a pretty reliable way of dealing with the
> issues I have seen.

>> I recognize you have a lot of experience in this area.  At the same
>> time, I doubt the assertion that history is no guide to the present.
>> I'm skeptical that *successful* build-times vary much on a given
>> machine, surely not by 1 standard deviation.  Since we have no database
>> of build-time history, I'm not sure on what basis you disagree.  
> Yes, that's exactly the problem. Most of the problematic builds are not
> successful :)

What would be really nice is the ability to impose a real-time
(wall-clock) limit. A real-time limit would solve this problem.
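
For what it's worth, such a limit can be approximated in userland.
Below is a minimal sketch in Python, assuming a POSIX system. The
function name run_with_wall_clock_limit and the choice of SIGKILL are
illustrative assumptions, not an existing tool or anything proposed
earlier in this thread.

```python
import os
import signal
import subprocess
import sys


def run_with_wall_clock_limit(argv, limit_seconds):
    """Run argv and enforce a real-time (wall-clock) limit on it.

    Returns the command's exit status, or None if the limit expired
    and the command's process group had to be killed.
    """
    # start_new_session=True makes the child the leader of a new session
    # and process group, so anything it forks can be signalled with it.
    proc = subprocess.Popen(argv, start_new_session=True)
    try:
        return proc.wait(timeout=limit_seconds)
    except subprocess.TimeoutExpired:
        try:
            # The child's PID is also its process group ID here.
            os.killpg(proc.pid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # the group vanished between the timeout and the kill
        proc.wait()  # reap the child so it does not linger as a zombie
        return None


if __name__ == "__main__":
    # Hypothetical usage: limit in seconds first, then the command to run.
    status = run_with_wall_clock_limit(sys.argv[2:], float(sys.argv[1]))
    sys.exit(124 if status is None else status)
```

Unlike ulimit -t, this catches a process that sits blocked without
using any CPU time. It still needs a limit chosen in advance, and a
descendant that calls setsid() itself would escape the group kill.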

