Port-amd64 archive


Re: df stuck in tstile



I think I'm probably out of my depth on this one.  I used to get this
kind of problem on the machines these replace, and became convinced
that that was a bad block issue (which got aired on one of the other
lists - they were x86 machines).  Once there was one process stuck in
tstile, the problem began to accumulate, and the only solution was a
periodic reboot.  IIRC the problem started with a df.

Unfortunately, the version of crash(8) that came with 8.0_RC1 won't work.
These were clean installations onto brand-new machines, but I get a
"versions differ" error message when I run crash.

But I don't really have a handle on what might be going on - I can
only guess at what tstile might be, and suppose it is some kind of
queuing mechanism in the kernel.   Is there any way of killing a
process that gets stuck in this way?  kill -9 does nothing, and
killing the parent process just means that the stuck process is
parented directly by process 1.

These servers are providing services to customers around the clock
(the live service is not actually impacted), and live in a data
centre, so I need to understand the nature of the problem before I
start making changes that would lead to downtime.

--
Steve Blinkhorn <steve%prd.co.uk@localhost>

You wrote:
> 
> In article <20190130112210.DF067B2FB0D%viking.prd.co.uk@localhost>,
> Steve Blinkhorn <steve%prd.co.uk@localhost> wrote:
> >I have a pair of identically-configured (except for IP-related and
> >userland data) 8.0_RC1 servers, with software RAID 1 file systems.
> >On both of them, df hangs in tstile; on one of them a find coming from
> >/etc/daily is hung in tstile.   When I configured the systems last
> >July/August, df did not hang.
> 
> You can use crash(8) to figure out the kernel stack trace where the process
> is hanging. Perhaps that will help.
> 
> christos
> 
> 



