Re: tstile in less than a day

To: Martin Husemann <martin%duskware.de@localhost>
Subject: Re: tstile in less than a day
From: John Klos <john%ziaspace.com@localhost>
Date: Mon, 15 Apr 2019 22:50:12 +0000 (UTC)

> But whatever the actual bug here is, when NFS is involved we need proper
> pkt traces to analyze it. You can only see things tstile on a vnode
> lock in the kernel backtraces or lockdebug output, but you can not see
> why the IO does not complete and the lock is not freed.

It wouldn't be easy to get a pcap, but it's possible. It sometimes takes
a week, sometimes a day.

What's strange is that shortly after the issue, I ran a kernel with
LOCKDEBUG and had no issues for more than a month. Since that added a
lot of overhead, I removed it, and got a lockup via NFS within four
days.

The same NFS server (amd64, NetBSD 8) serves an Amiga and an Alpha
without problems aside from the occasional "nfs_reply: ignoring error 55".

The previous NFS server was a Raspberry Pi also running NetBSD 8, which
serves m68k and PowerPC Macs, VAXen and various ARM SBCs, all running
NetBSD 8 or current.

The previous UltraSPARC machine was a Sun Fire v100 with 100 Mbps tlp*
interfaces. It had no issues over the course of many months.

This machine has bge* interfaces, which could be buggy, but so do the
Alpha and amd64 systems, and they've had no problems.

So it looks like it's an issue with running this system multiprocessor.
But how would one diagnose this better when enabling LOCKDEBUG causes
the problem to go away?

I'm going to run it with tcpdump running on the NFS interfaces on both
ends and see if it happens again. On the other hand, I can't imagine
this being an NFS issue that happens nowhere except multiprocessor
UltraSPARC. I'm open to the possibility, though :)
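
Since the hang can take anywhere from a day to a week to show up, an
unbounded capture would probably fill the disk first. A minimal sketch
of one way to build a bounded, rotating tcpdump invocation follows. The
interface name (bge0), the peer address (192.0.2.10), the output path
and the size limits are all placeholders, not values from this setup.
The filter assumes NFS on the standard port 2049. It will not catch
rpcbind, mountd or lockd traffic, and with NFS over UDP the later IP
fragments of large requests will not match a port filter, so filtering
on the host alone may be the safer choice.

```python
import shlex


def build_capture_command(interface, peer_host, out_prefix,
                          file_size_mb=100, file_count=20):
    """Return a tcpdump argv list that writes a bounded ring of pcap files.

    All arguments are illustrative placeholders chosen by the caller.
    """
    return [
        "tcpdump",
        "-n",                     # do not resolve addresses to names
        "-i", interface,          # capture on this interface
        "-s", "0",                # keep whole packets, not just headers
        "-C", str(file_size_mb),  # start a new file after ~N million bytes
        "-W", str(file_count),    # keep at most N files, then overwrite
        "-w", out_prefix,         # base name of the capture files
        # Assumed filter: traffic to/from the peer on the NFS port.
        "host", peer_host, "and", "port", "2049",
    ]


if __name__ == "__main__":
    cmd = build_capture_command("bge0", "192.0.2.10",
                                "/var/tmp/nfs-trace.pcap")
    # Print a shell-safe command line to run as root on each end.
    print(shlex.join(cmd))
```

Running the same command on both the client and the server should let
the two traces be compared once the lockup recurs, to see which side
stopped answering.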


Thanks,
John
