Current-Users archive

Re: NFS client renders system unusable



Brad Pitney wrote:
On Sat, Mar 15, 2008 at 5:39 AM, Sarton O'Brien 
<bsd-xen%roguewrt.org@localhost> wrote:
Sarton O'Brien wrote:
 > Sarton O'Brien wrote:
 >> I have an NFS server running with samba on lfs which occasionally
 >> panics for reasons I haven't bothered to delve into yet. I'm almost
 >> positive it is either samba or lfs related ... but most likely the
 >> combination.
 >>
 >> When the NFS server dies, the client console fills up with:
 >>
 >> yp_order: clnt_call: RPC: Unable to send; errno = No buffer space available
 >> yp_order: clnt_call: RPC: Unable to send; errno = No buffer space available
 >> yp_order: clnt_call: RPC: Unable to send; errno = No buffer space available
 >> yp_order: clnt_call: RPC: Unable to send; errno = No buffer space available
 >> yp_order: clnt_call: RPC: Unable to send; errno = No buffer space available
 >> yp_order: clnt_call: RPC: Unable to send; errno = No buffer space available
 >>
 >> And the client system becomes completely unusable.
 >>
 >> In this instance it occurred within minutes of the server dying.
 >>
 >> Googling the error doesn't reveal much so I'm not sure if there is
 >> something I can do to mitigate the effects, if it's a known issue or
 >> if it's a bug. As the NFS server used to be almost completely stable,
 >> this is a problem I've never really encountered.
 >

 > OK, now it just appears to be happening under load: the NFS
 > server hasn't died and the client is producing those messages endlessly ...
 >
 > kernel and userland are only a few days old for both server and client.
 >
 > Any ideas?

 Sorry, I should have mentioned that the load is client connections, not
 data. I've a p2p app on the client using minimal BW but supporting a
 large number of connections ... all those p2p connections would equate
 to nfs client connections. The problem only occurs when the app is
 running. All other clients, including ones with high data usage are fine.

 I can't provide any details from the client due to it being unusable but
 nfsstat on the server reports:

 spike# nfsstat -s
 Server Info:
 RPC Counts: (75842 calls)
      null         getattr         setattr          lookup          access
         0  0%       25052 33%         760  1%       28762 37%        6518  8%
  readlink            read           write          create           mkdir
         0  0%        6926  9%        3797  5%         764  1%           0  0%
   symlink           mknod          remove           rmdir          rename
         0  0%           0  0%        1102  1%           0  0%         359  0%
      link         readdir     readdirplus          fsstat          fsinfo
       493  0%         671  0%           0  0%          49  0%           5  0%
  pathconf          commit
         0  0%         584  0%
 Server Errors:
 RPC errors          faults
      2321               0
 Server Cache Stats:
 inprogress            idem        non-idem          misses
         0               0               0           75842
 Server Write Gathering:
    writes       write RPC       OPs saved
      3797            3797               0  0%

 Though I'm guessing the problem is with the client ... buffer? I have no
 idea ...
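
 A rough way to read the counters above, as a sketch only: the figures are
 copied by hand from the nfsstat output, and the grouping into "metadata"
 (getattr, lookup, access) and "data" (read, write) calls is an assumption
 made for illustration, not something nfsstat reports itself.

```python
# Figures copied from the `nfsstat -s` output quoted in this message.
total_calls = 75842
rpc_errors = 2321
counts = {
    "getattr": 25052,
    "lookup": 28762,
    "access": 6518,
    "read": 6926,
    "write": 3797,
}


def share(part, whole):
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


# Assumed grouping: attribute and name lookups versus bulk data transfer.
metadata_calls = counts["getattr"] + counts["lookup"] + counts["access"]
data_calls = counts["read"] + counts["write"]

error_pct = share(rpc_errors, total_calls)
metadata_pct = share(metadata_calls, total_calls)
data_pct = share(data_calls, total_calls)

print(f"RPC errors:     {error_pct:.1f}% of calls")
print(f"metadata calls: {metadata_pct:.1f}% of calls")
print(f"data calls:     {data_pct:.1f}% of calls")
```

 Under that grouping the traffic is dominated by metadata calls rather than
 reads and writes, which fits the description of the load as connections,
 not data.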


Strange. When my NFS server goes away (maybe I've updated it and
rebooted), my NFS clients just stall and wait for the server to come
back; once it does, they resume as if nothing happened, with the exception
of a few messages like this:
Mar 13 00:13:44 nfs-client /netbsd: nfs server nfs-server:/media/data/netbsd/current/obj: not responding
Mar 13 00:13:44 nfs-client /netbsd: nfs server nfs-server:/media/data/netbsd/current/obj: is alive again
Mar 13 13:37:48 nfs-client /netbsd: nfs server nfs-server:/media/data/netbsd/pkgsrc: not responding
Mar 13 13:37:48 nfs-client /netbsd: nfs server nfs-server:/media/data/netbsd/pkgsrc: is alive again
Mar 14 21:13:33 nfs-client /netbsd: nfs server nfs-server:/media/data/netbsd/current/obj: not responding
Mar 14 21:14:49 nfs-client /netbsd: nfs server nfs-server:/media/data/netbsd/current/obj: is alive again

On a typical client, not hosting p2p, that's what I'd see also.

Is this with DomUs? If so, is it just one of them?

Yes it is, and yes, it is only the one, but it's also the only one hosting p2p with an nfs backend. If I stop the p2p process, this doesn't occur.

what does netstat -m show? are you using NIS/YP?

I wish I could netstat ... I wasn't exaggerating in the subject line. I have no console access at all ... well, it's either unresponsive or just streaming that message. I tried blindly typing but it didn't seem to do anything.

Yes, I am using NIS/YP, which could definitely affect it ... but even ssh lacks a login prompt. It seems to be I/O resource related. I'm thinking the sysctl options suggestion might help if I can figure out which one.

is it possible for you to try FFS as opposed to LFS?

The nfs exports are on ffs. It's just samba using lfs. The crashes seem to occur when certain actions are performed via samba. Not sure how bad it is at the moment but sometimes it just takes a couple of renames to trigger. Haven't had time to check it out properly yet.

Either way, this shouldn't be affecting the client to the point of it being unusable. I'd be happy for it to cull some processes or connections rather than lock up. Hopefully I can increase the required resources sufficiently to at least accommodate the unusual demand the p2p app seems to be applying to the nfs client.
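
To illustrate the "back off rather than lock up" behaviour wished for above, here is a minimal user-space sketch that treats ENOBUFS ("No buffer space available") as a transient condition and retries with exponential backoff. It is only an assumption about how a well-behaved sender could react. It is not what yp_order or the kernel NFS client actually does, and `send_with_backoff`, its parameters and the `send` callable are hypothetical names.

```python
import errno
import time


def send_with_backoff(send, payload, retries=5, base_delay=0.05, sleep=time.sleep):
    """Call send(payload), retrying with exponential backoff on ENOBUFS.

    `send` is a hypothetical stand-in for a socket send: any callable that
    raises OSError on failure. Errors other than ENOBUFS propagate at once,
    and ENOBUFS itself is re-raised after the final attempt.
    """
    for attempt in range(retries):
        try:
            return send(payload)
        except OSError as exc:
            if exc.errno != errno.ENOBUFS or attempt == retries - 1:
                raise
            # Give the network stack time to drain before trying again.
            sleep(base_delay * (2 ** attempt))
```

A caller would pass something like a bound `sock.send` as `send`. The point of the sketch is only that the sender slows down under buffer pressure instead of retrying in a tight loop.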


Sarton

