| To: | Trond Myklebust <trond.myklebust@xxxxxxxxxx> |
|---|---|
| Subject: | Re: [NFS] Long sleep with i_mutex in xfs_flush_device(), affects NFS service |
| From: | Stephane Doyon <sdoyon@xxxxxxxxx> |
| Date: | Tue, 26 Sep 2006 16:05:41 -0400 (EDT) |
| Cc: | xfs@xxxxxxxxxxx, nfs@xxxxxxxxxxxxxxxxxxxxx |
| In-reply-to: | <1159297579.5492.21.camel@lade.trondhjem.org> |
| References: | <Pine.LNX.4.64.0609191533240.25914@madrid.max-t.internal> <1159297579.5492.21.camel@lade.trondhjem.org> |
| Sender: | xfs-bounce@xxxxxxxxxxx |
On Tue, 26 Sep 2006, Trond Myklebust wrote: [...]

[...] When the file system becomes nearly full, we eventually call down to xfs_flush_device(), which sleeps for 0.5 seconds, waiting for xfssyncd to do some work. The Linux NFS client typically sends batches of 16 requests, so if the client is writing a single file, some NFS requests are delayed by up to 8 seconds, which is rather long for NFS.

I mean there will be a delay on the server in responding to the requests. Sorry for the confusion. When the NFS client does flush its cache, each request takes an extra 0.5 s to execute on the server, and the i_mutex prevents their parallel execution on the server.

What's worse, when my Linux NFS client writes out a file's pages, it does not react immediately on receiving a NOSPC error. It remembers the error and reports it later on close(), but it still issues write requests for each page of the file. So even if there isn't a pileup on the i_mutex on the server, the NFS client still waits 0.5 s for each (typically 32 K) request.

So with an NFS client on a gigabit network, writing to an already full filesystem: if I open and write a 10 M file and close() it, it takes 2m40.083s to issue all the requests, get a NOSPC for each, and finally have my close() call return ENOSPC. That can stretch to several hours for gigabyte-sized files, which is how I noticed the problem.

I suppose it has to try again at some point. Yet when flushing a file, if even one write request gets an error response like ENOSPC, we know some part of the data has not been written on the server, and close() will return the appropriate error to the program on the client. If a single write error is enough to cause close() to return an error, why bother sending all the other write requests for that file? If we get an error while flushing, couldn't that one flushing operation bail out early?
As I said, I'm not too familiar with the code, but AFAICT nfs_wb_all() will keep flushing everything, and afterwards nfs_file_flush() will check ctx->error. Perhaps ctx->error could be checked at some lower level, maybe in nfs_sync_inode_wait...

I suppose it's not technically wrong to try to flush all the pages of the file, but if the server file system is full then the behavior is at its worst. Also, if you happen to be on a slower link and have a big cache to flush, you're waiting around for very little gain.