On Wed, Oct 10, 2012 at 10:51:42AM +0200, Marcin Deranek wrote:
> We are running an XFS filesystem on one of our machines which is a big
> store (~3TB) of different data files (mostly images). Quite recently we
> experienced some performance problems - machine wasn't able to keep up
> with updates. After some investigation it turned out that open()
> syscalls (open for writing) were taking significantly more time than
> they should, e.g. 15-20ms vs 100-150us.
Which is clearly IO latency versus cache-hit latency.
> Some more info about our workload as I think it's important here:
> our XFS filesystem is exclusively used as data store, so we only
> read and write our data (we mostly write). When a new update comes in,
> it's written to a temporary file, e.g.
> When the file is completely stored, we move it to its final location, e.g.
> That means we create lots of files in the /mountpoint/some/path/.tmp
> directory, but the directory stays empty, as files are moved (rename()
> syscall) shortly after creation to a different directory on the same
> filesystem.
> The workaround I have found so far is to remove that directory
> (/mountpoint/some/path/.tmp in our case) along with its contents and
> re-create it. After this operation the open() syscall goes back down
> to 100-150us.
> Is this a known problem?
By emptying the directory, you are making it smaller and likely
causing it to be cached in memory again as new files are added to
it. Over time, blocks will be removed from the cache due to memory
pressure, and latencies will be seen again.
> Information regarding our system:
> CentOS 5.8 / kernel 2.6.18-308.el5 / kmod-xfs-0.4-2
Use a more recent distro. I reworked the metadata caching algorithms
a couple of years ago to avoid these sorts of problems with memory
pressure.