To: David Chinner <dgc@xxxxxxx>
Subject: [xfs-masters] Re: Problems reading >1M files from same directory with nfs
From: Mark Seger <Mark.Seger@xxxxxx>
Date: Mon, 18 Jun 2007 20:35:45 -0400
Cc: xfs-masters@xxxxxxxxxxx, linux-xfs@xxxxxxxxxxx, Hank Jakiela <Hank.Jakiela@xxxxxx>
In-reply-to: <20070618232559.GF85884050@xxxxxxx>
References: <4676CFF9.8090805@xxxxxx> <20070618232559.GF85884050@xxxxxxx>
Reply-to: xfs-masters@xxxxxxxxxxx
Sender: xfs-masters-bounce@xxxxxxxxxxx
User-agent: Thunderbird 1.5.0.12 (Windows/20070509)
Thanks for the reply.  I guess my general question is: if this is 
indeed a memory issue, wouldn't you agree it's a bug if the server 
essentially becomes incapable of serving data?  Maybe I wasn't clear 
that as long as any of the clients are trying to do reads, the CPU 
essentially locks up at 25% utilization across 4 CPUs.  It's not until I 
kill all the readers that the server returns to normal.

Anyhow, I have some additional comments/questions below.

David Chinner wrote:
> On Mon, Jun 18, 2007 at 02:33:29PM -0400, Mark Seger wrote:
>   
>> First of all I wasn't sure who to send this to so I copied all the 
>> addresses in the 'maintainers' document...
>>     
>
> Just the mailing lists are sufficient.
>
>   
>> There appears to be a critical problem reading more than 1M xfs files in 
>> the same directory using nfs.  Shortly after beginning the reads, the 
>> CPU load hits a fairly steady 25% system load and the nfs read rate also 
>> drops well below 100 reads/sec, eventually falling below 10.  I'm doing 
>> these tests with 4096-byte files on a two-socket dual-core Opteron 
>> with 8GB ram.  See the attachment named 'hosed.txt', which shows what the 
>> cpu, disk, network and nfs were doing during the timeframe when things 
>> went bad.
>>     
>
> Sounds like you are running out of memory to cache the workload in.
> The readdir load indicates that you are probably running out of dentry/inode
> cache space, and so every lookup is having to re-read the inodes
> from disk, i.e. the readdirs and stats are forced to do real disk I/O.
>   
I hear what you're saying, but why then isn't the original stat slower?  
After creating the 1M+ files I can umount/mount the file system or 
simply reboot the server, ensuring nothing is cached, and can then either 
stat or read all the files in about 15 minutes, so why would rereading 
inodes from disk happen at such a slow rate?  At one point I just let my 
'slow reads' continue and they ran for over 24 hours, slowing to 
less than 10/sec, and never did finish.
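To put numbers on that: 1M files in roughly 15 minutes works out to 
about 1,000,000 / 900 ~= 1,100 files/sec from a completely cold cache, 
while the degraded reads fall below 10/sec, so plain inode re-reads 
from disk are off by more than two orders of magnitude from what the 
disks demonstrably deliver.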
> I'd suggest looking at /proc/slabinfo (slabtop helps here) and
> /proc/meminfo to determine how much of your working set of inodes
> are being held in cache and how quickly they are being recycled.
>   
One of the things I do monitor is memory and slab info, and I can even 
send you a detailed trace on a per-slab basis.  Are there any specific 
slabs I should be looking at?
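
In case it's useful, here's a rough sketch of the kind of per-slab 
sampling my collector does -- the slab names (xfs_inode, xfs_vnode, 
dentry_cache/dentry) are just my guess at the relevant ones on this 
kernel, not a statement of fact:

    # crude /proc/slabinfo sampler; slab names below are assumptions
    # for a 2.6 kernel running XFS, adjust to whatever yours reports
    import time

    WATCH = ('xfs_inode', 'xfs_vnode', 'dentry_cache', 'dentry')

    def sample():
        counts = {}
        for line in open('/proc/slabinfo'):
            f = line.split()
            # data lines look like: name active_objs num_objs objsize ...
            if f and f[0] in WATCH:
                counts[f[0]] = (int(f[1]), int(f[2]))
        return counts

    while True:
        snap = sample()
        print(' '.join('%s=%d/%d' % (n, a, t)
                       for n, (a, t) in sorted(snap.items())))
        time.sleep(5)

If the active xfs_inode/dentry counts keep getting recycled while the 
page cache grows, that would line up with your theory.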
> perhaps fiddling with /proc/sys/vm/vfs_cache_pressure will help
> keep inodes/dentryies in memory over page cache pages...
>   
Any suggestions for settings?
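
In the meantime I'll experiment along these lines -- the value 50 here 
is a pure guess on my part, 100 being the documented default:

    # drop vfs_cache_pressure below the default of 100 so the VM
    # prefers keeping dentries/inodes over page-cache pages
    # (run as root; 50 is a guess, not a recommendation)
    with open('/proc/sys/vm/vfs_cache_pressure', 'w') as f:
        f.write('50\n')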
-mark
> Cheers,
>
> Dave.
>   

