On Mon, Jun 18, 2007 at 08:35:45PM -0400, Mark Seger wrote:
> Thanks for the reply. I guess my general question is that if this is
> indeed a memory issue, wouldn't you agree that it's a bug if the server
> essentially becomes incapable of servicing data?
Yup.
> Maybe I wasn't clear
> that as long as any of the clients are trying to do reads, the cpu
> essentially locks up at 25% utilization across 4 cpus. It's not until I
> kill all the readers that the server returns to normal.
All the nfsds are reading a single directory, and there's a single
sleeping lock for that directory (i_mutex) that they all contend on.
Hence one CPU busy maximum, i.e. the 25% across 4 cpus you are seeing.
> >Sounds like you are running out of memory to cache the workload in.
> >The readdir load indicates that you are probably running out of
> >dentry/inode
> >cache space, and so every lookup is having to re-read the inodes
> >from disk. i.e. readdir and stat are necessary.
> >
> I hear what you're saying, but why then isn't the original stat slower?
Because memory reclaim can effectively put random holes in the cache.
Hence the second read becomes a random I/O workload instead of a
more sequential workload where readahead can hide most latencies.
> After creating the 1M+ files I can umount/mount the file system or
> simply reboot the server, assuring nothing is cached and can either stat
> or read all the files in about 15 minutes so why would rereading inodes
> from disk happen at such a slow rate?
Because reading into empty memory in a sequential manner is much
faster than filling random holes in an already full cache that
may be thrashing....
> >I'd suggest looking at /proc/slabinfo (slabtop helps here) and
> >/proc/meminfo to determine how much of your working set of inodes
> >are being held in cache and how quickly they are being recycled.
> >
> one of the things I do monitor is memory and slab info and can even send
> you a detailed trace on a per slab basis. are there any specific slabs
> I should be looking at?
# egrep '[xrdb][fanu][sdnf][i_f]' /proc/slabinfo
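If the raw lines are hard to compare between samples, something like the
following might help. It is only a rough sketch: slab_summary is a made-up
helper name, it applies the same pattern as the egrep above to the slab name
only, and it assumes the slabinfo 2.x column order (name, active_objs,
num_objs, objsize). The size column is just num_objs * objsize, so treat it
as an approximation.

```shell
# slab_summary [FILE]
# Hypothetical helper: for each slab whose name matches the pattern,
# print name, active objects, total objects and approximate size in KiB.
# Assumes the slabinfo 2.x layout: $1=name $2=active_objs $3=num_objs $4=objsize
slab_summary() {
    awk '$1 ~ /[xrdb][fanu][sdnf][i_f]/ {
        printf "%-24s %10d %10d %10d KiB\n", $1, $2, $3, $3 * $4 / 1024
    }' "${1:-/proc/slabinfo}"
}

# Example (reading /proc/slabinfo usually needs root):
# slab_summary
```

Running it every few seconds while the readers are active should show
whether the object counts are being held or recycled.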
> >perhaps fiddling with /proc/sys/vm/vfs_cache_pressure will help
> >keep inodes/dentries in memory over page cache pages...
> >
> any suggestions for settings?
Whatever is suggested in Documentation/filesystems/proc.txt for keeping
1.5-2x more dentries/inodes around under memory pressure.
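For reference, the default is 100, and values below 100 make the kernel
prefer to retain dentries and inodes. A minimal sketch of changing it is
below; set_cache_pressure is a made-up helper name, and the 50 in the example
is purely an illustrative starting point, not a figure taken from proc.txt.

```shell
# set_cache_pressure VALUE [FILE]
# Hypothetical helper: print the current setting, write VALUE, print it back.
# FILE defaults to the real sysctl knob; writing to it needs root.
set_cache_pressure() {
    knob="${2:-/proc/sys/vm/vfs_cache_pressure}"
    echo "old: $(cat "$knob")"
    echo "$1" > "$knob"
    echo "new: $(cat "$knob")"
}

# Example (as root); 50 is an assumed value to experiment from:
# set_cache_pressure 50
```

The setting does not survive a reboot, so it is safe to experiment with.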
Also, when it is thrashing, can you try these combinations:
# echo 1 > /proc/sys/vm/drop_caches
# echo 2 > /proc/sys/vm/drop_caches
# echo 3 > /proc/sys/vm/drop_caches
And see if any of them improve the throughput....
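For reference, 1 frees the page cache, 2 frees dentries and inodes, and 3
frees both. Dirty objects are not freeable, so a sync first makes the drop
more effective. If it is easier to script, a rough sketch follows;
drop_caches_in_turn is a made-up helper name, and the pause between modes is
an assumption about how long the server needs to get back into the thrashing
state, so adjust it to whatever your monitoring shows.

```shell
# drop_caches_in_turn PAUSE_SECONDS [FILE]
# Hypothetical helper: try modes 1, 2 and 3 in order, with a timestamp for
# each so the drop can be lined up against throughput graphs afterwards.
# FILE defaults to the real knob; writing to it needs root.
drop_caches_in_turn() {
    pause="$1"
    knob="${2:-/proc/sys/vm/drop_caches}"
    for mode in 1 2 3; do
        sync                      # flush dirty data so more of it is freeable
        echo "$mode" > "$knob"
        echo "$(date '+%H:%M:%S') drop_caches=$mode"
        sleep "$pause"            # let the workload settle before the next mode
    done
}

# Example (as root), waiting 5 minutes between modes (assumed interval):
# drop_caches_in_turn 300
```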
Cheers,
Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group