On 9/26/2013 9:23 AM, Eric Sandeen wrote:
> On 9/26/13 8:30 AM, Ronnie Tartar wrote:
>> Stan, looks like I have directory fragmentation problem.
>> xfs_db> frag -d
>> actual 65057, ideal 4680, fragmentation factor 92.81%
>> What is the best way to fix this?
> We should just get rid of that command, TBH.
> So your dirs are in an average of 65057/4680 or about 14 fragments each.
> Really not that bad, in the scope of things.
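The arithmetic behind those two numbers can be checked directly (the figures are the ones from the `frag -d` output above; the factor xfs_db reports is 1 - ideal/actual):

```shell
# actual/ideal extent counts from the xfs_db "frag -d" output above
actual=65057 ideal=4680
# fragmentation factor = 1 - ideal/actual; average fragments per dir = actual/ideal
awk -v a="$actual" -v i="$ideal" \
    'BEGIN { printf "factor=%.2f%% avg_fragments=%.1f\n", (1 - i/a) * 100, a/i }'
# → factor=92.81% avg_fragments=13.9
```

This is why the 92.81% figure looks scarier than it is: the factor saturates toward 100% quickly, while the average fragment count per directory stays modest.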
> I'd imagine that this could be more of your problem:
>> folders are image folders that have anywhere between 5 to 10 million images
>> in each folder.
> at 10 million entries in a dir, you're going to start slowing down on inserts
> due to btree management. But that probably doesn't account for multiple
> seconds for a single file.
> So really, it's not clear *what* is slow.
>> It takes about 2.5 to 3.5 seconds to write a single file.
> strace with timing would be a very basic way to get a sense of what is slow;
> is it the file open/create? How big is the file, are you doing buffered or
> direct IO?
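A minimal form of that strace run might look like the following sketch (the file name here is just an illustration; in practice you would attach to the real writer process with `strace -p`):

```shell
# -tt: wall-clock timestamps; -T: time spent inside each syscall, shown in <>.
# sample.tmp is a stand-in for one of the real image files being written.
strace -tt -T -o create.trace sh -c 'echo test > sample.tmp'
# See which call dominates: the open/create, the write, or something else.
grep -E 'open|write|close' create.trace
rm -f sample.tmp create.trace
```

The `<seconds>` value at the end of each traced line shows where the time went, which answers the open-vs-write question directly.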
> On a more modern OS you could do some of the tracing suggested in
> but some sort of profiling (oprofile, perhaps) might tell you where time is
> being spent in the kernel.
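On such a kernel, one option (perf here is a substitution for the oprofile suggestion, and assumes perf is installed) would be to sample all CPUs while the slow writes are happening:

```shell
# Sample all CPUs (-a) with call graphs (-g) for 10 seconds while the
# workload runs, then show the hottest kernel/user symbols.
perf record -a -g -o perf.data -- sleep 10
perf report -i perf.data --stdio --sort symbol | head -20
```

If most samples land in XFS btree or log code, that points back at the directory size; if they land in the block layer, it points at the storage.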
> When you say suddenly started, was it after a kernel upgrade or other change?
Eric is an expert on this, much more knowledgeable than me. And somehow
I missed the 5-10 million files per dir. Maybe you have multiple issues
here adding up to large delays. In addition to the steps Eric
recommends, it can't hurt to go ahead and take a look at the free space
map. Depending on how the filesystem has aged this could be a factor,
such as being 90%+ full at one time, and then lots of files being deleted.
# xfs_db -r -c freesp /dev/[device]
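The same command can be tried risk-free on a scratch image first (this assumes xfsprogs is installed; xfs_db reads the image directly, so no root or mounting is needed). The `-s` flag adds a summary of total free extents and average free extent size, which is the quickest way to spot free-space fragmentation:

```shell
# Make a small throwaway XFS image and look at its free space histogram.
truncate -s 512M scratch.img
mkfs.xfs -q scratch.img
xfs_db -r -c 'freesp -s' scratch.img
rm -f scratch.img
```

On an aged filesystem, a histogram dominated by small extents (and a low average free extent size) would support the fill-then-delete theory above.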