https://bugzilla.kernel.org/show_bug.cgi?id=16348
--- Comment #13 from Dave Chinner <david@xxxxxxxxxxxxx> 2010-07-13 11:10:36 ---
(In reply to comment #12)
> meta-data=/dev/sdc isize=256 agcount=2048, agsize=1668912 blks
> meta-data=/dev/sdb isize=256 agcount=2048, agsize=1668912 blks
> meta-data=/dev/sdd isize=256 agcount=2048, agsize=1668912 blks
> meta-data=/dev/sde isize=256 agcount=2048, agsize=1668912 blks
> meta-data=/dev/sdg isize=256 agcount=13, agsize=268435392 blks
> meta-data=/dev/sdf isize=256 agcount=26, agsize=268435392 blks
It's what I was expecting. The old filesystems were configured strangely with a
lot (2048) of relatively small allocation groups. The newer 13TB and 26TB
filesystems are using 1TB allocation groups, so are a lot more sensible in
configuration.
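As a rough check of those sizes, the arithmetic can be sketched as below. The
4096-byte block size is an assumption (bsize is not shown in the quoted
output), and the helper name ag_bytes is made up for illustration.

```python
# Assumption: 4096-byte filesystem blocks; bsize is not in the quoted output.
BLOCK_SIZE = 4096

# (agcount, agsize in blocks) copied from the quoted xfs_info lines.
FILESYSTEMS = {
    "/dev/sdc": (2048, 1668912),
    "/dev/sdb": (2048, 1668912),
    "/dev/sdd": (2048, 1668912),
    "/dev/sde": (2048, 1668912),
    "/dev/sdg": (13, 268435392),
    "/dev/sdf": (26, 268435392),
}


def ag_bytes(agsize_blocks, block_size=BLOCK_SIZE):
    """Size of one allocation group in bytes."""
    return agsize_blocks * block_size


if __name__ == "__main__":
    for dev, (agcount, agsize) in FILESYSTEMS.items():
        per_ag = ag_bytes(agsize)
        total = agcount * per_ag
        print(f"{dev}: {agcount} AGs x {per_ag / 2**30:.2f} GiB "
              f"= {total / 2**40:.2f} TiB")
```

Under that assumption the old filesystems have AGs of a bit over 6 GiB each,
while the newer ones have AGs just under 1 TiB.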
Basically the problem is that every iteration of the xfs inode shrinker will be
doing an xfs_perag_get/put on every AG in every filesystem. That means it may be
scanning >8000 AGs before finding an inode to reclaim. This is where the CPU is
being used.
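To put a number on that, here is a toy model of one shrinker pass. It is only
a sketch: the function name and the walk order are made up for illustration,
and each loop step merely stands in for one xfs_perag_get/put pair. The AG
counts come from the quoted xfs_info output.

```python
# AG counts per filesystem, taken from the quoted xfs_info output.
AG_COUNTS = {
    "sdc": 2048,
    "sdb": 2048,
    "sdd": 2048,
    "sde": 2048,
    "sdg": 13,
    "sdf": 26,
}


def ags_visited_until_reclaim(ag_counts, reclaimable):
    """Count per-AG lookups made before the first reclaimable AG is found.

    ag_counts:   mapping of filesystem name -> number of AGs
    reclaimable: set of (filesystem, agno) pairs holding reclaimable inodes

    The walk always starts at AG 0 of the first filesystem, which is the
    behaviour that makes the scan expensive in this model.
    """
    visited = 0
    for fs, agcount in ag_counts.items():
        for agno in range(agcount):
            visited += 1  # stands in for one xfs_perag_get/put pair
            if (fs, agno) in reclaimable:
                return visited
    return visited


TOTAL_AGS = sum(AG_COUNTS.values())

# Worst case: the only reclaimable inode sits in the last AG that is walked.
WORST_CASE = ags_visited_until_reclaim(AG_COUNTS, {("sdf", 25)})

if __name__ == "__main__":
    print("total AGs:", TOTAL_AGS)
    print("lookups in worst case:", WORST_CASE)
```

In this model the total is 4 * 2048 + 13 + 26 = 8231 AGs, and the worst case
touches all of them on a single pass.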
Per-filesystem shrinker callouts will help this, as will a variable reclaim
start AG. However, the overhead of aggregating across all those AGs won't go
away. I think I have a solution to that (propagate the reclaimable inode bit to
the per-ag radix tree) that will reduce the overhead of aggregation and reclaim
scanning quite a bit, however it is only an idea right now. Let me think on this
a bit....
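A minimal sketch of what I mean by propagating the reclaimable bit up to the
per-AG level, combined with a variable start AG. This is a model of the idea
only, not kernel code: the class and method names are hypothetical, and a
sorted list stands in for a tagged per-AG tree.

```python
import bisect


class PerAgIndex:
    """Toy stand-in for a per-AG index with a 'has reclaimable inodes' tag."""

    def __init__(self, agcount):
        self.agcount = agcount
        self._tagged = []  # sorted AG numbers that hold reclaimable inodes

    def set_reclaimable(self, agno):
        """Tag an AG when its first inode becomes reclaimable."""
        i = bisect.bisect_left(self._tagged, agno)
        if i == len(self._tagged) or self._tagged[i] != agno:
            self._tagged.insert(i, agno)

    def clear_reclaimable(self, agno):
        """Clear the tag once the AG has no reclaimable inodes left."""
        i = bisect.bisect_left(self._tagged, agno)
        if i < len(self._tagged) and self._tagged[i] == agno:
            del self._tagged[i]

    def next_reclaimable(self, start_agno):
        """First tagged AG at or after start_agno, wrapping around.

        Returns None when no AG is tagged. Untagged AGs are never visited,
        so the cost no longer grows with the total number of AGs walked.
        """
        if not self._tagged:
            return None
        i = bisect.bisect_left(self._tagged, start_agno)
        return self._tagged[i % len(self._tagged)]


if __name__ == "__main__":
    idx = PerAgIndex(agcount=2048)
    idx.set_reclaimable(1900)
    idx.set_reclaimable(7)
    start = 8  # variable reclaim start AG, e.g. where the last pass stopped
    print(idx.next_reclaimable(start))
```

With something like this, a reclaim pass jumps straight to an AG that has
work to do instead of doing a get/put on every AG along the way.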