[PATCH 3/4] xfs: add quota-driven speculative preallocation throttling
Brian Foster
bfoster at redhat.com
Fri Dec 14 10:13:00 CST 2012
On 12/12/2012 09:25 PM, Dave Chinner wrote:
> On Wed, Dec 05, 2012 at 11:47:57AM -0500, Brian Foster wrote:
...
>>
>> Signed-off-by: Brian Foster <bfoster at redhat.com>
>
> I'm having trouble determining what the algorithm is supposed to be
> and what might be bugs in the algorithm....
>
> These are notes, somewhat incoherent, but I'll post them anyway
> because i think they convey my concerns and solutions well enough.
>
Ok...
> Cheers,
>
> Dave.
>
> ----
>
> Describe the algorithm to ensure I have it right:
>
> calculate preallocation size (alloc_blocks)
> uses file size to determine size
>
> determine ENOSPC throttling reduction (shift)
>
> calculate maximum quota prealloc amount allowed
> walk each dquot on inode
> check hard limit
> => over = no prealloc
> check soft limit
> => none = prealloc unmodified
> check over soft limit
> => no = prealloc unmodified
> calculate "throttle percentage"
> calculate max prealloc
> use minimum prealloc value returned
>
> alloc_blocks = MIN(alloc_blocks, quota_alloc_blocks)
>
> apply ENOSPC throttle (shift)
>
>
> - prealloc size being overridden by quota throttling, and then the
> ENOSPC throttle is applied to that.
>
That sounds accurate to me.
> - the overall algorithm looks good to me, that means my problems are
> with the implementation....
>
> - it calculates stuff dynamically that could be set once in
> the struct xfs_dquot on initialisation and whenever limits
> are changed. i.e. the "percentage to throttle". This
> should be carried by xfs_dquot as it simplifies
> the logic here.
>
Indeed.
> - if there is a hard limit but no soft limit, it *always*
> throttles preallocation to the default percentage even
> where there is lots of space available for the full
> prealloc.
>
I'm not following you here. xfs_prealloc_dquot_max() should return the
maximum allowed preallocation for a quota by calculating some percentage
of the free space in the quota (regardless of the existence of a soft
limit; that serves as a throttling watermark and percentage modifier).
xfs_prealloc_quota_max() just extends that across all quota types and
returns the most limiting value. For an inode with plenty of space in
its quota set, we should get a large value back for max_quota_prealloc,
which is then ignored because alloc_blocks doesn't violate that limit.
(Regardless, I don't think this changes your design comments that follow.)
> - it returns a block count based on limits, or -1 for no
> throttling. I'd prefer a pair of functions - one to check
> whether throttling is needed, and one to calculate the
> throttling parameters
>
> - should use xfs_this_quota_on() to drive quota
> checks completely inside throttle check. will make
> the code much cleaner
>
> - check function can be boolean
>
> - calc function should return both raw space
> available and shift values so the global prealloc
> values can be overridden independently. i.e.
> allows quota throttling to work even when overall
> prealloc is less than the maximum quota would
> allow.
>
>
> Rough code:
>
> need_throttle()
> {
> if (!xfs_this_quota_on(mp, type))
> return false;
>
> dq = xfs_inode_dquot(ip, type);
>
> /* no hard limit, no throttle */
> if (!dq->q_hard_limit)
> return false;
>
> /* over hard limit, always throttle */
> if (dq->q_res_bcount > dq->q_hard_limit)
> return true;
>
> /*
> * Under soft limit, no throttle.
> *
> * Note: we always have a soft limit for prealloc,
> * calculated at dquot instantiation or limit change
> */
> if (dq->q_res_bcount + alloc_blocks < dq->q_soft_limit)
> return false;
>
> /* between soft limit and hard limit, need to throttle */
> return true;
> }
>
> - needs struct xfs-dquot to be initialised appropriately and quota
> limit changes to handle changes correctly.
>
> - allows soft limit defaults to be set in memory if they aren't on
> disk. i.e. default throttling values will be no different in
> implementation to on-disk limits.
>
This comment is not quite clear to me. I suspect it relates to the
pseudocode comment above with regard to a default soft limit. To this
point, I'm interpreting your design proposal as something like the
following, at a high-level:
- xfs_dquot grows a (in memory) table of low space limits. This table is
populated in when the quota is read from disk, created and/or the input
parameters (hard/soft limit) change.
- The soft limit as a percentage modifier goes away by nature of using
the 3-5% logarithmic throttle (shift).
- The soft limit as a prealloc throttling trigger remains, just
implemented as a separate function as depicted above.
There are no on-disk changes proposed, correct? Are you suggesting we
set a default soft limit value on all quotas with a hard limit?
> calc_throttle()
> {
> dq = xfs_inode_dquot(ip, type);
>
> freesp = dq->q_hard_limit - dq->q_res_bcount;
>
> if (freesp < dq->q_low_space[XFS_LOWSP_5_PCNT]) {
> shift = 2;
> if (freesp < dq->q_low_space[XFS_LOWSP_4_PCNT])
> shift++;
> if (freesp < dq->q_low_space[XFS_LOWSP_3_PCNT])
> shift++;
> if (freesp < dq->q_low_space[XFS_LOWSP_2_PCNT])
> shift++;
> if (freesp < dq->q_low_space[XFS_LOWSP_1_PCNT])
> shift++;
> }
>
> /* only overwrite current values if the result is a smaller prealloc */
> if ((freesp >> shift) >= (*qblocks >> *qshift))
> return;
>
> *qblocks = freesp;
> *qshift = shift;
> }
>
> - similar shift table to the ENOSPC code for a logarithmic mapping
> rather than a linear mapping.
> - probably doesn't need 5 steps, 3 steps that do shift += 2 is
> probably sufficient and would reduce per-dquot memory overhead.
>
> xfs_iomap_prealloc_size()
> {
> .....
>
> qblocks = alloc_blocks;
> qshift = 0;
>
> if (need_throttle(ip, XFS_DQ_USER, alloc_blocks)
> calc_throttle(ip, XFS_DQ_USER, &qblocks, &qshift);
>
> if (need_throttle(ip, XFS_DQ_GROUP, alloc_blocks)
> calc_throttle(ip, XFS_DQ_GROUP, &qblocks, &qshift);
>
> if (need_throttle(ip, XFS_DQ_PROJ, alloc_blocks)
> calc_throttle(ip, XFS_DQ_PROJ, &qblocks, &qshift);
>
> /*
> * The final size of the preallocation is the smaller of the
> * whole filesystem prealloc size and the quota prealloc
> * size. i.e. whichever entity has the least space available
> * for allocation determines the maximum preallocation size.
> *
> * The final throttling level is the larger of the ENOSPC
> * and quota throttles. i.e. which ever is closer to their
> * respective space limit determines how much we throttle
> * by.
> */
> alloc_blocks = MIN(qblocks, alloc_blocks)
> shift = MAX(qshift, shift)
> ....
>
... and this all makes sense.
I'm a little unsure about applying a shift selected for one limit
against the preallocation size of another, but I suppose it can't hurt
to be aggressive. This design should bubble up all the relevant
parameters to a single point anyways, so that should be easier to reason
about and measure when I have some code.
Thanks for the review Dave!
Brian
More information about the xfs
mailing list