[PATCH 59/60] xfs: Add xfs_log_rlimit.c
Mark Tinguely
tinguely at sgi.com
Tue Jun 25 09:06:39 CDT 2013
On 06/24/13 17:27, Dave Chinner wrote:
> On Mon, Jun 24, 2013 at 04:26:28PM -0500, Mark Tinguely wrote:
>> On 06/18/13 23:51, Dave Chinner wrote:
>>> + * 2) If the lsunit option is specified, a transaction requires 2 LSU
>>> + * for the reservation because there are two log writes that can
>>> + * require padding - the transaction data and the commit record which
>>> + * are written separately and both can require padding to the LSU.
>>> + * Consider that we can have an active CIL reservation holding 2*LSU,
>>> + * but the CIL is not over a push threshold, in this case, if we
>>> + * don't have enough log space for at one new transaction, which
>>> + * includes another 2*LSU in the reservation, we will run into dead
>>> + * loop situation in log space grant procedure. i.e.
>>> + * xlog_grant_head_wait().
>>> + *
>>> + * Hence the log size needs to be able to contain two maximally sized
>>> + * and padded transactions, which is (2 * (2 * LSU + maxlres)).
>>> + *
>>
>> Any thoughts on how we can separate the 2 * log stripe unit from the
>> reservation.
>
> You can't. The reservation, by definition, is the worse case log
> space usage for the transaction. Therefore, it has to take into
> account the LSU padding that may be necessary if this transaction is
> committed by itself to the log (e.g. wsync operation)
I am thinking we should separate the extra 2*LSU allocated space from
the individual pieces of the transaction (caused by xfs_trans_rolls and
xfs_bmap_finish) and associate it with the transaction. Sync writes are
for the transaction anyway and not the pieces. It is the multiplication
of the (2*LSU) by each piece of the transaction that is the killer.
A mkdir will have 1.5MB of log reserved space just for the possible LSU
padding.
The cil can steal the left over current ticket space and the global LSU
space.
>
>> The added extended attribute calls for parent inode pointers
>> (especially xfs_rename() where it could add up to one and remove up
>> to two attributes) is causing a huge multiplication cnt for
>> reservation.
>
> So you are adding a attribute reservations to
> create/mkdir/rename/link transaction reservations? That doesn't seem
> like a good idea, and it's contrary to the direction we want to head
> with the transaction subsystem. Can you post your patches so we
> some some context to the question you are asking?
>
> FWIW, the intended method of linking multiple independent
> transactions together for parent pointers into an atomic commit is
> this:
>
> http://xfs.org/index.php/Improving_Metadata_Performance_By_Reducing_Journal_Overhead#Atomic_Multi-Transaction_Operations
>
> While the text is somewhat out of date (talks about rolling
> transactions and being able to replace them), it predated the
> delayed logging implementation and hence doesn't take into account
> the obvious, simple extension that can be made to the CIL to
> implement this. i.e. being able to hold the current CIL context
> open for a number of transactions to take place before releasing it
> again.
>
> The point of doing this is still relevant, however:
>
> "This may also allow us to split some of the large compound
> transactions into smaller, more self contained transactions. This
> would reduce reservation pressure on log space in the common case
> where all the corner cases in the transactions are not taken."
>
> IOWs, rather than building new, large "aggregation" transactions, we
> should be adding infrastructure to the CIL to allow atomic
> transaction aggregation techniques to be used and then just using
> the existing transaction infrastructure to implement operations such
> as "create with ACL"....
I will have to think on this. I don't see how the allocation of log
space for a new transaction while holding locks is a good thing.
>
> FYI, we need this for all the operations that add attributes at
> create time, such as default ACLs and DMAPI/DMF attributes....
>
>> Those multiplications would be killers on 256KiB log
>> stripe units.
>
> Numbers, please?
>
> Cheers,
>
> Dave.
/*
* Various log count values.
*/
#define XFS_DEFAULT_LOG_COUNT 1
#define XFS_DEFAULT_PERM_LOG_COUNT 2
#define XFS_ITRUNCATE_LOG_COUNT 2
#define XFS_INACTIVE_LOG_COUNT 2
#define XFS_CREATE_LOG_COUNT 2
#define XFS_MKDIR_LOG_COUNT 3
#define XFS_SYMLINK_LOG_COUNT 3
#define XFS_REMOVE_LOG_COUNT 2
#define XFS_LINK_LOG_COUNT 2
#define XFS_RENAME_LOG_COUNT 2
#define XFS_WRITE_LOG_COUNT 2
#define XFS_ADDAFORK_LOG_COUNT 2
#define XFS_ATTRINVAL_LOG_COUNT 1
#define XFS_ATTRSET_LOG_COUNT 3
#define XFS_ATTRRM_LOG_COUNT 3
Even if you are creative, the multipliers add up quickly.
xfs_rename will do the directory ops. possibly a attibset and up to 2
attrrm and we have to hold the src and target inodes locks over all the
operations.
Thanks for the ideas.
--Mark.
More information about the xfs
mailing list