[Top] [All Lists]

Re: [PATCH 59/60] xfs: Add xfs_log_rlimit.c

To: Dave Chinner <david@xxxxxxxxxxxxx>
Subject: Re: [PATCH 59/60] xfs: Add xfs_log_rlimit.c
From: Mark Tinguely <tinguely@xxxxxxx>
Date: Tue, 25 Jun 2013 09:06:39 -0500
Cc: xfs@xxxxxxxxxxx
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <20130624222708.GV29338@dastard>
References: <1371617468-32559-1-git-send-email-david@xxxxxxxxxxxxx> <1371617468-32559-60-git-send-email-david@xxxxxxxxxxxxx> <51C8B984.3090400@xxxxxxx> <20130624222708.GV29338@dastard>
User-agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:9.0) Gecko/20120122 Thunderbird/9.0
On 06/24/13 17:27, Dave Chinner wrote:
On Mon, Jun 24, 2013 at 04:26:28PM -0500, Mark Tinguely wrote:
On 06/18/13 23:51, Dave Chinner wrote:
+        * 2) If the lsunit option is specified, a transaction requires 2 LSU
+        *    for the reservation because there are two log writes that can
+        *    require padding - the transaction data and the commit record which
+        *    are written separately and both can require padding to the LSU.
+        *    Consider that we can have an active CIL reservation holding 2*LSU,
+        *    but the CIL is not over a push threshold, in this case, if we
+        *    don't have enough log space for at one new transaction, which
+        *    includes another 2*LSU in the reservation, we will run into dead
+        *    loop situation in log space grant procedure. i.e.
+        *    xlog_grant_head_wait().
+        *
+        *    Hence the log size needs to be able to contain two maximally sized
+        *    and padded transactions, which is (2 * (2 * LSU + maxlres)).
+        *

Any thoughts on how we can separate the 2 * log stripe unit from the

You can't. The reservation, by definition, is the worse case log
space usage for the transaction. Therefore, it has to take into
account the LSU padding that may be necessary if this transaction is
committed by itself to the log (e.g. wsync operation)

I am thinking we should separate the extra 2*LSU allocated space from the individual pieces of the transaction (caused by xfs_trans_rolls and xfs_bmap_finish) and associate it with the transaction. Sync writes are for the transaction anyway and not the pieces. It is the multiplication of the (2*LSU) by each piece of the transaction that is the killer. A mkdir will have 1.5MB of log reserved space just for the possible LSU padding.

The cil can steal the left over current ticket space and the global LSU space.

The added extended attribute calls for parent inode pointers
(especially xfs_rename() where it could add up to one and remove up
to two attributes) is causing a huge multiplication cnt for

So you are adding a attribute reservations to
create/mkdir/rename/link transaction reservations? That doesn't seem
like a good idea, and it's contrary to the direction we want to head
with the transaction subsystem.  Can you post your patches so we
some some context to the question you are asking?

FWIW, the intended method of linking multiple independent
transactions together for parent pointers into an atomic commit is


While the text is somewhat out of date (talks about rolling
transactions and being able to replace them), it predated the
delayed logging implementation and hence doesn't take into account
the obvious, simple extension that can be made to the CIL to
implement this. i.e.  being able to hold the current CIL context
open for a number of transactions to take place before releasing it

The point of doing this is still relevant, however:

"This may also allow us to split some of the large compound
transactions into smaller, more self contained transactions. This
would reduce reservation pressure on log space in the common case
where all the corner cases in the transactions are not taken."

IOWs, rather than building new, large "aggregation" transactions, we
should be adding infrastructure to the CIL to allow atomic
transaction aggregation techniques to be used and then just using
the existing transaction infrastructure to implement operations such
as "create with ACL"....

I will have to think on this. I don't see how the allocation of log space for a new transaction while holding locks is a good thing.

FYI, we need this for all the operations that add attributes at
create time, such as default ACLs and DMAPI/DMF attributes....

Those multiplications would be killers on 256KiB log
stripe units.

Numbers, please?



 * Various log count values.
#define XFS_DEFAULT_LOG_COUNT           1
#define XFS_ITRUNCATE_LOG_COUNT         2
#define XFS_INACTIVE_LOG_COUNT          2
#define XFS_CREATE_LOG_COUNT            2
#define XFS_MKDIR_LOG_COUNT             3
#define XFS_SYMLINK_LOG_COUNT           3
#define XFS_REMOVE_LOG_COUNT            2
#define XFS_LINK_LOG_COUNT              2
#define XFS_RENAME_LOG_COUNT            2
#define XFS_WRITE_LOG_COUNT             2
#define XFS_ADDAFORK_LOG_COUNT          2
#define XFS_ATTRINVAL_LOG_COUNT         1
#define XFS_ATTRSET_LOG_COUNT           3
#define XFS_ATTRRM_LOG_COUNT            3

Even if you are creative, the multipliers add up quickly.
xfs_rename will do the directory ops. possibly a attibset and up to 2 attrrm and we have to hold the src and target inodes locks over all the operations.

Thanks for the ideas.


<Prev in Thread] Current Thread [Next in Thread>