
Re: [PATCH 3/8] xfs: make the log ticket transaction id random

To: Christoph Hellwig <hch@xxxxxxxxxxxxx>
Subject: Re: [PATCH 3/8] xfs: make the log ticket transaction id random
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Tue, 6 Apr 2010 09:39:10 +1000
Cc: xfs@xxxxxxxxxxx
In-reply-to: <20100403093156.GD20166@xxxxxxxxxxxxx>
References: <1270125691-29266-1-git-send-email-david@xxxxxxxxxxxxx> <1270125691-29266-4-git-send-email-david@xxxxxxxxxxxxx> <20100403093156.GD20166@xxxxxxxxxxxxx>
User-agent: Mutt/1.5.20 (2009-06-14)
On Sat, Apr 03, 2010 at 05:31:56AM -0400, Christoph Hellwig wrote:
> On Thu, Apr 01, 2010 at 11:41:26PM +1100, Dave Chinner wrote:
> > From: Dave Chinner <dchinner@xxxxxxxxxx>
> > 
> > The transaction ID that is written to the log for a transaction is
> > currently set by taking the lower 32 bits of the memory address of
> > the ticket structure.  This is not guaranteed to be unique as
> > tickets come from a slab and slots can be reallocated immediately
> > after being freed. As a result, there is no guarantee of uniqueness
> > in the ticket ID value.
> > 
> > Fix this by assigning a random number to the ticket ID field so that
> > it is extremely unlikely that duplicates will occur and remove the
> > possibility of transactions being mixed up during recovery due to
> > duplicate IDs.
>
> I already noticed that you use a random tid in your delayed logging
> patches.  But even a random number means we can get duplicate tids.

Agreed. However, random duplicates are much less likely, IMO, than
duplicate slab object addresses. The existing code is giving quite
regular duplicates according to the tracing I added, but I didn't
find any from the sampling I did with this patch....
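
To illustrate why address-derived IDs repeat so readily, here is a
rough userspace sketch. It is not the XFS code: struct ticket,
pool_init(), ticket_alloc() and ticket_free() are hypothetical names,
and the LIFO free list is a simplified stand-in for a slab cache.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a log ticket; not the real XFS structure. */
struct ticket {
	struct ticket	*next_free;	/* free-list link, used only while free */
	uint32_t	tid;		/* transaction ID written to the log */
};

#define POOL_SIZE 4

static struct ticket pool[POOL_SIZE];
static struct ticket *free_list;

/* Put every slot on a LIFO free list, as a simple slab-like cache might. */
static void pool_init(void)
{
	free_list = NULL;
	for (size_t i = 0; i < POOL_SIZE; i++) {
		pool[i].next_free = free_list;
		free_list = &pool[i];
	}
}

static struct ticket *ticket_alloc(void)
{
	struct ticket *t = free_list;

	assert(t != NULL);		/* sketch only: no growth on exhaustion */
	free_list = t->next_free;

	/* Address-derived scheme: tid is the low 32 bits of the address. */
	t->tid = (uint32_t)(uintptr_t)t;
	return t;
}

/* Return the slot to the head of the list; it is the next one handed out. */
static void ticket_free(struct ticket *t)
{
	t->next_free = free_list;
	free_list = t;
}
```

With this model, allocating a ticket, freeing it and allocating again
hands back the same slot, so two back-to-back transactions end up
carrying the same tid.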

> If we assign tids from a wrapping counter instead we can guarantee
> that they are unique as long as we don't have more than UINT_MAX
> transactions in the log, which is a limitation we could easily enforce.

Ideally, yes. However, I didn't want to introduce a global
monotonically increasing counter into the transaction code. I'm
seeing upwards of 50-60k transactions/s on my test box - that's
getting into the range where reliable global counters become
scalability limitations. I know this can be solved, but it is
somewhat complex and I'm not sure at this point that the complexity
is necessary. What do you think?
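
For reference, one way a counter could avoid becoming a hot spot is
to hand out IDs in batches, so the shared counter is only touched
once per batch. This is a minimal userspace sketch of that idea, not
something the patch does: struct tid_cache, tid_get() and TID_BATCH
are hypothetical names, and in the kernel the cache would presumably
be per-CPU.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define TID_BATCH 1024u		/* assumed batch size, chosen arbitrarily */

/* Shared counter; only touched once per TID_BATCH allocations. */
static _Atomic uint32_t global_next_tid;

/* Hypothetical per-CPU (or per-thread) cache of reserved IDs. */
struct tid_cache {
	uint32_t next;		/* next ID to hand out from this batch */
	uint32_t remaining;	/* IDs left in this batch */
};

static uint32_t tid_get(struct tid_cache *c)
{
	assert(c != NULL);

	if (c->remaining == 0) {
		/* Reserve a fresh batch; unsigned arithmetic wraps naturally. */
		c->next = atomic_fetch_add(&global_next_tid, TID_BATCH);
		c->remaining = TID_BATCH;
	}
	c->remaining--;
	return c->next++;
}
```

IDs handed out this way are distinct until the 32-bit counter wraps,
but they are no longer globally ordered across caches, and uniqueness
within the log would still depend on bounding the number of
transactions in the log as suggested above.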


Dave Chinner
