On Mon, Jan 07, 2013 at 12:03:21PM -0600, Mark Tinguely wrote:
> On 01/05/13 18:08, Dave Chinner wrote:
> >On Sat, Jan 05, 2013 at 02:34:15PM -0600, Mark Tinguely wrote:
> >>The back-end of xlog_cil_push() allows multiple push sequences
> >>to write to the xlog at the same time.
> >It does this by design, and has since day zero.
> >>This will cause problems
> >>for recovery and also could cause the xlog_cil_committed() callback
> >>to be called out of sequence.
> >Log recovery is supposed to be able to handle it just fine in that
> >recovery only replays up to the last checkpoint with a valid commit
> >record. Checkpoints that don't have valid commit records - no matter
> >the order they are written - will terminate recovery at the LSN of
> >the lowest entire commit.
> >>The xlog_cil_committed() callback misorder happens because the buffer that
> >>contains the sequence ticket is filled by another sequence push and the
> >>callback for the buffer write happens before the callback is placed onto
> >>that buffer.
> >I'm not sure I follow you here. xfs_log_done() takes a reference to
> >the iclog that the commit record is added to, and I/O cannot be
> >issued on that iclog until the reference count drops to zero. Hence
> >the sequence of writing the commit record, obtaining the commit_lsn,
> >adding the callbacks to the iclog and releasing the iclog are atomic
> >from an I/O perspective, and IO is only issued when the reference
> >count falls to zero.
> >And given that xlog_write() uses the same reference counting to
> >provide the same guarantees, I cannot see how concurrent in-memory
> >writes to the same iclog could cause IO completion callbacks to be
> >issued out of order.
> I will look again at the ic_refcnt, but the callback from cil push
> sequence #2 has jumped in front of cil push #1.
Look at the order of the callbacks on the iclog. iclog completion is
done in iclog ring order (see xlog_iodone/xlog_state_done_syncing/
xlog_state_do_callback), and we shouldn't be getting commit records
out of order into the iclogs as we serialise writing them. Hence we
should have both in-order writing and in-order completion, and
AFAICT, the callbacks on an iclog should be run oldest to newest, so
they should also be in order.
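
To make the expectation concrete, here is a toy userspace model of the two
properties described above: I/O on a buffer is only issued once the last
reference is dropped, and callbacks attached to the buffer run in the order
they were added. This is a sketch only. Every name in it (toy_iclog, toy_cb,
toy_hold, toy_release and so on) is made up for illustration and none of it
is the actual XFS code or its data structures.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: all names here are hypothetical, not XFS structures. */
struct toy_cb {
	struct toy_cb	*next;
	void		(*func)(void *arg);
	void		*arg;
};

struct toy_iclog {
	int		refcnt;		/* in-memory writers holding the buffer */
	struct toy_cb	*cb_head;	/* oldest callback */
	struct toy_cb	**cb_tail;	/* where the next callback is linked */
};

/* Records the order in which the callbacks actually ran. */
struct toy_trace {
	int	seq[8];
	int	n;
};

struct toy_ctx {
	struct toy_trace *trace;
	int		 seq;		/* checkpoint sequence number */
};

static void toy_iclog_init(struct toy_iclog *ic)
{
	ic->refcnt = 0;
	ic->cb_head = NULL;
	ic->cb_tail = &ic->cb_head;
}

/* A writer takes a reference before copying into the buffer. */
static void toy_hold(struct toy_iclog *ic)
{
	ic->refcnt++;
}

/* Append at the tail so the list stays in oldest-to-newest order. */
static void toy_add_callback(struct toy_iclog *ic, struct toy_cb *cb)
{
	cb->next = NULL;
	*ic->cb_tail = cb;
	ic->cb_tail = &cb->next;
}

/* Drop a reference; returns 1 only when I/O may now be issued. */
static int toy_release(struct toy_iclog *ic)
{
	assert(ic->refcnt > 0);
	return --ic->refcnt == 0;
}

/* I/O completion: detach the list, then run it from the head. */
static void toy_run_callbacks(struct toy_iclog *ic)
{
	struct toy_cb *cb = ic->cb_head;

	ic->cb_head = NULL;
	ic->cb_tail = &ic->cb_head;
	while (cb) {
		struct toy_cb *next = cb->next;

		cb->func(cb->arg);
		cb = next;
	}
}

/* Example callback: log which checkpoint sequence was committed. */
static void toy_committed(void *arg)
{
	struct toy_ctx *ctx = arg;

	if (ctx->trace->n < 8)
		ctx->trace->seq[ctx->trace->n++] = ctx->seq;
}
```

In this model, if two pushes both hold the buffer and sequence 1 adds its
callback before sequence 2, then toy_release() only reports that I/O can be
issued on the second release, and toy_run_callbacks() records 1 before 2.
A reordering would need one of those two steps to be violated, which is
what I'd want to see evidence of.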
If you can track down where the callbacks are getting out of order,
or find some way to reproduce such behaviour, I'm all ears because
it should not be happening.