Hi,
On Mon, Nov 28, 2011 at 03:17:36AM -0500, Christoph Hellwig wrote:
> Apply the scheme used in log_regrant_write_log_space to wake up any other
> threads waiting for log space before the newly added one to
> log_regrant_write_log_space as well, and factor the code into readable
> helpers. For each of the queues we have add two helpers:
>
> - one to try to wake up all waiting threads. This helper will also be
> usable by xfs_log_move_tail once we remove the current opportunistic
> wakeups in it.
> - one to sleep on t_wait until enough log space is available, loosely
> modelled after Linux waitqueues.
>
> And use them to reimplement the guts of log_regrant_write_log_space and
> log_regrant_write_log_space. These two function now use one and the same
> algorithm for waiting on log space instead of subtly different ones before,
> with an option to completely unify them in the near future.
>
> Also move the filesystem shutdown handling to the common caller given
> that we had to touch it anyway.
>
> Based on hard debugging and an earlier patch from
> Chandra Seetharaman <sekharan@xxxxxxxxxx>.
>
> Signed-off-by: Christoph Hellwig <hch@xxxxxx>
I'd like to make sure that I understand the race that Chandra
debugged and reported.
2499 STATIC int
2500 xlog_grant_log_space(xlog_t *log,
2501 xlog_ticket_t *tic)
2502 {
2503 int free_bytes;
2504 int need_bytes;
...
2517 /* something is already sleeping; insert new transaction at end */
2518 if (!list_empty_careful(&log->l_reserveq)) {
2519 spin_lock(&log->l_grant_reserve_lock);
2520 /* recheck the queue now we are locked */
2521 if (list_empty(&log->l_reserveq)) {
2522 spin_unlock(&log->l_grant_reserve_lock);
2523 goto redo;
2524 }
2525 list_add_tail(&tic->t_queue, &log->l_reserveq);
2526
2527 trace_xfs_log_grant_sleep1(log, tic);
2528
2529 /*
2530 * Gotta check this before going to sleep, while we're
2531 * holding the grant lock.
2532 */
2533 if (XLOG_FORCED_SHUTDOWN(log))
2534 goto error_return;
2535
2536 XFS_STATS_INC(xs_sleep_logspace);
2537 xlog_wait(&tic->t_wait, &log->l_grant_reserve_lock);
2538
2539 /*
2540 * If we got an error, and the filesystem is shutting down,
2541 * we'll catch it down below. So just continue...
2542 */
2543 trace_xfs_log_grant_wake1(log, tic);
2544 }
2545
2546 redo:
2547 if (XLOG_FORCED_SHUTDOWN(log))
2548 goto error_return_unlocked;
2549
2550 free_bytes = xlog_space_left(log, &log->l_grant_reserve_head);
2551 if (free_bytes < need_bytes) {
2552 spin_lock(&log->l_grant_reserve_lock);
2553 if (list_empty(&tic->t_queue))
2554 list_add_tail(&tic->t_queue, &log->l_reserveq);
2555
2556 trace_xfs_log_grant_sleep2(log, tic);
2557
2558 if (XLOG_FORCED_SHUTDOWN(log))
2559 goto error_return;
2560
2561 xlog_grant_push_ail(log, need_bytes);
2562
2563 XFS_STATS_INC(xs_sleep_logspace);
2564 xlog_wait(&tic->t_wait, &log->l_grant_reserve_lock);
2565
2566 trace_xfs_log_grant_wake2(log, tic);
2567 goto redo;
2568 }
2569
2570 if (!list_empty(&tic->t_queue)) {
2571 spin_lock(&log->l_grant_reserve_lock);
2572 list_del_init(&tic->t_queue);
2573 spin_unlock(&log->l_grant_reserve_lock);
2574 }
So the race that we're looking at here is:
process A was added to the reserve queue at either 2525 or 2554 and, pushes the
AIL at 2561,
xfsaild frees up enough log space for process A (possibly B?), eventually
xfs_log_move_tail is called to wake process A,
process A wakes at line 2564, and he is on the reserveq already,
process B sees that there are tickets on the queue at 2518 and gets the grant
reserve lock at 2519,
process A spins at 2571 waiting for the grant reserve lock,
process B adds itself to the queue at 2525,
process B drops the grant reserve lock and goes to sleep at 2537
process A takes the grant reserve lock at 2571 and removes it's ticket from the
list.
...and there is nothing to wake process B until the ail is pushed by
some other process.
Is that about right?
Thanks,
Ben
|