
Re: Still seeing hangs in xlog_grant_log_space

To: Juerg Haefliger <juergh@xxxxxxxxx>
Subject: Re: Still seeing hangs in xlog_grant_log_space
From: Ben Myers <bpm@xxxxxxx>
Date: Mon, 21 May 2012 12:11:37 -0500
Cc: xfs@xxxxxxxxxxx
In-reply-to: <CADLDEKssiOCVRknW3hYtxDxYHSyGr6qfepfai+UymsD6zMGopw@xxxxxxxxxxxxxx>
References: <CADLDEKs6oMDA-6OhmcFxyRoBVpduKtSput=53TQGn9NCAOXC1Q@xxxxxxxxxxxxxx> <20120426230738.GB9541@dastard> <CADLDEKuKLeYiqhQW0E9g_bS0VXoxPGPOck3N004Pxg4_Opbzow@xxxxxxxxxxxxxx> <20120427110922.GF9541@dastard> <CADLDEKtUHAGcOPT1jtcvyJVk+zsoL5_thYFtHJYs+w=6EGuVSA@xxxxxxxxxxxxxx> <CADLDEKs4YbNzj2c0HKHwSdUfKy0efdQRe1rOsWDkWUgd+BOGHw@xxxxxxxxxxxxxx> <20120507171908.GA16881@xxxxxxx> <CADLDEKvgT_FcGhJKoPaQv0mh_Jqdaqu8SYatc9xxU7vOY217YQ@xxxxxxxxxxxxxx> <20120518171959.GQ16099@xxxxxxx> <CADLDEKssiOCVRknW3hYtxDxYHSyGr6qfepfai+UymsD6zMGopw@xxxxxxxxxxxxxx>
User-agent: Mutt/1.5.20 (2009-06-14)
Hey Juerg,

On Sat, May 19, 2012 at 09:28:55AM +0200, Juerg Haefliger wrote:
> > On Wed, May 09, 2012 at 09:54:08AM +0200, Juerg Haefliger wrote:
> >> > On Sat, May 05, 2012 at 09:44:35AM +0200, Juerg Haefliger wrote:
> >> >> Did anybody have a chance to look at the data?
> >> >
> >> > https://bugs.launchpad.net/ubuntu/+source/linux/+bug/979498
> >> >
> >> > Here you indicate that you have created a reproducer.  Can you post it 
> >> > to the list?
> >>
> >> Canonical attached them to the bug report that they filed yesterday:
> >> http://oss.sgi.com/bugzilla/show_bug.cgi?id=922
> >
> > I'm interested in understanding to what extent the hang you see in
> > production on 2.6.38 is similar to the hang of the reproducer.  Mark is
> > seeing a situation where there is nothing on the AIL and everything is
> > clogged up in the CIL; others are seeing items on the AIL that don't
> > seem to be making progress.  Could you provide a dump or traces from a
> > hang on a filesystem with a normal-sized log?  Can the reproducer hit
> > the hang eventually without resorting to the tiny log?
> I'm not certain that the reproducer hang is identical to the
> production hang. One difference that I've noticed is that a reproducer
> hang can be cleared with an emergency sync while a production hang
> can't. I'm working on trying to get a trace from a production machine.

We hit this on a filesystem with a regular-sized log over the weekend.  If you
see this again in production, could you gather up the task states?

echo t > /proc/sysrq-trigger

Mark and I have been looking at the dump.  There are a few interesting items
to point out.

1) xfs_sync_worker is blocked trying to get a log reservation:

PID: 25374  TASK: ffff88013481c6c0  CPU: 3   COMMAND: "kworker/3:83"
 #0 [ffff88013481fb50] __schedule at ffffffff813aacac
 #1 [ffff88013481fc98] schedule at ffffffff813ab0c4
 #2 [ffff88013481fca8] xlog_grant_head_wait at ffffffffa0347b78 [xfs]
 #3 [ffff88013481fcf8] xlog_grant_head_check at ffffffffa03483e6 [xfs]
 #4 [ffff88013481fd38] xfs_log_reserve at ffffffffa034852c [xfs]
 #5 [ffff88013481fd88] xfs_trans_reserve at ffffffffa0344e64 [xfs]
 #6 [ffff88013481fdd8] xfs_fs_log_dummy at ffffffffa02ec138 [xfs]
 #7 [ffff88013481fdf8] xfs_sync_worker at ffffffffa02f7be4 [xfs]
 #8 [ffff88013481fe18] process_one_work at ffffffff8104c53b
 #9 [ffff88013481fe68] worker_thread at ffffffff8104f0e3
#10 [ffff88013481fee8] kthread at ffffffff8105395e
#11 [ffff88013481ff48] kernel_thread_helper at ffffffff813b3ae4

This means that it is not in a position to push the AIL.  It is clear that the
AIL has plenty of entries which can be pushed.
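
For reference, here is a rough sketch of the xfs_sync_worker path involved,
paraphrased from the 3.x-era fs/xfs/xfs_sync.c (not verbatim; details vary by
kernel version):

STATIC void
xfs_sync_worker(
        struct work_struct      *work)
{
        struct xfs_mount        *mp = container_of(to_delayed_work(work),
                                        struct xfs_mount, m_sync_work);

        if (!(mp->m_flags & XFS_MOUNT_RDONLY)) {
                /*
                 * The dummy transaction needs a log reservation, so with
                 * the log full the worker sleeps in xlog_grant_head_wait()
                 * right here...
                 */
                if (xfs_log_need_covered(mp))
                        xfs_fs_log_dummy(mp);

                /*
                 * ...and never reaches the AIL push that would clean items
                 * out of the log and make reservation space available.
                 */
                xfs_ail_push_all(mp->m_ail);
        }

        /* requeue for the next syncd interval */
        xfs_syncd_queue_sync(mp);
}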

crash> xfs_ail 0xffff88022112b7c0
struct xfs_ail {
  xa_ail = {
    next = 0xffff880144d1c318,
    prev = 0xffff880170a02078
  },
  xa_target = 0x1f00003063,
  ...

Here's the first item on the AIL:

struct xfs_log_item_t {
  li_ail = {
    next = 0xffff880196ea0858,
    prev = 0xffff88022112b7d0
  },
  li_lsn = 0x1f00001c63,                <--- less than xa_target
  li_desc = 0x0,
  li_mountp = 0xffff88016adee000,
  li_ailp = 0xffff88022112b7c0,
  li_type = 0x123b,
  li_flags = 0x1,
  li_bio_list = 0xffff88016afa5cb8,
  li_cb = 0xffffffffa034de00 <xfs_istale_done>,
  li_ops = 0xffffffffa035f620,
  li_cil = {
    next = 0xffff880144d1c368,
    prev = 0xffff880144d1c368
  },
  li_lv = 0x0,
  li_seq = 0x3b
}

So if xfs_sync_worker were not blocked on log reservation, it would be able to
push these items.
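
For context, the push decision in xfsaild_push() comes down to an LSN
comparison against xa_target, roughly like this (paraphrased sketch;
xfsaild_push_item() is an illustrative name, the per-item push is open-coded
in kernels of this vintage):

lip = xfs_trans_ail_cursor_first(ailp, &cur, ailp->xa_last_pushed_lsn);
while (lip != NULL && XFS_LSN_CMP(lip->li_lsn, ailp->xa_target) <= 0) {
        /*
         * In the dump above, li_lsn (0x1f00001c63) compares below
         * xa_target (0x1f00003063), so this item qualifies.
         */
        xfsaild_push_item(ailp, lip);           /* illustrative */
        lip = xfs_trans_ail_cursor_next(ailp, &cur);
}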

2) The CIL is waiting around too:

crash> xfs_cil_ctx 0xffff880144d1a9c0
struct xfs_cil_ctx {
  space_used = 0x135f68,
  ...

struct log {
  l_logsize = 0xa00000,
  ...

XLOG_CIL_SPACE_LIMIT is l_logsize >> 3:  0xa00000 >> 3 = 0x140000

0x140000 - 0x135f68 = 0xa098, so space_used is just short of the limit.

Looks like xlog_cil_push_background will not push the CIL while space used is
less than XLOG_CIL_SPACE_LIMIT, so that's not going anywhere either.
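
For reference, a sketch of that check, paraphrased from
xlog_cil_push_background() in fs/xfs/xfs_log_cil.c (the push mechanics vary
by version):

static void
xlog_cil_push_background(
        struct log      *log)
{
        struct xfs_cil  *cil = log->l_cilp;

        /*
         * Don't do a background push if we haven't used up all the space
         * available yet.  Here space_used (0x135f68) is still below
         * XLOG_CIL_SPACE_LIMIT (0x140000), so we return without pushing.
         */
        if (cil->xc_ctx->space_used < XLOG_CIL_SPACE_LIMIT(log))
                return;

        /* otherwise an async background push of the CIL is kicked off */
        xlog_cil_push(log, 0);
}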

3) It may be unrelated to this bug, but we do have an unresolved race in the
log reservation code: between the point where xlog_space_left() samples the
grant heads and the point where the space is actually granted a bit later,
another reservation can slip in, so we may grant more space than intended.
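
A sketch of the window (illustrative only, not actual XFS code):

        thread A                                thread B
        --------                                --------
        free = xlog_space_left(log, head);
                                                free = xlog_space_left(log, head);

        /* both threads see the same free space */

        xlog_grant_add_space(log, head, A);
                                                xlog_grant_add_space(log, head, B);

If A + B > free, more space has been granted than the log actually had
available.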

If you can provide output of 'echo t > /proc/sysrq-trigger' it may be enough
information to determine if you're seeing the same problem we hit on Saturday.


Ben & Mark
