
Re: Still seeing hangs in xlog_grant_log_space

To: Dave Chinner <david@xxxxxxxxxxxxx>
Subject: Re: Still seeing hangs in xlog_grant_log_space
From: Mark Tinguely <tinguely@xxxxxxx>
Date: Mon, 11 Jun 2012 15:59:05 -0500
Cc: Peter Watkins <treestem@xxxxxxxxx>, Juerg Haefliger <juergh@xxxxxxxxx>, bpm@xxxxxxx, xfs@xxxxxxxxxxx
In-reply-to: <20120605235447.GF22848@dastard>
References: <CAH4wwdGWHSZoveLJMxu5pjr22NEEeW7oG8TS+snoM8RY=ZeRmg@xxxxxxxxxxxxxx> <CADLDEKsGtsw-rrSOE7gY4T81u+p41b34ixv0B7Dh07afJ73n2w@xxxxxxxxxxxxxx> <CAH4wwdFu7DEkHFZ5Bf7_PtLPsG0hUyUDoov03q=82R6t+QkERg@xxxxxxxxxxxxxx> <20120605235447.GF22848@dastard>
User-agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:9.0) Gecko/20120122 Thunderbird/9.0
On 06/05/12 18:54, Dave Chinner wrote:

>> Reading bug #922 I see your test case reproduces in recent kernels, so
>> there must be a newer problem also.
>
> Right, that's what we need to find - it appears to be a CIL
> stall/accounting leak, completely unrelated to all the other AIL/log
> space stalls that have been occurring. Last thing is that I was
> waiting for more information on the stall that mark T @ sgi was able
> to reproduce. I haven't heard anything from him since I asked for
> more information on May 23....




I am using the test instructions/programs from the above bug report, with:

 1) Linux 3.5rc1
 2) temporary band-aid of performing an xfs_log_force() before the
    xfs_fs_log_dummy() in the xfs_sync_worker().
  a) Even with an xfs_log_force(), it is still possible to hang the sync
  b) or replacing the band-aid with Brian Foster's "xfs: check for stale
     inode before acquiring iflock on push" patch also resulted in a
     quick hard hang.
     i) side note, printk routines in Linux 3.5rc1 have a "struct log"
       item that crash wants to use instead of XFS's "struct log". I
 3) small log (576K)
  a) size of the log is important. The smaller the log, the easier it
     is to hang. 2+MB logs are much harder to hang.
 4) perl program that has multiple workers doing cp/rm.
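
For readers without the bug report at hand, the kind of workload in item 4 can be sketched roughly as below. This is not the perl program from the bug report; it is a minimal Python stand-in, and every name, file count and size in it (churn, make_source, run, 50 files of 4096 bytes) is made up for illustration. It also uses threads rather than separate worker processes, which may behave differently from the original.

```python
import os
import shutil
import tempfile
from concurrent.futures import ThreadPoolExecutor


def make_source(path, nfiles=50, size=4096):
    """Create a small tree of files to copy (counts and sizes are arbitrary)."""
    os.makedirs(path, exist_ok=True)
    for n in range(nfiles):
        with open(os.path.join(path, f"f{n}"), "wb") as fh:
            fh.write(b"x" * size)


def churn(src, workdir, worker_id, rounds):
    """One worker: repeatedly copy the source tree, then remove the copy."""
    done = 0
    for i in range(rounds):
        dst = os.path.join(workdir, f"w{worker_id}-r{i}")
        shutil.copytree(src, dst)   # the "cp" half
        shutil.rmtree(dst)          # the "rm" half
        done += 1
    return done


def run(workdir, workers=8, rounds=100):
    """Run `workers` concurrent cp/rm loops in `workdir`; return cycles done."""
    src = os.path.join(workdir, "src")
    make_source(src)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(churn, src, workdir, w, rounds)
                   for w in range(workers)]
        return sum(f.result() for f in futures)


if __name__ == "__main__":
    # Assumption: in a real test, replace the temporary directory with a
    # directory on the small-log XFS filesystem under test.
    with tempfile.TemporaryDirectory() as d:
        print(run(d, workers=4, rounds=10))
```

Each cp/rm cycle creates and then frees many inodes, so several workers running at once generate a steady stream of metadata transactions against the log.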

Sorry Dave, I did not realize you were waiting for more information from me. I thought fixing the sync worker was more important.
I was also hoping the empty AIL hang was a result of the band-aid
xfs_log_force() and not a second problem.

I will use the above to try to recreate and core the hang on Linux 3.5rc1 where the AIL is empty.


--Mark Tinguely.
