Crash with 3.8.3 and TuxOnIce

Pedro Ribeiro pedrib at gmail.com
Wed Mar 27 16:58:43 CDT 2013


Hi Dave (and others),

I've pretty much established which commit is responsible:
437a255aa23766666aec78af63be4c253faa8d57
(http://git.kernel.org/cgit/linux/kernel/git/stable/stable-queue.git/tree/releases/3.7.2/xfs-fix-direct-io-nested-transaction-deadlock.patch?id=HEAD).

Without this patch, the computer does not lock up when hibernating. So I
understand that this is most likely a bug in ToI, not in XFS. Does this
give you a better idea of how to solve the problem? The only XFS-specific
patch in ToI is below:

diff --git a/fs/xfs/xfs_trans_ail.c b/fs/xfs/xfs_trans_ail.c
index 0eda725..55de808 100644
--- a/fs/xfs/xfs_trans_ail.c
+++ b/fs/xfs/xfs_trans_ail.c
@@ -511,6 +511,7 @@ xfsaild(
  struct xfs_ail *ailp = data;
  long tout = 0; /* milliseconds */

+ set_freezable();
  current->flags |= PF_MEMALLOC;

  while (!kthread_should_stop()) {

Looking at the code blindly, it appears to be similar to what goes on in
other filesystems...

Regards,
Pedro


On 21 March 2013 17:45, Pedro Ribeiro <pedrib at gmail.com> wrote:

>
> On 21 March 2013 01:01, Dave Chinner <david at fromorbit.com> wrote:
>
>> On Wed, Mar 20, 2013 at 06:01:35PM +0000, Pedro Ribeiro wrote:
>> > Thanks for the answer Dave.
>> >
>> > Yes I would definitely say it's a ToI bug that perhaps has been dormant so
>> > far. Unfortunately the ToI developer is very busy at the moment, so I will
>> > have to debug and fix it myself.
>> > This problem did not occur with 3.7 and the ToI code did not change.
>> >
>> > Do you have any idea where I can start looking for the XFS change in 3.8
>> > that triggered this behaviour in ToI? Or maybe it was a VFS change?
>>
>> It's almost certainly an XFS change that triggered it, but it
>> indicates (once again) that the hibernate code is simply not
>> quiescing filesystems properly (i.e. by freezing them). The work
>> that caused this problem is stopped by the filesystem when it
>> is frozen, and started again when it is thawed...
>>
>> > PS: the email definitely bounced back, most likely because imageshack is
>> > blocked on the sgi server:
>> >
>> > Technical details of permanent failure:
>> > Google tried to deliver your message, but it was rejected by the server for
>> > the recipient domain oss.sgi.com by cuda-allmx.sgi.com. [192.48.176.16].
>> >
>> > The error that the other server returned was:
>> > 554 rejecting banned content
>>
>> IOWs, a stupid spam filter.
>>
>> I'll see if I can get this fixed.
>>
>> Cheers,
>>
>> Dave.
>> --
>> Dave Chinner
>> david at fromorbit.com
>>
>
> Actually I've nailed it down to a commit between 3.7.1 and 3.7.10. I'll do
> some git bisection and come back with the results.
>
> Regarding ToI and filesystem freezing, I guess I need to start delving
> into the code to see if I can fix it - long but fun journey ahead I guess.
>
> Regards,
> Pedro
>
>