xfs deadlock in stable kernel 3.0.4

Stefan Priebe - Profihost AG s.priebe at profihost.ag
Wed Sep 14 02:26:18 CDT 2011


Hi,

On 13.09.2011 22:50, Christoph Hellwig wrote:
> On Tue, Sep 13, 2011 at 08:04:36AM +0200, Stefan Priebe - Profihost AG wrote:
>> I just reported it to the SCSI list as I didn't know where the
>> problem is. But then some people told me it must be an XFS problem.
>>
>> Some more information:
>> 1.) It runs fine with 2.6.32 and 2.6.38
>> 2.) I can also write to another ext2 partition on the same disk
>> array (aacraid driver) while XFS is stuck, so I think it must be an
>> XFS problem
>
> That points a bit more towards XFS, although we've seen storage setups
> create issues depending on the exact workload.  The prime culprit for
> that used to be the md software RAID driver, though.
>
>> 3.) I've also tried running 3.1-rc5 but then I'm seeing this error:
>>
>> BUG: unable to handle kernel NULL pointer dereference at 000000000000012c
>> IP: [] inode_dio_done+0x4/0x25
>
> Oops, that's a bug that I actually introduced myself.  Fix below:

Thanks for the patch.

Now we have the following situation:

1.) Systems run fine with 2.6.32, 2.6.38 and with 3.1-rc6 + patch.
2.) Sadly, with 3.0.4 it does not run for more than 1 hour. 3.0.X
will become the next long-term stable kernel, so a lot of people will
be using it.
3.) I have seen this deadlock on systems with aacraid and with Intel
AHCI onboard controllers (that's all we're using).
4.) I can still write to other devices / RAIDs on the same controller
while the XFS root filesystem hangs.

What can we do or try next?

Stefan



