xfs deadlock in stable kernel 3.0.4
Stefan Priebe - Profihost AG
s.priebe at profihost.ag
Wed Sep 14 02:26:18 CDT 2011
On 13.09.2011 22:50, Christoph Hellwig wrote:
> On Tue, Sep 13, 2011 at 08:04:36AM +0200, Stefan Priebe - Profihost AG wrote:
>> I just reported it to the SCSI list as I didn't know where the
>> problem is. But then some people told me it must be an XFS problem.
>> Some more information:
>> 1.) It runs fine with 2.6.32 and 2.6.38
>> 2.) I can also write to another ext2 partition on the same disk
>> array (aacraid driver) while XFS is stuck, so I think it must be an
>> XFS problem
> That points a bit more towards XFS, although we've seen storage setups
> create issues depending on the exact workload. The prime culprit for
> that used to be the md software RAID driver, though.
>> 3.) I've also tried running 3.1-rc5, but then I'm seeing this error:
>> BUG: unable to handle kernel NULL pointer dereference at 000000000000012c
>> IP:  inode_dio_done+0x4/0x25
> Oops, that's a bug that I actually introduced myself. Fix below:
Thanks for the patch.
Now we have the following situation:
1.) Systems run fine with 2.6.32, 2.6.38, and 3.1-rc6 + patch
2.) Sadly, 3.0.4 does not run for more than 1 hour. And 3.0.x
will become the next long-term stable kernel, so a lot of people will be using it.
3.) I have seen this deadlock on systems with aacraid and with Intel
AHCI onboard controllers. (That's all we're using.)
4.) I can still write to other devices / RAIDs on the same controller while
the XFS root filesystem hangs.
What can we do or try next?