On Fri, Jun 22, 2012 at 09:24:14AM +1000, Dave Chinner wrote:
> It may have been - I didn't catch the initial cause of the problem
> in my log because it hard-hung the VM and it wasn't in the
> scrollback buffer on the console. All I saw was a corruption error,
> a shutdown and the stack blowing up.
> Still, I think there is a real problem here - any persistent device
> error on IO submission can cause this problem to occur....

Yes, I was just trying to ask what actually happened as your original
explanation didn't seem to be possible.

I think the patch below should be enough as a minimal fix to avoid the
stack overflow for 3.5. We'll need a much bigger overhaul of the buffer
error handling after that, though.

--- xfs.orig/fs/xfs/xfs_buf.c 2012-06-22 14:20:46.696568355 +0200
+++ xfs/fs/xfs/xfs_buf.c 2012-06-22 14:21:37.733234717 +0200
@@ -1255,7 +1255,7 @@ xfs_buf_iorequest(
- _xfs_buf_ioend(bp, 0);
+ _xfs_buf_ioend(bp, 1);
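
To illustrate why flipping that argument should bound the stack depth, here
is a rough userspace sketch. It is a toy model, not the XFS code: every
toy_* name and the "pending" flag are hypothetical, and it assumes that the
second argument asks for the completion to be deferred to a worker instead
of being run in the submitter's context. With a persistent submission
error, the synchronous path nests one level per retry, while the deferred
path unwinds before the completion runs.

```c
/* Toy model only: all names here are hypothetical, not kernel symbols. */
struct toy_buf {
	int retries_left;	/* how often the completion resubmits */
	int pending;		/* a deferred completion is queued */
	int depth;		/* current nesting of toy_iorequest() */
	int max_depth;		/* deepest nesting seen so far */
};

static void toy_iorequest(struct toy_buf *bp, int schedule);

/* Completion handler: on a persistent error it resubmits the buffer. */
static void toy_iodone(struct toy_buf *bp, int schedule)
{
	if (bp->retries_left > 0) {
		bp->retries_left--;
		toy_iorequest(bp, schedule);
	}
}

/* schedule == 0: complete right here; schedule != 0: leave it for a worker. */
static void toy_ioend(struct toy_buf *bp, int schedule)
{
	if (schedule)
		bp->pending = 1;
	else
		toy_iodone(bp, schedule);
}

/* Submission always fails in this model, so it goes straight to ioend. */
static void toy_iorequest(struct toy_buf *bp, int schedule)
{
	bp->depth++;
	if (bp->depth > bp->max_depth)
		bp->max_depth = bp->depth;
	toy_ioend(bp, schedule);
	bp->depth--;
}

/* Submit once, then drain deferred completions from a shallow stack. */
static int toy_run(int schedule, int retries)
{
	struct toy_buf bp = { .retries_left = retries };

	toy_iorequest(&bp, schedule);
	while (bp.pending) {
		bp.pending = 0;
		toy_iodone(&bp, schedule);
	}
	return bp.max_depth;
}
```

In this model toy_run(0, 100) reports a nesting depth of 101, while
toy_run(1, 100) never goes deeper than 1. The retries still happen in the
deferred case, which is why the error handling itself still needs the
bigger overhaul.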