Kernel crashes with trace ending in XFS code on RHEL6 variant kernel

Jan Kokoska jan at glow.cz
Wed Oct 29 05:00:02 CDT 2014


Hi Eric,

On 28 October 2014 18:42, Eric Sandeen <sandeen at sandeen.net> wrote:

> On 10/28/14 10:38 AM, Jan Kokoska wrote:
> > Hi,
> >
> > I'm running OpenVZ (OS container) kernel variant of RHEL6 kernel on
>
> ... for which we have no source code? ;)
>

Right, I'm sorry. The source code patch against the vanilla kernel is linked from
http://openvz.org/Download/kernel/rhel6/042stab084.20
and
http://openvz.org/Download/kernel/rhel6/042stab092.3
for the two kernel versions.

xfs_aops.c differs a bit between the older and the newer version (released
6 months apart), but both kernels crash.

1031,1032c1031,1034
<  * Just skip the page if it is fully outside i_size, e.g. due
<  * to a truncate operation that is in progress.
---
>  * Skip the page if it is fully outside i_size, e.g. due to a
>  * truncate operation that is in progress. We must redirty the
>  * page so that reclaim stops reclaiming it. Otherwise
>  * xfs_vm_releasepage() is called on it and gets confused.
1034,1037c1036,1037
< if (page->index >= end_index + 1 || offset_into_page == 0) {
< 	unlock_page(page);
< 	return 0;
< }
---
> if (page->index >= end_index + 1 || offset_into_page == 0)
> 	goto redirty;
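
For anyone reading the hunk without the surrounding function, here is a minimal user-space sketch of the test that both versions share: deciding whether a page lies entirely at or beyond i_size. It is not the kernel code. The 4 KiB page size, the SKETCH_* names and the function name page_fully_outside_isize() are assumptions made for illustration, and the early return for pages below end_index is my assumption about the enclosing check, which the diff does not show.

```c
#include <stdint.h>

/* Assumption: 4 KiB pages; the real value depends on the architecture. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (1ULL << SKETCH_PAGE_SHIFT)

/*
 * Hypothetical helper, not a kernel function.
 * Returns 1 if the page at page_index holds no bytes below i_size,
 * 0 if at least part of the page is inside the file.
 */
static int page_fully_outside_isize(uint64_t page_index, uint64_t i_size)
{
	/* Index of the page that contains the byte at offset i_size. */
	uint64_t end_index = i_size >> SKETCH_PAGE_SHIFT;
	/* How far into that page the file extends (0 if page-aligned). */
	uint64_t offset_into_page = i_size & (SKETCH_PAGE_SIZE - 1);

	/* Assumed outer guard: pages before end_index are fully inside. */
	if (page_index < end_index)
		return 0;

	/* Same condition as in the diff above. */
	return page_index >= end_index + 1 || offset_into_page == 0;
}
```

With an i_size of 8192 the file ends exactly on a page boundary, so page 2 counts as fully outside; with an i_size of 8193 page 2 straddles EOF and only page 3 onward is outside. The two kernel versions differ only in what they do once this test is true: the older one unlocks the page and returns 0, the newer one jumps to redirty.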

OpenVZ devs unfortunately don't publish their git tree anymore.


> I don't know what's in "2.6.32-openvz-amd64" so can't help much.
>
> What is at line 86 of xfs_aops.c in that kernel?
>

Stefan is right in that it's the line
bh = head = page_buffers(page);
from
xfs_count_page_state()

Eric, thanks for the pointer to ef5d437f71afdf4afdbab99213add99f4b1318fd,
I'll raise it with OpenVZ devs or with RHEL so the fix trickles downstream
to OpenVZ. I simply didn't know how much difference there may be between
the XFS parts of the kernel trees that you maintain and those in e.g. RHEL,
and thought it could be a generally occurring bug. I also wanted to get in
touch with the mailing list as I've been using XFS mostly happily for a
decade.

Jan


>
> -Eric
>
> > several amd64 machines by different manufacturers (HP and Supermicro)
> > and different RAID cards (HP and Areca).
> >
> > I've started seeing kernel crashes in October, as per the netconsole
> > logs attached, on two of the machines (one HP, one Supermicro). The
> > traces look quite similar, the machine in question cannot write
> > anything to its own filesystem when this happens so the logs are made
> > over the network. The XFS filesystem is not root (that's ext4), but
> > one for data (OS containers), on both machines. When I run xfs_check
> > and xfs_repair on the filesystem after the kernel crash & reboot, no
> > issue is ever found.
> >
> > This may very well have nothing to do with XFS kernel code you wrote
> > and maintain, but in that case, could you, from looking at the traces,
> > tell me whether it maybe looks like some issue related to
> > vm/paging just ending up in an XFS-related code path?
> >
> > I'm happy to test any suggestions/fixes for this if it is XFS related.
> >
> > Thank you,
> > --
> > Jan Kokoska
> > Glow Internet s.r.o.
> >
> >
> > _______________________________________________
> > xfs mailing list
> > xfs at oss.sgi.com
> > http://oss.sgi.com/mailman/listinfo/xfs
> >
>



-- 
Best regards

Jan Kokoska
Glow Internet s.r.o.