xfs_repair segfault + debug info

To: <xfs@xxxxxxxxxxx>
Subject: xfs_repair segfault + debug info
From: Mike Grant <mggr@xxxxxxxxx>
Date: Fri, 29 May 2015 15:03:57 +0100
We recently had a 180TB XFS filesystem go down after following some
ill-considered advice from a Dell tech (re-onlining a possibly failed disk,
which one might think would be OK).  The data isn't irreplaceable, but
xfs_repair segfaults when trying to fix the filesystem, and I thought that
might be of interest here to help fix the segfault.  We're not expecting to
recover the data, though it would be nice.

Partial logs and backtraces from xfs_repair runs (using both the latest
CentOS 7 xfsprogs package and an xfs_repair built from git master),
along with copies of the core dumps and a metadump, are at:
 https://rsg.pml.ac.uk/shared_files/mggr/xfs_segfault

Memory use had peaked at only about 1GB by the time of the crash, and
there was 120GB+ of swap available, so I don't think memory was an issue.
The command was "xfs_repair -v /dev/md0 -t 60 -P".

Each run takes about 2 hours to reach the crash, and we'll probably want
to wipe and rebuild the server sometime next week.  Happy to run tests or
gather more info in the meantime though!

Please let me know if there's anything else that would be useful.

Cheers,

Mike.


