XFS filesystem on EC2 instance corrupts and shuts down
Shrinath M
shrinath.m at webyog.com
Wed Mar 13 20:28:19 CDT 2013
Thanks Ben, Dave and Eric.
Eric,
> but I am wondering if there might be more information before this which
> is not in your trimmed logs.
No, this is the first entry in /var/log/messages every time it happens.
dmesg shows the same. After a reboot, the filesystem simply recovers
without anyone doing anything.
The kernel we are running is definitely the Amazon-built one. It looks
like this:
$~: uname -a
Linux ip-100-0-100-1 3.2.34-55.46.amzn1.x86_64 #1 SMP Tue Nov 20 10:06:15 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
dmesg shows something like this after repairing/rebooting:
[ 8.414176] SGI XFS with ACLs, security attributes, realtime, large block/inode numbers, no debug enabled
[ 8.415342] SGI XFS Quota Management subsystem
[ 8.417664] XFS (md0): Mounting Filesystem
[ 8.771553] XFS (md0): Starting recovery (logdev: internal)
[ 9.977325] XFS (md0): Ending recovery (logdev: internal)
Check the first line there: it says "no debug enabled". How good or bad is
it to run with debug enabled in a production environment? We are not
seeing any corruption in our local/test environments, but in production we
are hitting it about once every three days.
Dave,
You say the unlinked inode list is corrupted, but if that is the case,
there should be an entry for it in /var/log/messages, right?
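
For checking the logs, here is a minimal Python sketch (not something from
this thread; the function name find_xfs_errors and the regular expression
are illustrative assumptions) that scans log lines for the XFS error format
quoted further down and decodes the error number. On Linux, errno 117 is
EUCLEAN, "Structure needs cleaning".

```python
import errno
import os
import re

# Matches lines such as:
#   XFS (md0): xfs_iunlink_remove: xfs_itobp() returned error 117.
# The pattern is an assumption based on that single sample line.
XFS_ERROR = re.compile(
    r"XFS \((?P<dev>[^)]+)\): (?P<func>\w+): "
    r"(?P<call>\S+) returned error (?P<code>\d+)"
)


def find_xfs_errors(lines):
    """Return one dict per log line that reports an XFS error code."""
    hits = []
    for line in lines:
        match = XFS_ERROR.search(line)
        if match is None:
            continue
        code = int(match.group("code"))
        hits.append({
            "device": match.group("dev"),
            "function": match.group("func"),
            "call": match.group("call"),
            "code": code,
            # Symbolic name and message as known to the local platform.
            "name": errno.errorcode.get(code, "unknown"),
            "text": os.strerror(code),
        })
    return hits


if __name__ == "__main__":
    sample = [
        "[ 8.417664] XFS (md0): Mounting Filesystem",
        "XFS (md0): xfs_iunlink_remove: xfs_itobp() returned error 117.",
    ]
    for hit in find_xfs_errors(sample):
        print(hit)
```

To run it against the real log, pass an open file instead of the sample
list, for example find_xfs_errors(open("/var/log/messages")).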
Anyway, how can we reproduce this situation? By forcing multiple processes
to write and delete files on a small disk? Since we still do not know what
is causing this issue, trying to reproduce it in a local or production
environment is just shooting in the dark... :(
Does turning up the error level affect the data in any way? Or is it *just*
more detailed logging that is also sensitive to small errors?
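
As far as I can tell, the error level being discussed is the
fs.xfs.error_level sysctl, which the kernel's XFS documentation describes as
a volume knob for error reporting. A minimal Python sketch for reading the
current value before changing anything follows; the helper name
read_error_level is an illustrative assumption, and the /proc path is
assumed to exist only when XFS is loaded.

```python
from pathlib import Path

# Assumed location of the fs.xfs.error_level sysctl on Linux.
ERROR_LEVEL_PATH = Path("/proc/sys/fs/xfs/error_level")


def read_error_level(path=ERROR_LEVEL_PATH):
    """Return the current XFS error level, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        # File missing (XFS not loaded), unreadable, or not a number.
        return None


if __name__ == "__main__":
    level = read_error_level()
    if level is None:
        print("fs.xfs.error_level is not available on this machine")
    else:
        print(f"fs.xfs.error_level = {level}")
```

This only reads the setting, so it is safe to run on a production box.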
I really appreciate the support you devs are giving, which really should be
the job of AWS support... I so wish they had some helpful and knowledgeable
people in support.
On Thu, Mar 14, 2013 at 5:12 AM, Dave Chinner <david at fromorbit.com> wrote:
> On Wed, Mar 13, 2013 at 01:56:35PM -0500, Eric Sandeen wrote:
> > XFS (md0): xfs_iunlink_remove: xfs_itobp() returned error 117.
>
> Corrupted unlinked inode list. You need to run xfs_repair to fix
> this.
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david at fromorbit.com
>
--
Regards
*Shrinath.M*