
Re: xfs corruption issue

To: Danny Shavit <danny@xxxxxxxxxxxxxxxxx>, xfs@xxxxxxxxxxx, Dave Chinner <david@xxxxxxxxxxxxx>
Subject: Re: xfs corruption issue
From: Eric Sandeen <sandeen@xxxxxxxxxxx>
Date: Wed, 01 Apr 2015 13:12:28 -0400
Cc: Lev Vainblat <lev@xxxxxxxxxxxxxxxxx>, Alex Lyakas <alex@xxxxxxxxxxxxxxxxx>
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <CAC=x_0iFLbJwbKCKEe7XTKexex29wvbVQDvuN=SO5j9gX=u4rw@xxxxxxxxxxxxxx>
References: <CAC=x_0iFLbJwbKCKEe7XTKexex29wvbVQDvuN=SO5j9gX=u4rw@xxxxxxxxxxxxxx>
On 4/1/15 10:09 AM, Danny Shavit wrote:
> Hello Dave,
> My name is Danny Shavit and I am with Zadara storage.
> We would appreciate your feedback regarding an XFS corruption and xfs_repair
> issue.
> 
> We found a corrupted xfs volume in one of our systems. It is around 1 TB in size
> and holds about 12 M files.
> We ran xfs_repair on the volume, which succeeded after 42 minutes.
> We noticed that memory consumption rose to about 7.5 GB.
> Since some customers are using only 4 GB (and sometimes even 2 GB), we tried
> running "xfs_repair -m 3200" on a machine with 4 GB of RAM.
> However, this time an OOM event happened while handling AG 26 during step
> 3.
> The log of xfs_repair is enclosed below.
> We would appreciate your feedback on the amount of memory needed for
> xfs_repair in general, and when using the "-m" option specifically.
> The xfs metadata dump (prior to xfs_repair) can be found here:
> https://zadarastorage-public.s3.amazonaws.com/xfs/xfsdump-prod-ebs_2015-03-30_23-00-38.tgz
> It is a 1.2 GB file (5.7 GB uncompressed).
> 
> We will appreciate your feedback on the corruption pattern as well.
> -- 
> Thank you,
> Danny Shavit
> Zadarastorage
> 
> ---------- xfs_repair log  ----------------

Just a note ...

> bad . entry in directory inode 5691013154, was 5691013170: correcting

101010011001101011111100000100010
101010011001101011111100000110010
                            ^ bit flip

> bad . entry in directory inode 5691013156, was 5691013172: correcting

101010011001101011111100000100100
101010011001101011111100000110100
                            ^ bit flip

etc ...

> bad . entry in directory inode 5691013157, was 5691013173: correcting
> bad . entry in directory inode 5691013163, was 5691013179: correcting
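
The same comparison can be scripted rather than done by eye. Below is a minimal Python sketch, assuming nothing beyond the inode pairs copied from the log lines quoted above; the helper name flipped_bits is made up for illustration and is not part of any xfs tool. It XORs each pair and reports which bit positions differ (bit 0 being the least significant).

```python
# (inode the "." entry should hold, value actually found), copied from the
# xfs_repair log lines quoted above.
PAIRS = [
    (5691013154, 5691013170),
    (5691013156, 5691013172),
    (5691013157, 5691013173),
    (5691013163, 5691013179),
]


def flipped_bits(expected, found):
    """Return the bit positions (0 = least significant) where the values differ."""
    diff = expected ^ found
    return [i for i in range(diff.bit_length()) if (diff >> i) & 1]


if __name__ == "__main__":
    for expected, found in PAIRS:
        # 33 binary digits are enough for these inode numbers.
        print(f"{expected:033b}")
        print(f"{found:033b}")
        print(f"differs at bit(s): {flipped_bits(expected, found)}")
        print()
```

For all four pairs quoted here the only differing position is bit 4 (a value of 16), i.e. the same single bit in each case.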
