XFS File system in trouble
Leslie Rhorer
lrhorer at mygrande.net
Tue Aug 4 02:52:33 CDT 2015
It's failing, again. The rsync job failed and when I attempt to untar
the file in the image mount, it fails there, as well. See below. I
formatted a 1.5T drive as xfs and mounted it under /media. I then
dumped the failing FS to a file on /media using xfs_metadump and used
xfs_mdrestore to create an image of the FS. I then mounted the image,
copied over the tarball to its location, and ran tar to extract the files:
RAID-Server:/# mount -o nouuid /media/md0.img /TEST
RAID-Server:/# cd "/TEST/Server-Main/Equipment/Drive
Controllers/HighPoint Adapters/Rocket 2722/Driver"/
RAID-Server:/TEST/Server-Main/Equipment/Drive Controllers/HighPoint
Adapters/Rocket 2722/Driver# cp "/RAID/Server-Main/Equipment/Drive
Controllers/HighPoint Adapters/Rocket 2722/Driver/RR_27xx.tar.gz" ./
RAID-Server:/TEST/Server-Main/Equipment/Drive Controllers/HighPoint
Adapters/Rocket 2722/Driver# tar -xzvf RR_27xx.tar.gz
DC7280/
DC7280/Linux/
DC7280/Linux/Opensource/
DC7280/Linux/Opensource/DC7280-linux-src-v1.0-110621-1313.tar.gz
DC7280/Windows/
DC7280/Windows/Vista-Win2008-Win7/
DC7280/Windows/Vista-Win2008-Win7/x32/
DC7280/Windows/Vista-Win2008-Win7/x32/dc7280.cat
DC7280/Windows/Vista-Win2008-Win7/x32/dc7280.inf
DC7280/Windows/Vista-Win2008-Win7/x32/dc7280.sys
DC7280/Windows/Vista-Win2008-Win7/x64/
DC7280/Windows/Vista-Win2008-Win7/x64/dc7280.cat
DC7280/Windows/Vista-Win2008-Win7/x64/dc7280.inf
DC7280/Windows/Vista-Win2008-Win7/x64/dc7280.sys
DC7280/Windows/Vista-Win2008-Win7/Readme.txt
DC7280/.ddinfo
R272x/
R272x/Linux/
R272x/Linux/Opensource/
R272x/Linux/Opensource/partial/
R272x/Linux/Opensource/partial/include/
...
RR274x/Driver/Linux/RHEL_CentOS/rr274x_3x-rhel_centos-4u8-i386/pcitable
RR274x/Driver/Linux/RHEL_CentOS/rr274x_3x-rhel_centos-4u8-i386/readme.txt
RR274x/Driver/Linux/RHEL_CentOS/rr274x_3x-rhel_centos-4u8-i386/rhdd
RR274x/Driver/Linux/RHEL_CentOS/rr274x_3x-rhel_centos-4u8-i386/rhel-install-step1.sh
RR274x/Driver/Linux/RHEL_CentOS/rr274x_3x-rhel_centos-4u8-i386/rhel-install-step2.sh
RR274x/Driver/Linux/RHEL_CentOS/rr274x_3x-rhel_centos-4u8-x86_64/
tar: RR274x/Driver/Linux/RHEL_CentOS/rr274x_3x-rhel_centos-4u8-x86_64:
Cannot mkdir: Structure needs cleaning
RR274x/Driver/Linux/RHEL_CentOS/rr274x_3x-rhel_centos-4u8-x86_64/install.sh
tar: RR274x/Driver/Linux/RHEL_CentOS/rr274x_3x-rhel_centos-4u8-x86_64:
Cannot mkdir: Input/output error
tar:
RR274x/Driver/Linux/RHEL_CentOS/rr274x_3x-rhel_centos-4u8-x86_64/install.sh:
Cannot open: No such file or directory
RR274x/Driver/Linux/RHEL_CentOS/rr274x_3x-rhel_centos-4u8-x86_64/installmethod.py
tar: RR274x/Driver/Linux/RHEL_CentOS/rr274x_3x-rhel_centos-4u8-x86_64:
Cannot mkdir: Input/output error
tar:
RR274x/Driver/Linux/RHEL_CentOS/rr274x_3x-rhel_centos-4u8-x86_64/installmethod.py:
Cannot open: No such file or directory
RR274x/Driver/Linux/RHEL_CentOS/rr274x_3x-rhel_centos-4u8-x86_64/modinfo
tar: RR274x/Driver/Linux/RHEL_CentOS/rr274x_3x-rhel_centos-4u8-x86_64:
Cannot mkdir: Input/output error
tar:
RR274x/Driver/Linux/RHEL_CentOS/rr274x_3x-rhel_centos-4u8-x86_64/modinfo: Cannot
open: No such file or directory
RR274x/Driver/Linux/RHEL_CentOS/rr274x_3x-rhel_centos-4u8-x86_64/modules.alias
tar: RR274x/Driver/Linux/RHEL_CentOS/rr274x_3x-rhel_centos-4u8-x86_64:
Cannot mkdir: Input/output error
tar:
RR274x/Driver/Linux/RHEL_CentOS/rr274x_3x-rhel_centos-4u8-x86_64/modules.alias:
Cannot open: No such file or directory
RR274x/Driver/Linux/RHEL_CentOS/rr274x_3x-rhel_centos-4u8-x86_64/modules.cgz
tar: RR274x/Driver/Linux/RHEL_CentOS/rr274x_3x-rhel_centos-4u8-x86_64:
Cannot mkdir: Input/output error
gzip: stdin: Input/output error
tar: Unexpected EOF in archive
tar: RR274x/Driver/Linux/RHEL_CentOS: Cannot utime: Input/output error
tar: RR274x/Driver/Linux/RHEL_CentOS: Cannot change ownership to uid 0,
gid 1000: Input/output error
tar: RR274x/Driver/Linux/RHEL_CentOS: Cannot change mode to rwxr-xr-x:
Input/output error
tar: RR274x/Driver/Linux: Cannot utime: Input/output error
tar: RR274x/Driver/Linux: Cannot change ownership to uid 0, gid 1000:
Input/output error
tar: RR274x/Driver/Linux: Cannot change mode to rwxr-xr-x: Input/output
error
tar: RR274x/Driver: Cannot utime: Input/output error
tar: RR274x/Driver: Cannot change ownership to uid 0, gid 1000:
Input/output error
tar: RR274x/Driver: Cannot change mode to rwxr-xr-x: Input/output error
tar: RR274x: Cannot utime: Input/output error
tar: RR274x: Cannot change ownership to uid 0, gid 1000: Input/output error
tar: RR274x: Cannot change mode to rwxr-xr-x: Input/output error
tar: Error is not recoverable: exiting now
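
A side note on the two error strings in the tar output above. On Linux, "Structure needs cleaning" is the text for EUCLEAN and "Input/output error" is the text for EIO. The sketch below only looks those codes up in Python's standard errno table. It assumes a Linux host (EUCLEAN does not exist everywhere, so it is looked up defensively), and the exact message text depends on the C library.

```python
import errno
import os

# EIO is standard; EUCLEAN is Linux-specific, so look it up defensively.
codes = {"EIO": errno.EIO}
if hasattr(errno, "EUCLEAN"):
    codes["EUCLEAN"] = errno.EUCLEAN

# Print each symbolic name, its number, and the C library's message for it.
for name, number in codes.items():
    print(f"{name} = {number}: {os.strerror(number)}")
```

If EIO comes back as 5, that would presumably be the same "error 5" that xfs_log_force reports in the dmesg output.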
dmesg:
[131329.013475] XFS (md0): Mounting V4 Filesystem
[131329.918438] XFS (md0): Ending clean mount
[131499.357099] XFS (md0): Mounting V4 Filesystem
[131499.709248] XFS (md0): Ending clean mount
[131874.545344] loop: module loaded
[131874.549914] XFS (loop0): Mounting V4 Filesystem
[131874.555540] XFS (loop0): Ending clean mount
[132020.964431] XFS (loop0): xfs_iread: validation failed for inode
124656869424 failed
[132020.964435] ffff88028b078000: 49 4e 00 00 03 02 00 00 00 30 00 70 00
00 03 e8 IN.......0.p....
[132020.964437] ffff88028b078010: 00 00 00 00 06 20 b0 6f 01 2e 00 00 00
00 00 16 ..... .o........
[132020.964438] ffff88028b078020: 01 57 37 fd 2b 5d 22 9e 1e 0a 61 8c 00
00 00 20 .W7.+]"...a....
[132020.964440] ffff88028b078030: ff ff 00 d2 1b f6 27 90 00 00 00 00 00
00 00 00 ......'.........
[132020.964454] XFS (loop0): Internal error xfs_iread at line 392 of
file /build/linux-QZaPpC/linux-3.16.7-ckt11/fs/xfs/xfs_inode_buf.c.
Caller xfs_iget+0x24b/0x690 [xfs]
[132020.964457] CPU: 2 PID: 21474 Comm: tar Not tainted 3.16.0-4-amd64
#1 Debian 3.16.7-ckt11-1
[132020.964459] Hardware name: To be filled by O.E.M. To be filled by
O.E.M./SABERTOOTH 990FX R2.0, BIOS 1503 01/11/2013
[132020.964460] 0000000000000001 ffffffff8150b405 ffff880424059800
ffffffffa09115cb
[132020.964463] 0000018800000010 ffffffffa0916f6b ffff88030f5c6c00
ffff880424059800
[132020.964465] 0000000000000075 ffff8800ad1afe98 ffffffffa095cb3a
ffffffffa0916f6b
[132020.964467] Call Trace:
[132020.964471] [<ffffffff8150b405>] ? dump_stack+0x41/0x51
[132020.964478] [<ffffffffa09115cb>] ? xfs_corruption_error+0x5b/0x80 [xfs]
[132020.964483] [<ffffffffa0916f6b>] ? xfs_iget+0x24b/0x690 [xfs]
[132020.964492] [<ffffffffa095cb3a>] ? xfs_iread+0xea/0x400 [xfs]
[132020.964497] [<ffffffffa0916f6b>] ? xfs_iget+0x24b/0x690 [xfs]
[132020.964503] [<ffffffffa0916f6b>] ? xfs_iget+0x24b/0x690 [xfs]
[132020.964511] [<ffffffffa0956de6>] ? xfs_ialloc+0xa6/0x500 [xfs]
[132020.964517] [<ffffffffa092658e>] ? kmem_zone_alloc+0x6e/0xe0 [xfs]
[132020.964525] [<ffffffffa09572a2>] ? xfs_dir_ialloc+0x62/0x2a0 [xfs]
[132020.964531] [<ffffffffa09251e5>] ? xfs_trans_reserve+0x1f5/0x200 [xfs]
[132020.964538] [<ffffffffa09579a9>] ? xfs_create+0x489/0x700 [xfs]
[132020.964541] [<ffffffff811b40ea>] ? kern_path_create+0xaa/0x190
[132020.964548] [<ffffffffa091c5ea>] ? xfs_generic_create+0xca/0x250 [xfs]
[132020.964550] [<ffffffff811b7ad0>] ? vfs_mkdir+0xb0/0x160
[132020.964551] [<ffffffff811b868b>] ? SyS_mkdirat+0xab/0xe0
[132020.964554] [<ffffffff815115cd>] ?
system_call_fast_compare_end+0x10/0x15
[132020.964555] XFS (loop0): Corruption detected. Unmount and run xfs_repair
[132020.964564] XFS (loop0): Internal error xfs_trans_cancel at line 959
of file /build/linux-QZaPpC/linux-3.16.7-ckt11/fs/xfs/xfs_trans.c.
Caller xfs_create+0x2b2/0x700 [xfs]
[132020.964566] CPU: 2 PID: 21474 Comm: tar Not tainted 3.16.0-4-amd64
#1 Debian 3.16.7-ckt11-1
[132020.964567] Hardware name: To be filled by O.E.M. To be filled by
O.E.M./SABERTOOTH 990FX R2.0, BIOS 1503 01/11/2013
[132020.964568] 000000000000000c ffffffff8150b405 ffff8800ad1afe98
ffffffffa0925e07
[132020.964570] ffff880002530800 ffff880079e03ec8 ffff880424059800
ffffffffa09577d2
[132020.964571] 0000000000000001 ffff880079e03e20 ffff880079e03e1c
ffff880079e03eb0
[132020.964573] Call Trace:
[132020.964575] [<ffffffff8150b405>] ? dump_stack+0x41/0x51
[132020.964581] [<ffffffffa0925e07>] ? xfs_trans_cancel+0xc7/0xf0 [xfs]
[132020.964588] [<ffffffffa09577d2>] ? xfs_create+0x2b2/0x700 [xfs]
[132020.964590] [<ffffffff811b40ea>] ? kern_path_create+0xaa/0x190
[132020.964596] [<ffffffffa091c5ea>] ? xfs_generic_create+0xca/0x250 [xfs]
[132020.964598] [<ffffffff811b7ad0>] ? vfs_mkdir+0xb0/0x160
[132020.964600] [<ffffffff811b868b>] ? SyS_mkdirat+0xab/0xe0
[132020.964602] [<ffffffff815115cd>] ?
system_call_fast_compare_end+0x10/0x15
[132020.964604] XFS (loop0): xfs_do_force_shutdown(0x8) called from line
960 of file /build/linux-QZaPpC/linux-3.16.7-ckt11/fs/xfs/xfs_trans.c.
Return address = 0xffffffffa0925e20
[132021.196487] XFS (loop0): Corruption of in-memory data detected.
Shutting down filesystem
[132021.196491] XFS (loop0): Please umount the filesystem and rectify
the problem(s)
[132024.791456] XFS (loop0): xfs_log_force: error 5 returned.
[132054.854625] XFS (loop0): xfs_log_force: error 5 returned.
[132084.917775] XFS (loop0): xfs_log_force: error 5 returned.
[132114.980927] XFS (loop0): xfs_log_force: error 5 returned.
[132145.044086] XFS (loop0): xfs_log_force: error 5 returned.
[132175.107307] XFS (loop0): xfs_log_force: error 5 returned.
[132205.170404] XFS (loop0): xfs_log_force: error 5 returned.
[132235.233587] XFS (loop0): xfs_log_force: error 5 returned.
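
The first 16 bytes of the inode buffer dumped above can be decoded by hand. This is a minimal sketch, assuming the usual big-endian XFS on-disk inode core layout (magic, mode, version, format, onlink, uid, gid). The field names are labels chosen for illustration, not output from any XFS tool, and the layout should be checked against the XFS on-disk format documentation.

```python
import struct

# First 16 bytes of the inode buffer, copied from the dmesg hex dump.
raw = bytes.fromhex("49 4e 00 00 03 02 00 00 00 30 00 70 00 00 03 e8")

# Assumed layout (big-endian): magic u16, mode u16, version u8, format u8,
# onlink u16, uid u32, gid u32. Field names are illustrative labels.
FIELDS = ("magic", "mode", "version", "format", "onlink", "uid", "gid")
inode_head = dict(zip(FIELDS, struct.unpack(">HHBBHII", raw)))

for name, value in inode_head.items():
    print(f"{name:8s} = {value:#x}")
```

Under that assumed layout the magic is the expected "IN", the mode is 0, the gid is 1000, and the version byte is 3, while dmesg reports a V4 filesystem. Whether the version byte is what trips the validation is a guess; the log does not say so.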
On 8/2/2015 3:24 PM, Leslie Rhorer wrote:
>
> OK, this is goofy. It seems to be working, now. As usual, I've
> been doing some work on the server this weekend, but I can't think of
> anything I have done that would fix the issue. I did replace the
> remaining good 4G RAM module with a pair of 8G RAM modules, but memtest
> reported the remaining 4G module as good, and I verified the removed
> module really was bad. I also replaced the removable drive carrier and
> cables that were feeding the two SSDs, one of which was reporting
> failures as noted in the syslog. It's hard for me to believe either of
> those things could have been causing the issue, though.
>
> I attached a 1.5T external drive to the server and formatted it as
> XFS in preparation to continue troubleshooting. To make sure of things,
> I tried decompressing the tarball, again, and this time it worked all
> the way to the end. I then deleted the entire directory structure
> created by the tarball and decompressed the file again twice. I'll see
> if the rsync process works. That will take a couple of days.
>
> On 7/28/2015 5:11 PM, Brian Foster wrote:
>> On Tue, Jul 28, 2015 at 10:13:01AM -0500, Leslie Rhorer wrote:
>>> On 7/28/2015 7:33 AM, Brian Foster wrote:
>>>> On Tue, Jul 28, 2015 at 02:46:45AM -0500, Leslie Rhorer wrote:
>>>>> On 7/20/2015 6:17 AM, Brian Foster wrote:
>>>>>> On Sat, Jul 18, 2015 at 08:02:50PM -0500, Leslie Rhorer wrote:
>>>>>>>
>> ...
>>>>
>>>>> I then copied both the tarball and the image over to the root,
>>>>> and while
>>>>> the system would not let me create the image on the root, it did
>>>>> let me copy
>>>>> the image to the root. I then umounted the RAID array, mounted the
>>>>> image,
>>>>> and attempted to cd to the original directory in the image mount
>>>>> where the
>>>>> tarball was saved. That failed with an I/O error:
>>>>>
>>>>
>>>> It sounds a bit strange for the mdrestore to fail on root but a cp of
>>>> the resulting image to work. Do the resulting images have the same file
>>>> size or is the rootfs copy truncated? If the latter, you could be
>>>> missing part of the fs and thus any of the following tests are probably
>>>> moot.
>>>
>>> Well, it can't be as large as it is reported, let's put it that way,
>>> although the reported file size is the same. Ls claims it to be 16T in
>>> size, which cannot be the case on a 100G partition. I forgot to
>>> mention cp
>>> does complain:
>>>
>>> RAID-Server:/# cp /RAID/TEST/RAIDfile.img ./
>>> cp: cannot lseek ‘./RAIDfile.img’: Invalid argument
>>>
>>> But it does the same thing on the backup server, and it works
>>> there. I
>>> tried a cmp, and it seems to be hung. It just may be taking a long
>>> time,
>>> however.
>>>
>>
>> Yeah, you can't really trust the resulting image. It doesn't take much
>> space to create a very large sparse file, but different filesystems have
>> different maximum file size limits. The problem here is that some
>> metadata near the beginning of the file might reference or depend on
>> something near the end, and I/Os beyond the end of the file will
>> probably result in errors.
>>
>> I'd probably try the nouuid approach since the hardware is similar as
>> well as some of the other interesting suggestions that have been made to
>> try and get the image on the rootfs and see what happens there too.
>>
>> Brian
>>
>>>> Brian
>>>>
>>>>> RAID-Server:/# cd "/media/Server-Main/Equipment/Drive
>>>>> Controllers/HighPoint
>>>>> Adapters/Rocket 2722/Driver/"
>>>>> bash: cd: /media/Server-Main/Equipment/Drive Controllers/HighPoint
>>>>> Adapters/Rocket 2722/Driver/: Input/output error
>>>>>
>>>>> I changed directories to a point two directories above the
>>>>> previous attempt
>>>>> and did a long listing:
>>>>>
>>>>> RAID-Server:/# cd "/media/Server-Main/Equipment/Drive
>>>>> Controllers/HighPoint
>>>>> Adapters"
>>>>> RAID-Server:/media/Server-Main/Equipment/Drive Controllers/HighPoint
>>>>> Adapters# ll
>>>>> ls: cannot access RocketRAID 2722: Input/output error
>>>>> total 4
>>>>> drwxr-xr-x 6 root lrhorer 4096 Jul 18 19:26 Rocket 2722
>>>>> ?????????? ? ? ? ? ? RocketRAID 2722
>>>>>
>>>>> As you can see, Rocket 2722 is still there, but RocketRAID 2722
>>>>> is very
>>>>> sick. Rocket 2722 is the parent of where the tarball was, however,
>>>>> so I did
>>>>> a cd and an ll again:
>>>>>
>>>>> RAID-Server:/media/Server-Main/Equipment/Drive Controllers/HighPoint
>>>>> Adapters# cd "Rocket 2722"/
>>>>> RAID-Server:/media/Server-Main/Equipment/Drive Controllers/HighPoint
>>>>> Adapters/Rocket 2722# ll
>>>>> ls: cannot access BIOS: Input/output error
>>>>> ls: cannot access Driver: Input/output error
>>>>> ls: cannot access HighPoint RAID Management Software: Input/output
>>>>> error
>>>>> ls: cannot access Manual: Input/output error
>>>>> total 248
>>>>> -rwxr--r-- 1 root lrhorer 245760 Nov 20 2008 autorun.exe
>>>>> -rwxr--r-- 1 root lrhorer 51 Mar 21 2001 autorun.inf
>>>>> ?????????? ? ? ? ? ? BIOS
>>>>> ?????????? ? ? ? ? ? Driver
>>>>> ?????????? ? ? ? ? ? HighPoint RAID
>>>>> Management
>>>>> Software
>>>>> ?????????? ? ? ? ? ? Manual
>>>>> -rwxr--r-- 1 root lrhorer 1134 Feb 5 2012 readme.txt
>>>>>
>>>>> So now, what?
>>>>>
>>>>> _______________________________________________
>>>>> xfs mailing list
>>>>> xfs at oss.sgi.com
>>>>> http://oss.sgi.com/mailman/listinfo/xfs
>>>>
>>>
>>
>
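
On the quoted point about sparse images: the size ls reports is the apparent size, which can be far larger than the space actually allocated. The sketch below compares the two for a throwaway file. The file name demo.img and the 1 GiB size are illustrative only, and it assumes a Linux host where st_blocks is counted in 512-byte units and the temporary directory sits on a filesystem that supports sparse files.

```python
import os
import tempfile

def sparse_report(path):
    """Return (apparent_size, allocated_bytes) for a file."""
    st = os.stat(path)
    # Assumption: st_blocks is in 512-byte units, as on Linux.
    return st.st_size, st.st_blocks * 512

with tempfile.TemporaryDirectory() as tmp:
    img = os.path.join(tmp, "demo.img")  # hypothetical stand-in for an image file
    with open(img, "wb") as f:
        f.truncate(1 << 30)  # extend to 1 GiB without writing any data
    apparent, allocated = sparse_report(img)

print(f"apparent size : {apparent} bytes")
print(f"allocated     : {allocated} bytes")
```

Running the same comparison on the original image and on the copy would show whether the copy kept both the apparent size and roughly the same allocated data, which is one way to approach the question of whether the rootfs copy is truncated.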