
Re: filesystem shrinks after using xfs_repair

To: Dave Chinner <david@xxxxxxxxxxxxx>
Subject: Re: filesystem shrinks after using xfs_repair
From: Eli Morris <ermorris@xxxxxxxx>
Date: Sun, 25 Jul 2010 23:46:29 -0700
Cc: xfs@xxxxxxxxxxx
In-reply-to: <20100726060604.GF7362@dastard>
References: <DFB2DB04-A3BA-4272-A12A-4F28A7D51491@xxxxxxxx> <20100712134743.624249b2@xxxxxxxxxxxxxxxxxxxx> <274A8D0C-4C31-4FB9-AB2D-BA3C31D497E0@xxxxxxxx> <20100724005426.GN32635@dastard> <F2AC32C3-2437-4625-980A-3BC9B3C541A2@xxxxxxxx> <20100724023922.GP32635@dastard> <777100A1-57DE-4DE0-B1F0-64977BD694AD@xxxxxxxx> <20100726034545.GE655@dastard> <10B6F36F-BE01-4BF7-9815-2E8F6BF71B41@xxxxxxxx> <20100726060604.GF7362@dastard>
On Jul 25, 2010, at 11:06 PM, Dave Chinner wrote:

> On Sun, Jul 25, 2010 at 09:04:03PM -0700, Eli Morris wrote:
>> On Jul 25, 2010, at 8:45 PM, Dave Chinner wrote:
>>> On Sun, Jul 25, 2010 at 08:20:44PM -0700, Eli Morris wrote:
>>>> [root@nimbus vm]#  echo 3 > /proc/sys/vm/drop_caches
>>>> [root@nimbus vm]# for ag in `seq 0 1 125`; do
>>>>> xfs_db -r -c "sb $ag" -c "p agcount" -c "p dblocks" /dev/vg1/vol5
>>>>> done
>>>> agcount = 126
>>>> dblocks = 13427728384
>>>> agcount = 126
>>>> dblocks = 13427728384
>>> ....
>>> 
>>> All nice and consistent before.
>>> 
>>>> [root@nimbus vm]# umount /export/vol5
>>>> [root@nimbus vm]#  echo 3 > /proc/sys/vm/drop_caches
>>>> [root@nimbus vm]# for ag in `seq 0 1 125`; do
>>>>> xfs_db -r -c "sb $ag" -c "p agcount" -c "p dblocks" /dev/vg1/vol5
>>>>> done
>>>> agcount = 156
>>>> dblocks = 16601554944
>>>> agcount = 126
>>>> dblocks = 13427728384
>>>> agcount = 126
>>>> dblocks = 13427728384
>>> .....
>>> 
>>> And after the grow only the primary superblock has the new size and
>>> agcount, which is why repair is returning it back to the old size.
>>> Can you dump the output after the grow for 155 AGs instead of 125
>>> so we can see if the new secondary superblocks were written? (just
>>> dumping `seq 125 1 155` will be fine.)
> 
> Which shows:
> 
>> agcount = 126
>> dblocks = 13427728384
>> agcount = 126
>> dblocks = 13427728384
> ....
> 
> Well, that's puzzling. The in-memory superblock is written to each
> of the secondary superblocks, and that _should_ match the primary
> superblock. The in-memory superblock is what is modified
> during the growfs transaction and it is then synchronously written
> to each secondary superblock. Without any I/O errors, I'm not sure
> what is happening here.
> 
> Oh, I just noticed this from your previous mail:
> 
>> [root@nimbus vm]# df -h
>> Filesystem            Size  Used Avail Use% Mounted on
> .....
>> /dev/mapper/vg1-vol5   51T   51T   90M 100% /export/vol5
>                         ^^^^^^^^^^^^^^^
> 
>> [root@nimbus vm]# umount /export/vol5
>> [root@nimbus vm]#  echo 3 > /proc/sys/vm/drop_caches
>> [root@nimbus vm]# for ag in `seq 0 1 125`; do
>>> xfs_db -r -c "sb $ag" -c "p agcount" -c "p dblocks" /dev/vg1/vol5
>>> done
>> agcount = 156
>> dblocks = 16601554944
>  ^^^^^^^^^^^^^^^^^^^^^
> 
> These don't match up - we've got the situation where the on-disk
> value for the primary superblock has changed, but the in-memory
> value has not appeared to change.
> 
> And I note from the original email I asked for data from that the
> filesystem did not show up as 62TB until you unmounted and mounted
> it again, which would have read the 62TB size from the primary
> superblock on disk during mount. You do not need to unmount and
> remount to see the new size. This leads me to believe you are
> hitting one (or more) of the growfs overflow bugs that were fixed a
> while back.
> 
> I've just confirmed that the problem does not exist at top-of-tree.
> The following commands give the right output, and the repair at the
> end does not truncate the filesystem:
> 
> xfs_io -f -c "truncate $((13427728384 * 4096))" fsfile
> mkfs.xfs -f -l size=128m,lazy-count=0 \
>     -d size=13427728384b,agcount=126,file,name=fsfile
> xfs_io -f -c "truncate $((16601554944 * 4096))" fsfile
> mount -o loop fsfile /mnt/scratch
> xfs_growfs /mnt/scratch
> xfs_info /mnt/scratch
> umount /mnt/scratch
> xfs_db -c "sb 0" -c "p agcount" -c "p dblocks" -f fsfile
> xfs_db -c "sb 1" -c "p agcount" -c "p dblocks" -f fsfile
> xfs_db -c "sb 127" -c "p agcount" -c "p dblocks" -f fsfile
> xfs_repair -f fsfile
> 
> So rather than try to triage this any further, can you upgrade your
> kernel/system to something more recent?
> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@xxxxxxxxxxxxx

Hi Dave,

I can update this to CentOS 5 Update 4, but I can't install any updates released 
after its release date of Dec 15, 2009. The reason is that this is the head node 
of a cluster and it runs the Rocks cluster distribution. The newest version of 
Rocks is based on CentOS 5 Update 4, but Rocks systems do not support updates 
(via yum, for example). 

Updating the OS takes me a day or two for the whole cluster and all the user 
programs. If you're pretty sure that will fix the problem, I'll go for it 
tomorrow. I'd appreciate it very much if you could let me know whether CentOS 
5.4 is recent enough to include the fix.

I will note that I've grown the filesystem several times, and while I recall 
having to unmount and remount the filesystem each time for it to report its new 
size, I've never seen it fall back to its old size after running xfs_repair. In 
fact, the original filesystem was about 12 TB, so xfs_repair only reverses the 
last grow, not the previous ones.

thanks again for your help,

Eli
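
P.S. The per-AG superblock check from earlier in the thread can be wrapped up as 
a small script. This is only a sketch, not something from the thread itself: the 
device path and AG range are assumptions matching the output shown above, and 
the awk stage simply flags any dblocks value that disagrees with sb 0.

```shell
#!/bin/sh
# Compare every secondary superblock's dblocks against the primary (sb 0).
# DEV and AGS are placeholders; set AGS from "p agcount" on sb 0, minus one.
DEV=/dev/vg1/vol5
AGS=125   # last AG index, i.e. agcount - 1

for ag in $(seq 0 1 "$AGS"); do
    xfs_db -r -c "sb $ag" -c "p dblocks" "$DEV"
done | awk '
    /^dblocks/ {
        n++
        if (n == 1) { primary = $3; next }   # sb 0 sets the reference value
        if ($3 != primary)
            printf "sb %d disagrees: dblocks = %s (primary = %s)\n",
                   n - 1, $3, primary
    }'
```

No output means every secondary superblock agrees with the primary; in the 
situation above it would have flagged every AG past 0.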
