xfs_growfs failure....

Jason Vagalatos Jason.Vagalatos at citrixonline.com
Wed Feb 24 12:08:26 CST 2010


David,
This might provide some useful insight too.  I just remembered that the xfs_growfs command was run twice.  The first time it failed because I omitted the -d option.  I reran it with the -d option and it completed successfully.

Is it possible that running xfs_growfs twice grew the filesystem to twice the intended size? I would have expected the second run to fail, since the first run should have already used up all of the remaining space on the underlying device.

Thanks,

Jason Vagalatos
Storage Administrator
Citrix|Online
7408 Hollister Avenue
Goleta California 93117
T:  805.690.2943 | M:  805.403.9433
jason.vagalatos at citrixonline.com
http://www.citrixonline.com

From: Jason Vagalatos
Sent: Wednesday, February 24, 2010 9:57 AM
To: 'david at fromorbit.com'
Cc: 'xfs at oss.sgi.com'; Joe Allen
Subject: RE: xfs_growfs failure....


Hi David,
I’m picking this up from Joe.  I’ll attempt to answer your questions.

The underlying device was grown from 89TB to 100TB.  The XFS filesystem uses an external log device.  After the underlying device was grown by approximately 11TB, we ran xfs_growfs -d <filesystem_mount_point>.  The command completed without errors, but the filesystem immediately went into a bad state.
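
As a concrete sketch of the operation (the mount point below is the same placeholder as above; the -n invocation is only an illustration of how the resulting geometry could be checked afterwards, not something we ran):

  # grow only the data section to fill the enlarged device
  xfs_growfs -d <filesystem_mount_point>

  # print the filesystem geometry without making any changes
  xfs_growfs -n <filesystem_mount_point>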

We are running xfsprogs-2.8.20-1.el5.centos on RHEL Kernel Linux 2.6.18-53.1.19.el5 #1 SMP Tue Apr 22 03:01:10 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux

We killed the xfs_repair before it was able to find a secondary superblock and make things worse.

Currently the underlying block device is:

--- Logical volume ---
  LV Name                /dev/logfs-sessions/sessions
  VG Name                logfs-sessions
  LV UUID                32TRbe-OIDw-u4aH-fUmD-FLmU-5jRv-MaVYDg
  LV Write Access        read/write
  LV Status              available
  # open                 0
  LV Size                100.56 TB
  Current LE             26360253
  Segments               18
  Allocation             inherit
  Read ahead sectors     0
  Block device           253:61

At this point what are our options to recover this filesystem?

Thank you for any help you may be able to provide.

Jason Vagalatos

From: Joe Allen
Sent: Wednesday, February 24, 2010 9:15 AM
To: Jason Vagalatos
Subject: Fwd: xfs_growfs failure....


Weird. He seems to be implying we were at 110TB and stuff is written there. I guess we need to be 100% sure of the space allocated. Were the other LUNs ever attached?




Begin forwarded message:
From: Dave Chinner <david at fromorbit.com>
Date: February 24, 2010 3:54:20 AM PST
To: Joe Allen <Joe.Allen at citrix.com>
Cc: xfs at oss.sgi.com
Subject: Re: xfs_growfs failure....
On Wed, Feb 24, 2010 at 02:44:37AM -0800, Joe Allen wrote:
> I am in some difficulty here over a 100TB filesystem that
> is now unusable after an xfs_growfs command.
>
> Is there someone that might help assist?
>
> #mount: /dev/logfs-sessions/sessions: can't read superblock
>
> Filesystem "dm-61": Disabling barriers, not supported with external log device
> attempt to access beyond end of device
> dm-61: rw=0, want=238995038208, limit=215943192576

You've grown the filesystem to 238995038208 sectors (111.3TiB),
but the underlying device is only 215943192576 sectors (100.5TiB)
in size.
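
For reference, converting those 512-byte sector counts:

  238995038208 sectors * 512 bytes = 122365459562496 bytes ~= 111.3 TiB
  215943192576 sectors * 512 bytes = 110562914598912 bytes ~= 100.5 TiB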

I'm assuming that you're trying to mount the filesystem after a
reboot? I make this assumption as growfs is an online operation and
won't grow if the underlying block device has not already been
grown. For a subsequent mount to fail with the underlying device
being too small, something about the underlying block
device had to change....

What kernel version and xfsprogs version are you using?

> xfs_repair -n <device> basically looks for superblocks (phase 1, I
> guess) for a long time. I'm letting it run, but not much hope.

Don't repair the filesystem - there is nothing wrong with it
unless you start modifying stuff. What you need to do is fix the
underlying device to bring it back to the size it was supposed to
be at when the grow operation was run.
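
Purely as an illustration of that, and not something to run yet: if the missing space came from LVM physical volumes that dropped out of the volume group, getting the device back to its expected size would look something like the sketch below. The PV name is a placeholder and the +11T figure is just the approximate shortfall (238995038208 - 215943192576 sectors, roughly 10.7TiB).

  # hypothetical sketch only -- re-add the missing PV(s) and re-extend the LV
  vgextend logfs-sessions /dev/mapper/<missing_lun>    # <missing_lun> is a placeholder
  lvextend -L +11T /dev/logfs-sessions/sessions        # cover the ~11TiB shortfall
  blockdev --getsize64 /dev/logfs-sessions/sessions    # should be >= 238995038208 * 512 bytes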

What does /proc/partitions tell you about the size of dm-61? Does
that report the correct size, and if it does, what is it?
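
For example, either of these reports the current device size (/proc/partitions shows 1KiB blocks, blockdev shows bytes):

  grep dm-61 /proc/partitions
  blockdev --getsize64 /dev/logfs-sessions/sessions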

> I'm hesitant to run xfs_repair -L or without the -n flag for fear of making it worse.

Good - don't run anything like that until you sort out whether the
underlying device is correctly sized or not.

> -bash-3.1# xfs_db -r -c 'sb 0' -c p /dev/logfs-sessions/sessions
> magicnum = 0x58465342
> blocksize = 4096
> dblocks = 29874379776

XFS definitely thinks it is 111.3TiB in size.
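
Same arithmetic as before, this time from the superblock fields:

  29874379776 blocks * 4096 bytes = 122365459562496 bytes ~= 111.3 TiB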

> rblocks = 0
> rextents = 0
> uuid = fc8bdf76-d962-43c1-ae60-b85f378978a6
> logstart = 0
> rootino = 2048
> rbmino = 2049
> rsumino = 2050
> rextsize = 384
> agblocks = 268435328
> agcount = 112

112 AGs of 1TiB each - that confirms the grow succeeded and it was
able to write metadata to disk between 100 and 111 TiB without
errors being reported. That implies the block device must have been
that big at some point...
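
Checking the AG arithmetic:

  268435328 blocks/AG * 4096 bytes = 1099511103488 bytes ~= 1 TiB per AG
  111 * 268435328 + 78058368 (partial 112th AG) = 29874379776 blocks = dblocks above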

> rbmblocks = 0
> logblocks = 32000
> versionnum = 0x3184
> sectsize = 512
> inodesize = 256
> inopblock = 16
> fname = "\000\000\000\000\000\000\000\000\000\000\000\000"
> blocklog = 12
> sectlog = 9
> inodelog = 8
> inopblog = 4
> agblklog = 28
> rextslog = 0
> inprogress = 0
> imax_pct = 25
> icount = 7291520
> ifree = 8514
> fdblocks = 6623185597

With ~24.7TiB of free space

> -bash-3.1# xfs_db -r -c 'sb 2' -c p /dev/logfs-sessions/sessions
> magicnum = 0x58465342
> blocksize = 4096
> dblocks = 24111418368

That's 89.9TiB...

> rblocks = 0
> rextents = 0
> uuid = fc8bdf76-d962-43c1-ae60-b85f378978a6
> logstart = 0
> rootino = 2048
> rbmino = 2049
> rsumino = 2050
> rextsize = 384
> agblocks = 268435328
> agcount = 90

And 90 AGs. That tells me the filesystem was created as a 90TiB
filesystem. Can you tell me if you attempted to grow from 90TiB to
100TiB or from 100TiB to 110TiB?  There were bugs at one point in
both the userspace grow code and the kernel code that resulted in
bad grows (hence the need to know the versions this occurred on
and what you were actually attempting to do), but these problems
can usually be fixed up with some xfs_db magic.
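
For anyone following along, the relevant superblock fields can be compared read-only without any risk. This is just an inspection sketch, not the fix itself:

  # read-only: compare size fields in the primary and a couple of secondary superblocks
  xfs_db -r -c 'sb 0' -c 'p dblocks agcount' /dev/logfs-sessions/sessions
  xfs_db -r -c 'sb 1' -c 'p dblocks agcount' /dev/logfs-sessions/sessions
  xfs_db -r -c 'sb 2' -c 'p dblocks agcount' /dev/logfs-sessions/sessions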

Cheers,

Dave.
--
Dave Chinner
david at fromorbit.com