Re: xfs resize: primary superblock is not updated immediately

To: Dave Chinner <david@xxxxxxxxxxxxx>
Subject: Re: xfs resize: primary superblock is not updated immediately
From: Alex Lyakas <alex@xxxxxxxxxxxxxxxxx>
Date: Tue, 23 Feb 2016 00:38:48 +0200
Cc: xfs@xxxxxxxxxxx, Christoph Hellwig <hch@xxxxxxxxxxxxx>, Danny Shavit <danny@xxxxxxxxxxxxxxxxx>
In-reply-to: <20160222212019.GI25832@dastard>
References: <3685DFAD20214109878873CF81232704@alyakaslap> <20160222212019.GI25832@dastard>
Hi Dave,
Thanks for your response.

I am not freezing the filesystem before the snapshot.
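For reference, the freeze-then-snapshot sequence Dave is asking about could be sketched roughly as below. This is only a sketch under assumptions: `take_snapshot` is a hypothetical callable standing in for whatever block-level snapshot mechanism is in use, and the `run` parameter exists only so the sequence can be exercised without root. The only real commands used are `xfs_freeze -f` and `xfs_freeze -u`.

```python
import subprocess
from contextlib import contextmanager


@contextmanager
def frozen(mountpoint, run=subprocess.check_call):
    """Keep an XFS mount frozen for the duration of the with-block."""
    # -f: freeze; dirty data, metadata and log are written out first.
    run(["xfs_freeze", "-f", mountpoint])
    try:
        yield
    finally:
        # -u: thaw. Done in finally so a failed snapshot cannot leave
        # the filesystem frozen.
        run(["xfs_freeze", "-u", mountpoint])


def snapshot_frozen(mountpoint, take_snapshot, run=subprocess.check_call):
    # take_snapshot is a hypothetical placeholder, not a real API.
    with frozen(mountpoint, run):
        take_snapshot()
```

With a freeze in place the snapshot would not capture the window in which the journal holds the new agcount but the primary superblock does not. The rest of this mail is about the case where no freeze is done.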

However, let's assume that somebody resized the XFS filesystem, and the
resize completed and returned to user-space. At this moment the primary
superblock on disk is not yet updated with the new agcount. And at this
same moment there is a power-out. After the power comes back and the
machine boots, I believe the same problem would happen when we mount
the filesystem, because the primary superblock on disk still has the
old agcount. So the in-memory pag structures will not be created for
the new AGs during mount, but replaying the log might try to use them.
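To make the suspected sequence concrete, here is a toy model of it. It is not XFS code: `mount`, `StaleAgcountError` and the record format are all made up for illustration, and real log recovery is far more involved. It only shows the ordering problem I am describing, where per-AG state is sized from the on-disk agcount before the log is replayed.

```python
class StaleAgcountError(Exception):
    """Raised when a log record refers to an AG that mount never set up."""


def mount(sb_agcount_on_disk, log_records):
    # Toy stand-in for mount: per-AG structures are created only for
    # the AGs that the on-disk primary superblock knows about.
    pag = {agno: {} for agno in range(sb_agcount_on_disk)}

    # Toy stand-in for log replay: each record is (agno, description).
    for agno, what in log_records:
        if agno not in pag:
            raise StaleAgcountError(
                f"log record {what!r} targets AG {agno}, "
                f"but only {len(pag)} AGs were initialised"
            )
        pag[agno][what] = "replayed"
    return pag


# Grown from 4 to 6 AGs; the log already refers to AG 5.
log = [(1, "agf update"), (5, "new agf")]
mount(6, log)          # superblock already written back: replay is fine
try:
    mount(4, log)      # superblock still has the old agcount
except StaleAgcountError as err:
    print("mount failed:", err)
```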

Taking a block-level snapshot is exactly like a power-out from XFS's
perspective, and XFS should, in principle, be able to recover from
that. The snapshot will come up as a new block device whose content is
identical to what the original block device had at the moment when
the snapshot was taken (like a boot after power-out).

I will try to reproduce the problem by crashing the machine at the
problematic moment, when the primary on-disk superblock still has the
old value, without involving a snapshot at all.
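To catch that moment, one option is to read sb_agcount straight from the first sector of the device, which is the same value that xfs_db prints for "sb 0". A rough sketch follows. The byte offset 88 is my reading of the on-disk superblock layout (big-endian 32-bit field following sb_agblocks) and should be treated as an assumption to check against xfs_db output; the function names are made up.

```python
import struct

XFS_SB_MAGIC = b"XFSB"
# Assumed offset of sb_agcount within the superblock; verify with xfs_db.
SB_AGCOUNT_OFFSET = 88


def agcount_from_sb(raw):
    """Extract sb_agcount from the raw bytes of a primary superblock."""
    if raw[:4] != XFS_SB_MAGIC:
        raise ValueError("not an XFS superblock (bad magic)")
    # On-disk XFS metadata is big-endian; sb_agcount is 32 bits wide.
    (agcount,) = struct.unpack_from(">I", raw, SB_AGCOUNT_OFFSET)
    return agcount


def read_primary_agcount(device_path):
    # The primary superblock starts at byte 0 of the device.
    with open(device_path, "rb") as dev:
        return agcount_from_sb(dev.read(512))
```

Polling `read_primary_agcount` right after the resize returns, and crashing the machine while it still reports the old value, should hit the window in question. Note that a plain buffered read of a block device goes through the page cache, so this shares whatever caching caveats apply to running xfs_db against a mounted filesystem.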

Thanks,
Alex.




On Mon, Feb 22, 2016 at 11:20 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> On Mon, Feb 22, 2016 at 09:08:06PM +0200, Alex Lyakas wrote:
>> Greetings XFS developers,
>>
>> I am seeing the following issue with XFS on kernel 3.18.19.
>>
>> When resizing, XFS adds new AGs and eventually updates the primary
>> superblock with the new "sb_agcount" value. However, this happens a
>> few seconds after the resize operation returns to user-space. As
>> a result, if a block-level snapshot is taken of the underlying
>> block device while "sb_agcount" still has the old value, then a
>> subsequent XFS mount crashes with a stack like [1].
>
> The primary superblock change is logged, so it doesn't need to be
> written back immediately. That means it is in the journal...
>
>> Some debugging shows that _xfs_buf_find is called with agno that has
>> been added during the resize, but appropriate "pag" has not been
>> created for this agno during mount.
>
> The new per-ag structures are created during growfs, after the
> growfs transaction has committed. If you are mounting a snapshot
> that has the wrong agcount in it, then lots of things will go wrong
> if there is metadata that already uses the expanded space.
>
>> I have found the patch by Christoph Hellwig:
>> http://oss.sgi.com/archives/xfs/2015-01/msg00391.html
>> which sets the resize transaction to be synchronous, and applied it,
>> but it still doesn't help.
>>
>> Right after the resize completes, I am issuing:
>> xfs_db -r -c "sb 0" -c "p"   <device>
>> and for a few seconds still get the old value of "sb_agcount".
>>
>> Can anybody advise what I am missing? What needs to be done so that
>> the primary superblock will get the new value of "sb_agcount"
>> promptly?
>
> Are you freezing the filesystem before taking a block level
> snapshot?
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@xxxxxxxxxxxxx
