[Top] [All Lists]

Re: TAKE 952214 - log recovery compat on 32/64 bit

To: Shailendra Tripathi <stripathi@xxxxxxxxx>
Subject: Re: TAKE 952214 - log recovery compat on 32/64 bit
From: Timothy Shimmin <tes@xxxxxxx>
Date: Thu, 25 May 2006 15:42:48 +1000
Cc: linux-xfs@xxxxxxxxxxx
In-reply-to: <4473FDA1.4070805@agami.com>
Organization: SGI
References: <200605240520.k4O5KlJ47446902@snort.melbourne.sgi.com> <4473FDA1.4070805@agami.com>
Sender: xfs-bounce@xxxxxxxxxxx
User-agent: KMail/1.8.2
Hi Shailendra,

On Wednesday 24 May 2006 4:30 pm, Shailendra Tripathi wrote:
> Hi Tim,
>         There is another problem in XFS recovery on non-XFS native
> endian machines.
That's no good.

> xlog_recover_do_inode_buffer () {
> ....
>       logged_nextp = (xfs_agino_t *)
>                         ((char *)(item->ri_buf[item_index].i_addr) +
>                                  (next_unlinked_offset -reg_buf_offset));
>                  ...
>                  buffer_nextp = (xfs_agino_t *)xfs_buf_offset(bp,
>                                                next_unlinked_offset);
>                  INT_SET(*buffer_nextp, ARCH_CONVERT, *logged_nextp);
> }
>       Please note that all the fields in the log (rather all the data in the
> log) is already converted into proper endian format. However,
> INT_SET(*buffer_nextp, ARCH_CONVERT, *logged_nextp);
> converts it again which is wrong. It should be ARCH_NOCONVERT here.
Okay, I had a better look at the code, and I see what you mean.
The di_next_unlinked field is always stored endian converted.
It has to be converted to native only when it is used to find the inode that
it is pointing to by its agino#.
So in the case above, we are extracting out the already endian converted
di_next_unlinked field from the logged region data and endian converting
it again and then storing back into the di_next_field of the buffer of inodes.
Which will flip it back to native, when we really want it to remain in ondisk 
Yes, we should just be doing a straight assignment here as we do in IRIX.

> Here is why, then we don't see any problem in recovery:
> xlog_recover_process_iunlinks ()
> ...
>                         } else {
>       xlog_recover_clear_agi_bucket(mp, agno,
>                                                       bucket);
>       agino = NULLAGINO;
> }
> This code assumes that if you detect any bad inode, just clear off the
> AGI unlinked tree. 
Or the list off the AGI unlinked bucket. Yep.

> I have seen this code been hit in recovery. It has 
> potential of loosing the inode and corresponding data blocks forever.
> The xfs_logprint shows CLEAR_AGI_BUCKET after recovery once in a while.
Yeah that's no good but it is good evidence of the problem.

> The code assumes this to be bad inode. However, the truth is that most
> of the time this has been done because of double native conversion.
> I have patch for it on pretty old build (2.6.7) and, hence, not submitting.
Feel free to submit any old patches that you have and just state for which 
release. They would be most appreciated.
In this case, the recovery code hasn't probably changed much either and
the patch is a simple one :)

Thanks a lot for pointing this out, I'll check this fix in soon.

Kind regards,

<Prev in Thread] Current Thread [Next in Thread>