strange behavior of a larger xfs directory

Hans-Peter Jansen hpj at urpla.net
Tue Mar 5 14:32:02 CST 2013


Hi Dave,

Thanks for the helpful hints.

On Tuesday, 5 March 2013, 10:05:27, Dave Chinner wrote:
> On Mon, Mar 04, 2013 at 05:40:13PM +0100, Hans-Peter Jansen wrote:
> > Hi,
> > 
> > after upgrading the kernel on a server from 2.6.34 to 3.8.1 (x86-32), I
> > suffer from a strange behavior of a larger directory, that a downgrade
> > of the kernel cannot repair.
> 
> TL;DR: problem with an old userspace and 64 bit inodes.
> 
[...]
> 
> > # then it proceeds with getdents64 and fetches already fetched entries
> > 
> > 27177 getdents64(3, {
> > 
> >              {d_ino=4303329151, d_off=78, d_type=DT_UNKNOWN, d_reclen=32,
> >              d_name="Black_Swan"}
>                                   ^^^^^^^^
> 
> And the next valid entry in the directory is offset=78.
> 
> So, what it looks like to me is that whatever is parsing the
> linux_dirent returned by the getdents64() call is choking on the 64
> bit inode number.
> 
> Now, given that strace is parsing it correctly, this implies that
> whatever is issuing the getdents64 call is not parsing the
> linux_dirent64 structure correctly.  In fact, I suspect what is
> happening is that userspace is incorrectly using a struct
> linux_dirent to parse the results and hence it's seeing
> d_off/d_type/d_reclen being invalid due to the resultant structure
> misalignment.
> 
> Further, this is being seen by multiple different vectors, which
> indicates that it is probably the readdir() glibc call that is
> buggy, and not any of the applications.
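
To make the suspected failure mode concrete, here is a minimal sketch of what
reading a getdents64 record through the older structure would look like. It is
not taken from glibc; the legacy layout (two 32-bit longs followed by a 16-bit
d_reclen, as on x86-32) is an assumption, and the sample values are the ones
from the strace output above.

```python
import struct

# Head of struct linux_dirent64 before d_name: u64 d_ino, s64 d_off,
# u16 d_reclen, u8 d_type (little-endian, no padding, as on x86).
DIRENT64_HEAD = struct.Struct("<QqHB")
# Assumed head of the legacy struct linux_dirent on x86-32:
# 32-bit d_ino, 32-bit d_off, u16 d_reclen.
DIRENT32_HEAD = struct.Struct("<IIH")


def pack_dirent64(d_ino, d_off, d_type, name, d_reclen):
    """Build one getdents64-style record, zero-padded to d_reclen bytes."""
    raw = DIRENT64_HEAD.pack(d_ino, d_off, d_reclen, d_type) + name + b"\0"
    return raw.ljust(d_reclen, b"\0")


def parse_as_dirent64(buf):
    """Read the record with the layout the kernel actually wrote."""
    d_ino, d_off, d_reclen, d_type = DIRENT64_HEAD.unpack_from(buf)
    return {"d_ino": d_ino, "d_off": d_off,
            "d_reclen": d_reclen, "d_type": d_type}


def parse_as_legacy_dirent(buf):
    """Read the same bytes with the (assumed) 32-bit legacy layout."""
    d_ino, d_off, d_reclen = DIRENT32_HEAD.unpack_from(buf)
    return {"d_ino": d_ino, "d_off": d_off, "d_reclen": d_reclen}


# Values from the strace line: d_ino=4303329151, d_off=78,
# d_type=DT_UNKNOWN (0), d_reclen=32, d_name="Black_Swan".
record = pack_dirent64(4303329151, 78, 0, b"Black_Swan", 32)
good = parse_as_dirent64(record)
bad = parse_as_legacy_dirent(record)
print("linux_dirent64 view:", good)
print("legacy view:        ", bad)
```

Under these assumptions the legacy view sees only the low 32 bits of the inode
number, takes the high word as d_off, and reads d_reclen as 78 instead of 32,
so a reader stepping through the buffer by d_reclen would land in the wrong
place.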

Well, then the Python script and ls should fall flat on their faces, which
they do not. Also, such a blatant misinterpretation should cause more havoc,
but most other stat values seem to match expectations.

Some kind of subtle breakage is happening here.

> First solution: upgrade to a modern userspace.

I wish, but I cannot ATM.

> Second solution: Run 3.8.1, make sure you mount with inode32, and
> then run the xfs_reno tool mentioned on this page:
>
> http://xfs.org/index.php/Unfinished_work
> 
> to find all the inodes with inode numbers larger than 32
> bits and move them to locations with smaller inode numbers.
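
As a rough way to see how many files are affected before renumbering, the
following sketch only lists paths whose inode number does not fit in 32 bits.
It is not xfs_reno and moves nothing; the helper names are made up for
illustration, and it assumes a Python whose lstat() can report 64-bit inode
numbers.

```python
import os

MAX_INO32 = 0xFFFFFFFF  # largest inode number that fits in 32 bits


def exceeds_32bit(ino):
    """True if the inode number needs more than 32 bits."""
    return ino > MAX_INO32


def find_large_inodes(root):
    """Return (inode, path) pairs below root with inode numbers > 32 bits."""
    hits = []
    candidates = [root]
    for dirpath, dirnames, filenames in os.walk(root):
        candidates.extend(os.path.join(dirpath, n)
                          for n in dirnames + filenames)
    for path in candidates:
        try:
            ino = os.lstat(path).st_ino  # lstat: do not follow symlinks
        except OSError:
            continue  # entry vanished or is unreadable; skip it
        if exceeds_32bit(ino):
            hits.append((ino, path))
    return hits


if __name__ == "__main__":
    import sys
    for ino, path in find_large_inodes(sys.argv[1] if len(sys.argv) > 1 else "."):
        print(ino, path)
```

With the inode from the strace output, exceeds_32bit(4303329151) is true, so
the "Black_Swan" entry would be among the candidates for renumbering.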

Okay, I would like to take that route.

I've updated the xfsprogs, xfsdump and xfstests packages in my openSUSE build 
service repo home:frispete:tools to current versions today, and plan to submit 
them to Factory. openSUSE is always lagging in this area.

I've tried to include a build of the xfs_reno tool in xfsprogs, since, as you
mentioned, others might have a similar need soon. Unfortunately, I have failed
so far, because it uses some attr_multi and attr_list interfaces that aren't
part of the visible xfsprogs API anymore. Only the handle(3) man page refers
to them.

Attached is my current state: I've ported the patch to xfsprogs 3.1.9,
because it already carries all the necessary headers (apart from attr_multi
and attr_list). The attr interfaces seem to be collected in libhandle now,
hence I've added it to the build.

But now I'm stuck. It's not obvious to me how attr_multi_by_handle and
attr_list_by_handle are supposed to replace the ones that xfs_reno uses, and
documentation of this stuff is, hmm, sparse.

Could somebody with deeper insight have a look?

TIA && cheers,
Pete
-------------- next part --------------
A non-text attachment was scrubbed...
Name: xfs_reno.patch
Type: text/x-patch
Size: 49800 bytes
Desc: not available
URL: <http://oss.sgi.com/pipermail/xfs/attachments/20130305/ffd2dcd5/attachment-0001.patch>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: xfs_reno_fix.diff
Type: text/x-patch
Size: 939 bytes
Desc: not available
URL: <http://oss.sgi.com/pipermail/xfs/attachments/20130305/ffd2dcd5/attachment-0001.diff>

