
Re: strange behavior of a larger xfs directory

To: Hans-Peter Jansen <hpj@xxxxxxxxx>
Subject: Re: strange behavior of a larger xfs directory
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Tue, 5 Mar 2013 10:05:27 +1100
Cc: xfs@xxxxxxxxxxx
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <4300208.uZ6HVTycB6@xrated>
References: <4300208.uZ6HVTycB6@xrated>
User-agent: Mutt/1.5.21 (2010-09-15)
On Mon, Mar 04, 2013 at 05:40:13PM +0100, Hans-Peter Jansen wrote:
> Hi,
> 
> after upgrading the kernel on a server from 2.6.34 to 3.8.1 (x86-32), I 
> suffer from a strange behavior of a larger directory, that a downgrade 
> of the kernel cannot repair.

TL;DR: problem with an old userspace and 64 bit inodes.

> 27177 open("/video/video/", 
> O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY|O_CLOEXEC) = 3
> 27177 fstat64(3, {st_dev=makedev(8, 65), st_ino=357, st_mode=S_IFDIR|0775, 
> st_nlink=350, 
>               st_uid=223, st_gid=33, st_blksize=4096, st_blocks=40, 
> st_size=16384, 
>               st_atime=2013/03/04-16:12:37, st_mtime=2013/03/04-16:17:52, 
>               st_ctime=2013/03/04-16:17:52}) = 0
> 27177 getdents64(3, {
>               {d_ino=357, d_off=4, d_type=DT_UNKNOWN, d_reclen=24, 
> d_name="."} 
>               {d_ino=128, d_off=6, d_type=DT_UNKNOWN, d_reclen=24, 
> d_name=".."} 
>               {d_ino=367, d_off=12, d_type=DT_UNKNOWN, d_reclen=56, 
> d_name="%Avatar_-_Aufbruch_nach_Pandora"} 
>               {d_ino=368, d_off=18, d_type=DT_UNKNOWN, d_reclen=56, 
> d_name="%Der_Deutsche_Comedy_Preis_2009"}
>               [...]
>               {d_ino=4303329151, d_off=78, d_type=DT_UNKNOWN, d_reclen=32, 
> d_name="Black_Swan"}

That's a 64 bit inode number right there (0x1007F977F), and AFAICT
it's the only one in the directory. That was created when you were
running 3.8.1.
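
As a quick sanity check of that reading, here is a minimal Python sketch that tests whether an inode number fits in an unsigned 32 bit field. The helper name fits_in_32_bits is made up for illustration; the inode numbers are the ones from the trace above.

```python
# Inode number of "Black_Swan", taken from the strace output above.
d_ino = 4303329151

UINT32_MAX = 0xFFFFFFFF


def fits_in_32_bits(ino):
    """Return True if the inode number fits in an unsigned 32 bit field."""
    return 0 <= ino <= UINT32_MAX


print(hex(d_ino))              # 0x1007f977f
print(fits_in_32_bits(d_ino))  # False: bit 32 is set
print(fits_in_32_bits(368))    # True: an older entry from the same directory
print(d_ino & UINT32_MAX)      # 8361855, what survives truncation to 32 bits
```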

>               [...]}) = 4072
> # note: including items, that are missing later on, probably all
> 
> 27177 _llseek(3, 74, [74], SEEK_SET)    = 0

Smoking gun.  That is effectively setting the directory offset to 74
(XFS masks out the upper 32 bits of the directory position because
it is invalid) and so XFS will take that offset and walk to the next
valid dirent and start filling entries from there on the next
getdents64 call. Your filesystem is doing exactly what userspace is
asking it to do.

Ah, I note that all the stat64() calls that follow stop at the dirent
that is at d_off=74. So it appears that userspace is having some
kind of problem related to the above entry.

> # then it preceeds with getdents64 and fetches already fetched entries
> 
> 27177 getdents64(3, {
>              {d_ino=4303329151, d_off=78, d_type=DT_UNKNOWN, d_reclen=32, 
> d_name="Black_Swan"} 
                                  ^^^^^^^^

And the next valid entry in the directory is offset=78.

So, what it looks like to me is that whatever is parsing the
linux_dirent64 records returned by the getdents64() call is choking
on the 64 bit inode number.

Now, given that strace is parsing it correctly, this implies that
whatever is issuing the getdents64 call is not parsing the
linux_dirent64 structure correctly.  In fact, I suspect what is
happening is that userspace is incorrectly using a struct
linux_dirent to parse the results and hence it's seeing
d_off/d_type/d_reclen being invalid due to the resultant structure
misalignment.
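
To illustrate the kind of misparse suspected here, the following Python sketch builds one linux_dirent64 record for the "Black_Swan" entry and decodes it twice: once with the linux_dirent64 header layout, and once with what I assume the older struct linux_dirent looks like on a 32 bit system (two 32 bit unsigned longs followed by a 16 bit d_reclen). It shows the mechanism only and is not a reproduction of what glibc actually does. The helper build_dirent64 is hypothetical.

```python
import struct

# Header of struct linux_dirent64 before d_name:
# u64 d_ino, s64 d_off, u16 d_reclen, u8 d_type (little-endian, as on x86).
DIRENT64_HEADER = struct.Struct("<QqHB")

# ASSUMED header of the older struct linux_dirent on a 32 bit system:
# unsigned long d_ino, unsigned long d_off, u16 d_reclen.
DIRENT32_HEADER = struct.Struct("<IIH")

DT_UNKNOWN = 0


def build_dirent64(d_ino, d_off, d_type, name):
    """Build one linux_dirent64 record, padded to an 8 byte boundary."""
    raw_name = name.encode() + b"\0"
    reclen = (DIRENT64_HEADER.size + len(raw_name) + 7) & ~7
    record = DIRENT64_HEADER.pack(d_ino, d_off, reclen, d_type) + raw_name
    return record.ljust(reclen, b"\0")


record = build_dirent64(4303329151, 78, DT_UNKNOWN, "Black_Swan")

# Correct parse: matches what strace printed.
good = DIRENT64_HEADER.unpack_from(record)
print(good)  # (4303329151, 78, 32, 0)

# Wrong parse: the upper half of d_ino lands in d_off, and the low
# 16 bits of the real d_off are read as d_reclen.
bad = DIRENT32_HEADER.unpack_from(record)
print(bad)   # (8361855, 1, 78)
```

With the wrong layout the record length comes out as 78 instead of 32, so a parser stepping through the buffer by d_reclen would lose its place right at this entry.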

Further, this is being seen by multiple different vectors, which
indicates that it is probably the readdir() glibc call that is
buggy, and not any of the applications.

First solution: upgrade to a modern userspace.

Second solution: Run 3.8.1, make sure you mount with inode32, and
then run the xfs_reno tool mentioned on this page:

http://xfs.org/index.php/Unfinished_work

to find all the inodes with inode numbers larger than 32
bits and move them to locations with smaller inode numbers.
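
If you just want to see which entries are affected before moving anything, a scan along these lines would do. This is a rough Python sketch and not a substitute for xfs_reno; the function name find_wide_inodes is made up. Run it from a userspace that handles 64 bit inode numbers correctly, otherwise the directory walk itself may hit the same problem.

```python
import os

UINT32_MAX = 0xFFFFFFFF


def find_wide_inodes(root):
    """Yield (path, inode) for entries under root whose inode number
    does not fit in 32 bits. Symlinks are not followed."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                ino = os.lstat(path).st_ino
            except OSError:
                # Entry vanished or is unreadable; skip it.
                continue
            if ino > UINT32_MAX:
                yield path, ino
```

For example, `for path, ino in find_wide_inodes("/video"): print(hex(ino), path)` would list the candidates under /video. Note that os.walk descends into other filesystems mounted below the root, so restrict the starting point accordingly.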

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
