xfstests testcase 111: Infinite xfs_bulkstat bad-inode loop case
Roger Willcocks
roger at filmlight.ltd.uk
Mon Dec 22 14:28:59 CST 2008
> Hi Roger,
>
> I believe the xfstests case 111 is based on a report by you. Do you
> remember what was going on there? From a look at the testcase it
> overwrites an inode cluster and then tries to bulkstat them. This works
> fine with a non-debug kernel, but due to debug kernels panicking it fails
> there.
>
> Do you remember what the testcase was looking for? I suspect we should
> just not run it for debug kernels, but I'd like to know more about it
> so we can add comments describing it.
>
> Cheers,
> Christoph
>
Hi Christoph,
here are the relevant extracts from our in-house bugzilla (bug 3675). Since
the problem only occurs when the disk is corrupted, I don't see any problem
with skipping the test on debug kernels.
** 2006-02-01
xfs_fsr can get into a state where one processor spends 100% of its time
looping in the kernel. The application can't be killed. 'top' shows it using
50% CPU (i.e. all of one of the two processors).
oprofile reveals that one processor spends about 2/3 of its time in xfs.ko.
It looks like the offending syscall is xfs_bulkstat.
** 2006-02-03
Looks like xfs_itobp (map inode number to disk buffer) detects a corrupted
inode (bad magic number). That causes a break out of a loop in xfs_bulkstat,
skipping setting the terminating condition of a containing loop.
I'll file a bug report with SGI.
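For readers who want the shape of that failure, here is a minimal user-space sketch. It is not the xfs_bulkstat code: the names walk, inode_ok, NCHUNKS, PER_CHUNK and SPIN_LIMIT are all made up for illustration, and the "corrupt inode" is hard-wired. It only shows how leaving an inner loop early can skip the bookkeeping that lets the outer loop finish.

```c
#include <stdbool.h>

#define NCHUNKS    4     /* hypothetical number of inode chunks */
#define PER_CHUNK  8     /* hypothetical inodes per chunk */
#define SPIN_LIMIT 1000  /* guard so the demo itself terminates */

/* Hypothetical stand-in for the magic-number check: chunk 2, slot 3 is bad. */
static bool inode_ok(int chunk, int slot)
{
    return !(chunk == 2 && slot == 3);
}

/*
 * Walk all chunks. Returns the number of outer-loop passes, or -1 if the
 * walk was still spinning after SPIN_LIMIT passes.
 */
static int walk(bool fixed)
{
    int chunk = 0, passes = 0;
    bool done = false;

    while (!done) {
        if (++passes > SPIN_LIMIT)
            return -1;

        bool bad = false;
        for (int slot = 0; slot < PER_CHUNK; slot++) {
            if (!inode_ok(chunk, slot)) {
                bad = true;
                break;          /* leave the inner loop on a bad inode */
            }
        }

        /* Buggy variant: the early exit also skips the cursor update
         * below, so the same chunk is retried forever. */
        if (bad && !fixed)
            continue;

        /* Bookkeeping that lets the outer loop terminate. */
        chunk++;
        if (chunk == NCHUNKS)
            done = true;
    }
    return passes;
}
```

In this sketch walk(true) finishes in one pass per chunk, while walk(false) never gets past the bad chunk and only stops because of the SPIN_LIMIT guard. In the kernel there is no such guard, which matches the unkillable 100% CPU loop described above.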
** 2006-02-03
SGI say 'Ayup, I think you're right'-
http://marc.theaimsgroup.com/?t=113889680200006
** 2006-02-07
A bad inode magic number can cause the xfs_bulkstat syscall to get stuck
looping in the kernel.
To reproduce: (don't try this at home folks!) -
mkfs.xfs /dev/sda
mount filesystem and create 1000 or so files (I copied a handy 313-byte
file).
run this program:
---------
#include <sys/types.h>
#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>

char buffer[32768];

/* Overwrite every "IN" (inode magic) found past offset 2048 with "XX". */
void nuke(void)
{
    int i;
    for (i = 2048; i < 32768-1; i++)
        if (buffer[i] == 'I' && buffer[i+1] == 'N')
            buffer[i] = buffer[i+1] = 'X';
}

int main(int argc, char* argv[])
{
    int f = open("/dev/sda", O_RDWR);
    if (f < 0) { perror("open"); return 1; }
    /* Read 32768 bytes at byte offset 32768, corrupt them, write them back. */
    if (lseek(f, 32768, SEEK_SET) < 0) perror("lseek");
    if (read(f, buffer, 32768) != 32768) perror("read");
    nuke();
    if (lseek(f, 32768, SEEK_SET) < 0) perror("lseek");
    if (write(f, buffer, 32768) != 32768) perror("write");
    close(f);
    return 0;
}
---------
mount the disk and run xfs_fsr. It immediately gets stuck in a kernel loop.
** 2006-02-08
SGI have added a corresponding regression test to the xfs_cmds package
http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-cmds/xfstests/111?rev=1.1
--
Roger