
Re: xfstests, bad generic tests 009 and 308

To: xfs@xxxxxxxxxxx
Subject: Re: xfstests, bad generic tests 009 and 308
From: Angelo Dureghello <angelo.dureghello@xxxxxxxxxxx>
Date: Tue, 22 Sep 2015 14:41:06 +0200
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <20150921225244.GD19114@dastard>
References: <55FC3E0E.9060506@xxxxxxxxxxx> <20150918224412.GE26895@dastard> <55FFE665.7040004@xxxxxxxxxxx> <20150921225244.GD19114@dastard>
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Icedove/31.7.0
Hi Dave,

thanks again for following up.

>> I am actually using gcc-linaro-4.9-2015.05-x86_64_arm-linux-gnueabihf
> So gcc-4.9 patched with a bunch of stuff from linaro and built as a
> cross compiler from x86-64 to 32 bit arm? ISTR we had a bunch of
> different compiler problems at one point that only showed up in
> kernels built with an x86-64 to arm cross-compiler. In case you
> haven't guessed, XFS has a history of being bitten by ARM compiler
> problems. There have been a lot more problems in the past couple of
> years than the historical trend, though.
>
> As it is, I highly recommend that you try a current 4.3 kernel, as
> there are several code fixes in the XFS kernel code that work around
> compiler issues we know about. AFAIA, the do_div() asm bug that
> trips recent gcc optimisations isn't in the upstream kernel yet, but
> that can be worked around by setting CONFIG_CC_OPTIMIZE_FOR_SIZE=y
> in your build.
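For reference, the suggested workaround is a single kernel config option; it lives under "General setup" in menuconfig (as "Optimize for size"), or can be set directly in the kernel .config before rebuilding:

```
CONFIG_CC_OPTIMIZE_FOR_SIZE=y
```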

Well, I updated to this toolchain recently, but I also built the kernel
with an i686 4.9 toolchain and got exactly the same test errors.
Yes, I am always cross-compiling for armhf, btw.
I am using a 4.1.5-rt kernel from TI;
I will try a more recent version if possible and let you know.

>> I have a recent git version of xfstests, but generic/308 shows:
>>
>> #! /bin/bash
>> # FS QA Test No. 308
>> #
>> # Regression test for commit:
>> # f17722f ext4: Fix max file size and logical block counting of
>> # extent format file
> More than one filesystem had problems with maximum file sizes on 32
> bit systems. Compare the contents of the test; don't stop reading
> because the summary of the test makes you think the rest of the test
> is unrelated to the problem at hand.

>> I have a 16MB partition and am wondering why XFS allows test 308
>> to create a 16TB file.
>>
>> -rw------- 1 root root  17592186044415 Sep 18 09:40 testfile.308
> https://en.wikipedia.org/wiki/Sparse_file

>> QA output created by 009
>>         1. into a hole
>> 0: [0..39]: hole
>> daa100df6e6711906b61c9ab5aa16032
>>         2. into allocated space
>> cc58a7417c2d7763adc45b6fcd3fa024
>>         3. into unwritten space
>> daa100df6e6711906b61c9ab5aa16032
> I don't need to look any further to see that something is badly
> wrong here. This is telling me that no extents are being allocated
> at all, which indicates either fiemap is broken, awk/sed is
> broken or misbehaving (and hence mangling the output) or something
> deep in the filesystem code is fundamentally broken in some
> strange, silent way.
>
> Can you create an xfs filesystem on your scratch device, and
> manually run this command and post the output:
>
> # mkfs.xfs -V
> # mkfs.xfs <dev>
> # mount <dev> /mnt/xfs
> # xfs_io -f -c "pwrite 0 64k" -c sync \
>             -c "bmap -vp" -c "fiemap -v" \
>             -c "falloc 1024k 256k" -c sync \
>             -c "pwrite 1088k 64k" -c sync \
>             -c "bmap -vp" -c "fiemap -v" \
>             /mnt/xfs/testfile
>
> and attach the entire output?

I attached the output. I can be completely wrong, but the file system
seems quite reliable for rootfs operations so far. At least, I have
never had any issue after installing and removing a great many Debian
packages.
The only issues I have had come from test 308 which, if left running
too long, damages the fs.

> It would also be good if you can run this command under trace-cmd
> and capture all the XFS events that occur during the test. See

OK, about test 308: the two xfs_io operations pass; it stops on the rm
at the end of the test, while trying to erase the 16TB file.

# Create a sparse file with an extent lays at one block before old s_maxbytes
offset=$(((2**32 - 2) * $block_size))
$XFS_IO_PROG -f -c "pwrite $offset $block_size" -c fsync $testfile >$seqres.full 2>&1

rm can remove this file (17592186040320) correctly,

# Write to the block after the extent just created
offset=$(((2**32 - 1) * $block_size))
$XFS_IO_PROG -f -c "pwrite $offset $block_size" -c fsync $testfile >>$seqres.full 2>&1

while rm hangs removing this file (17592186044415).
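To spell out the numbers (assuming a 4k block size here; the test itself derives $block_size from the filesystem, so this is an assumption):

```shell
block_size=4096                        # assumed; generic/308 queries the fs for this
off1=$(( (2**32 - 2) * block_size ))   # first pwrite offset
off2=$(( (2**32 - 1) * block_size ))   # second pwrite offset
echo $off1                             # 17592186036224
echo $off2                             # 17592186040320 (= size of the first file)
echo $(( 2**44 - 1 ))                  # 17592186044415, one byte under 16TiB:
                                       # the size ls reports for the second file
```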

Magic SysRq l or t is not helping; nothing useful comes out.
But I collected the strace log, since the issue is at unlinkat().



Many thanks
Best regards
Angelo


--
Best regards,
Angelo Dureghello

Attachment: mkfs_output.txt
Description: Text document

Attachment: strace_rm_308.txt
Description: Text document
