
generic/04[89] fail on XFS due to change in writeback code

To: xfs@xxxxxxxxxxx
Subject: generic/04[89] fail on XFS due to change in writeback code
From: Eryu Guan <eguan@xxxxxxxxxx>
Date: Wed, 12 Aug 2015 18:12:04 +0800
Cc: tj@xxxxxxxxxx, jack@xxxxxxx, axboe@xxxxxx
Delivered-to: xfs@xxxxxxxxxxx
User-agent: Mutt/1.5.23 (2014-03-12)
Hi all,

I've been seeing generic/04[89] fail intermittently on XFS since 4.2-rc1,
but the failure doesn't reproduce on every test host. I finally found a
host that reproduces it reliably.

Based on my tests this is a regression since 4.1: the 4.1 kernel passes
the tests, and the failures first show up in 4.2-rc1.

What xfstests generic/04[89] test:

[root@dhcp-66-86-11 xfstests]# ./lsqa.pl tests/generic/04[89]
FSQA Test No. 048

Test for NULL files problem
test inode size is on disk after sync

--------------------------------------------------
FSQA Test No. 049

Test for NULL files problem
test inode size is on disk after sync - expose log replay bug

--------------------------------------------------
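
For readers unfamiliar with these tests, the property they check can be sketched roughly as below. This is only an illustration in Python, not the xfstests code: the function names, file count, and file size are made up, and the real tests do more than this sketch (for example, simulating a crash and remounting before the sizes are checked).

```python
import os
import tempfile


def write_files(directory, count, size):
    """Create `count` files of `size` bytes each and return their paths."""
    paths = []
    for i in range(count):
        path = os.path.join(directory, str(i))
        with open(path, "wb") as f:
            f.write(b"\0" * size)  # buffered write, no per-file fsync
        paths.append(path)
    return paths


def files_with_wrong_size(paths, expected_size):
    """Return the paths whose size differs from `expected_size`."""
    return [p for p in paths if os.stat(p).st_size != expected_size]


if __name__ == "__main__":
    # Hypothetical parameters; the real tests use their own counts and sizes.
    with tempfile.TemporaryDirectory() as d:
        paths = write_files(d, count=10, size=4096)
        os.sync()  # ask the kernel to write out dirty data and metadata
        for p in files_with_wrong_size(paths, 4096):
            print("file %s has incorrect size - sync failed" % p)
```

On a running system this check passes trivially, because sizes are read back from memory; it only becomes meaningful when the on-disk state is what gets inspected, which is the part the sketch leaves out.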

The failure looks like this (the test files have zero size):

[root@dhcp-66-86-11 xfstests]# ./check generic/048
FSTYP         -- xfs (non-debug)
PLATFORM      -- Linux/x86_64 dhcp-66-86-11 4.2.0-rc5
MKFS_OPTIONS  -- -f -bsize=4096 /dev/sda6
MOUNT_OPTIONS -- -o context=system_u:object_r:nfs_t:s0 /dev/sda6 /mnt/testarea/scratch

generic/048 28s ... - output mismatch (see /root/xfstests/results//generic/048.out.bad)
    --- tests/generic/048.out   2015-07-16 17:28:15.800000000 +0800
    +++ /root/xfstests/results//generic/048.out.bad     2015-08-12 18:04:52.923000000 +0800
    @@ -1 +1,32 @@
     QA output created by 048
    +file /mnt/testarea/scratch/969 has incorrect size - sync failed
    +file /mnt/testarea/scratch/970 has incorrect size - sync failed
    +file /mnt/testarea/scratch/971 has incorrect size - sync failed
    +file /mnt/testarea/scratch/972 has incorrect size - sync failed
    +file /mnt/testarea/scratch/973 has incorrect size - sync failed
    +file /mnt/testarea/scratch/974 has incorrect size - sync failed
    ...


I bisected the failure to the following commit:

commit e79729123f6392b36450113c6c52074b7d389c85
Author: Tejun Heo <tj@xxxxxxxxxx>
Date:   Fri May 22 17:13:48 2015 -0400

    writeback: don't issue wb_writeback_work if clean
    
    There are several places in fs/fs-writeback.c which queues
    wb_writeback_work without checking whether the target wb
    (bdi_writeback) has dirty inodes or not.  The only thing
    wb_writeback_work does is writing back the dirty inodes for the target
    wb and queueing a work item for a clean wb is essentially noop.  There
    are some side effects such as bandwidth stats being updated and
    triggering tracepoints but these don't affect the operation in any
    meaningful way.
    
    This patch makes all writeback_inodes_sb_nr() and sync_inodes_sb()
    skip wb_queue_work() if the target bdi is clean.  Also, it moves
    dirtiness check from wakeup_flusher_threads() to
    __wb_start_writeback() so that all its callers benefit from the check.
    
    While the overhead incurred by scheduling a noop work isn't currently
    significant, the overhead may be higher with cgroup writeback support
    as we may end up issuing noop work items to a lot of clean wb's.
    
    Signed-off-by: Tejun Heo <tj@xxxxxxxxxx>
    Cc: Jens Axboe <axboe@xxxxxxxxx>
    Cc: Jan Kara <jack@xxxxxxx>
    Signed-off-by: Jens Axboe <axboe@xxxxxx>


Attached are my xfstests config file and the host info requested by
http://xfs.org/index.php/XFS_FAQ#Q:_What_information_should_I_include_when_reporting_a_problem.3F

If you need more info please let me know.

Thanks,
Eryu

Attachment: hostinfo
Description: Text document

Attachment: local.config
Description: Text document
