Hi all,
I've been seeing generic/04[89] fail on XFS from time to time since
4.2-rc1, but the failure doesn't reproduce on every test host. Recently I
finally got a host that reproduces it reliably.
Based on my tests, this is a regression since the 4.1 kernel: 4.1 passes
the tests, and the failures first show up in 4.2-rc1.
Here is what the xfstests generic/04[89] tests check:
[root@dhcp-66-86-11 xfstests]# ./lsqa.pl tests/generic/04[89]
FSQA Test No. 048
Test for NULL files problem
test inode size is on disk after sync
--------------------------------------------------
FSQA Test No. 049
Test for NULL files problem
test inode size is on disk after sync - expose log replay bug
--------------------------------------------------
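For readers unfamiliar with these tests, the property being checked can be
sketched roughly as follows. This is a minimal illustration in Python, not
the xfstests code: the helper name check_sizes_after_sync, the file count,
and the file size are all assumptions. The sketch does not shut down or
remount the filesystem, so it is not expected to reproduce the failure.

```python
import os
import tempfile


def check_sizes_after_sync(directory, count=10, size=4096):
    """Write `count` files of `size` bytes, sync, and return the paths
    whose reported size does not match what was written.

    Hypothetical helper for illustration only; not part of xfstests.
    """
    paths = []
    for i in range(count):
        path = os.path.join(directory, str(i))
        with open(path, "wb") as f:
            f.write(b"a" * size)
        paths.append(path)

    os.sync()  # ask the kernel to flush dirty data and metadata

    return [p for p in paths if os.stat(p).st_size != size]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        for p in check_sizes_after_sync(d):
            print(f"file {p} has incorrect size - sync failed")
```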
The failure looks like this (the test files have zero size):
[root@dhcp-66-86-11 xfstests]# ./check generic/048
FSTYP -- xfs (non-debug)
PLATFORM -- Linux/x86_64 dhcp-66-86-11 4.2.0-rc5
MKFS_OPTIONS -- -f -bsize=4096 /dev/sda6
MOUNT_OPTIONS -- -o context=system_u:object_r:nfs_t:s0 /dev/sda6 /mnt/testarea/scratch
generic/048 28s ... - output mismatch (see /root/xfstests/results//generic/048.out.bad)
--- tests/generic/048.out 2015-07-16 17:28:15.800000000 +0800
+++ /root/xfstests/results//generic/048.out.bad 2015-08-12 18:04:52.923000000 +0800
@@ -1 +1,32 @@
QA output created by 048
+file /mnt/testarea/scratch/969 has incorrect size - sync failed
+file /mnt/testarea/scratch/970 has incorrect size - sync failed
+file /mnt/testarea/scratch/971 has incorrect size - sync failed
+file /mnt/testarea/scratch/972 has incorrect size - sync failed
+file /mnt/testarea/scratch/973 has incorrect size - sync failed
+file /mnt/testarea/scratch/974 has incorrect size - sync failed
...
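Since the .out.bad file can contain many such lines, a small helper like
the one below can list the affected files. It is only a sketch: the
function name failed_files is made up, and the pattern is taken from the
failure lines quoted above (with an optional leading "+" from the diff).

```python
import re

# Pattern for the "incorrect size" lines quoted in the failure output.
FAILED = re.compile(r"^\+?file (\S+) has incorrect size - sync failed$")


def failed_files(lines):
    """Return the paths named in 'has incorrect size' lines."""
    found = []
    for line in lines:
        m = FAILED.match(line.strip())
        if m:
            found.append(m.group(1))
    return found


if __name__ == "__main__":
    sample = [
        "QA output created by 048",
        "+file /mnt/testarea/scratch/969 has incorrect size - sync failed",
        "+file /mnt/testarea/scratch/970 has incorrect size - sync failed",
    ]
    print(failed_files(sample))
```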
I bisected the failure to the following commit:
commit e79729123f6392b36450113c6c52074b7d389c85
Author: Tejun Heo <tj@xxxxxxxxxx>
Date: Fri May 22 17:13:48 2015 -0400
writeback: don't issue wb_writeback_work if clean
There are several places in fs/fs-writeback.c which queues
wb_writeback_work without checking whether the target wb
(bdi_writeback) has dirty inodes or not. The only thing
wb_writeback_work does is writing back the dirty inodes for the target
wb and queueing a work item for a clean wb is essentially noop. There
are some side effects such as bandwidth stats being updated and
triggering tracepoints but these don't affect the operation in any
meaningful way.
This patch makes all writeback_inodes_sb_nr() and sync_inodes_sb()
skip wb_queue_work() if the target bdi is clean. Also, it moves
dirtiness check from wakeup_flusher_threads() to
__wb_start_writeback() so that all its callers benefit from the check.
While the overhead incurred by scheduling a noop work isn't currently
significant, the overhead may be higher with cgroup writeback support
as we may end up issuing noop work items to a lot of clean wb's.
Signed-off-by: Tejun Heo <tj@xxxxxxxxxx>
Cc: Jens Axboe <axboe@xxxxxxxxx>
Cc: Jan Kara <jack@xxxxxxx>
Signed-off-by: Jens Axboe <axboe@xxxxxx>
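To make the behaviour change described in the commit message concrete,
here is a toy model in Python. It is not kernel code: the Writeback class,
its dirty_inodes set, and the queue_work method are hypothetical stand-ins
for bdi_writeback, its dirty inode lists, and wb_queue_work(). It only
shows the "skip queueing when clean" pattern the commit message describes
and makes no claim about why the tests fail.

```python
class Writeback:
    """Hypothetical stand-in for a bdi_writeback (wb)."""

    def __init__(self):
        self.dirty_inodes = set()
        self.queued = []  # work items that were actually queued

    def has_dirty_io(self):
        return bool(self.dirty_inodes)

    def queue_work(self, work):
        self.queued.append(work)


def sync_inodes(wb, skip_if_clean):
    """Queue a sync work item; with skip_if_clean set, return early
    when the wb has nothing dirty (the pattern the commit describes)."""
    if skip_if_clean and not wb.has_dirty_io():
        return False  # nothing queued
    wb.queue_work("sync")
    return True


if __name__ == "__main__":
    wb = Writeback()
    print(sync_inodes(wb, skip_if_clean=False), wb.queued)
    print(sync_inodes(wb, skip_if_clean=True), wb.queued)
```

With skip_if_clean set, whether any work is queued depends entirely on
what the dirtiness check considers dirty.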
Attached are my xfstests config file and the host info requested by
http://xfs.org/index.php/XFS_FAQ#Q:_What_information_should_I_include_when_reporting_a_problem.3F
If you need more info please let me know.
Thanks,
Eryu
Attachments: hostinfo, local.config