
Re: [PATCH 10/10] xfstests: add disk failure simulation test

To: Rich Johnston <rjohnston@xxxxxxx>
Subject: Re: [PATCH 10/10] xfstests: add disk failure simulation test
From: Dmitry Monakhov <dmonakhov@xxxxxxxxxx>
Date: Sat, 02 Mar 2013 05:49:07 +0400
Cc: xfs@xxxxxxxxxxx, linux-fsdevel@xxxxxxxxxxxxxxx, linux-ext4@xxxxxxxxxxxxxxx, dchinner@xxxxxxxxxx
In-reply-to: <51310B63.4070105@xxxxxxx>
References: <1361356935-29153-1-git-send-email-dmonakhov@xxxxxxxxxx> <1361356935-29153-11-git-send-email-dmonakhov@xxxxxxxxxx> <51310B63.4070105@xxxxxxx>
Sender: Dmitry Monakhov <rjevskiy@xxxxxxxxx>
User-agent: Notmuch/0.6.1 (http://notmuchmail.org) Emacs/23.3.1 (x86_64-redhat-linux-gnu)
On Fri, 1 Mar 2013 14:11:15 -0600, Rich Johnston <rjohnston@xxxxxxx> wrote:
> On 02/20/2013 04:42 AM, Dmitry Monakhov wrote:
> > There are many situations where disk may fail for example
> > 1) brutal usb dongle unplug
> > 2) iscsi (or any other netbdev) failure due to network issues
> > In this situation filesystem which use this blockdevice is
> > expected to fail(force RO remount, abort, etc) but whole system
> > should still be operational. In other words:
> > 1) Kernel should not panic
> > 2) Memory should not leak
> > 3) Data integrity operations (sync,fsync,fdatasync, directio) should fail
> >     for affected filesystem
> > 4) It should be possible to umount broken filesystem
> >
> > Later when disk becomes available again we expect(only for journaled 
> > filesystems):
> > 5) It will be possible to mount filesystem w/o explicit fsck (in order to 
> > caught
> 
> typo                                     s/caught/catch/g
> 
> >     issues like https://patchwork.kernel.org/patch/1983981/)
> > 6) Filesystem should be operational
> > 7) After mount/umount has being done all errors should be fixed so fsck 
> > should
> >     not spot any issues.
> >
> > This test use fault enjection (CONFIG_FAIL_MAKE_REQUEST=y config option )
>    May want to mention all the kernel config options required.
> i.e. CONFIG_FAULT_INJECTION=y ... are there others?
> CONFIG_FAULT_INJECTION_DEBUG_FS=y ???
Yes, all three options are required: CONFIG_FAULT_INJECTION=y,
CONFIG_FAULT_INJECTION_DEBUG_FS=y and CONFIG_FAIL_MAKE_REQUEST=y.
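
For reference, a minimal sketch of how the presence of these three options could be checked before running the test. It is not part of the patch; the function name and the idea of reading the config as text are assumptions for illustration. Where the kernel config lives (for example /boot/config-<release> or /proc/config.gz) depends on the distribution, so the sketch only parses text that the caller has already read.

```python
# Hypothetical helper, not taken from the patch: report which of the
# fault-injection options named in this thread are not built in (=y).
REQUIRED_OPTIONS = (
    "CONFIG_FAULT_INJECTION",
    "CONFIG_FAULT_INJECTION_DEBUG_FS",
    "CONFIG_FAIL_MAKE_REQUEST",
)


def missing_options(config_text, required=REQUIRED_OPTIONS):
    """Return the required options that are not set to 'y' in config_text."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Lines such as "# CONFIG_FOO is not set" are comments and are skipped.
        if line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if value == "y":
            enabled.add(name)
    return [opt for opt in required if opt not in enabled]
```

A test could then skip itself (rather than fail) when missing_options() returns a non-empty list.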
> 
> > which force all new IO requests to fail for a given device. Xfs already has
>    to force
> 
> > XFS_IOC_GOINGDOWN ioctl which provides similar behaviour, but it is fs 
> > speciffic
> 
> typos s/behaviour/behavior/g  s/speciffic/specific
> > and it does it in an easy way because it perform freeze_bdev() before actual
> > shotdown.
> typo s/shotdown/shutdown/g
Agree with your diagnosis. My grammar is bad and I forgot to run a spell check
before submission. Should I resend this one, or will you fix it manually
at commit time?
> 
> >
> > Test run fsstress in background and then force disk failure.
> > Once disk failed it check that (1)-(4) is true.
>    Once the disk fails, check that (1)-(4) are true.
> 
> > Then makes disk available again and check that (5)-(7) is also true
>         make the disk ...                                 are
> >
> > BE CAREFUL!! test known to cause memory corruption for XFS
> > see: https://gist.github.com/dmonakhov/4953045
> >
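
The failure step in the description above (forcing all new IO requests for a given device to fail through fail_make_request) could be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the patch: the function names are hypothetical, the device is assumed to be a whole disk such as "sdb", and debugfs is assumed to be mounted at /sys/kernel/debug. The knob names (probability, times, make-it-fail) are the ones described in the kernel's fault-injection documentation.

```python
# Hypothetical sketch: build the sysfs/debugfs writes that switch the
# fail_make_request fault injection on or off for one block device.
DEBUGFS_DIR = "/sys/kernel/debug/fail_make_request"  # assumes default debugfs mount


def fail_device_writes(disk, fail=True, debugfs_dir=DEBUGFS_DIR):
    """Return (path, value) pairs to write, in order, for a whole disk."""
    flag = "/sys/block/%s/make-it-fail" % disk
    if not fail:
        # Clearing the per-device flag makes the disk "available again".
        return [(flag, "0")]
    return [
        (debugfs_dir + "/probability", "100"),  # fail every request
        (debugfs_dir + "/times", "-1"),         # -1 means no limit on failures
        (flag, "1"),                            # arm injection for this device
    ]


def apply_writes(writes):
    """Perform the writes; needs root and a kernel with the options enabled."""
    for path, value in writes:
        with open(path, "w") as f:
            f.write(value)
```

For example, apply_writes(fail_device_writes("sdb")) would simulate the failure and apply_writes(fail_device_writes("sdb", fail=False)) would restore the device. Keeping the list of writes separate from the code that performs them makes the sequence easy to inspect without touching a real device.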
