Re: Problem with file system on iSCSI FileIO

To: Christoph Hellwig <hch@xxxxxxxxxxxxx>
Subject: Re: Problem with file system on iSCSI FileIO
From: Richard Sharpe <realrichardsharpe@xxxxxxxxx>
Date: Sat, 25 Sep 2010 09:54:46 -0700
Cc: Slawomir Nowakowski <slawomir.nowakowski@xxxxxxxxxx>, xfs@xxxxxxxxxxx
In-reply-to: <20100925155611.GA21928@xxxxxxxxxxxxx>
References: <4C9B5786.4010205@xxxxxxxxxx> <20100923143221.GA1989@xxxxxxxxxxxxx> <4C9B6B27.5050606@xxxxxxxxxx> <20100924075505.GA24664@xxxxxxxxxxxxx> <4C9C875D.9050308@xxxxxxxxxx> <20100925155611.GA21928@xxxxxxxxxxxxx>
On Sat, Sep 25, 2010 at 8:56 AM, Christoph Hellwig <hch@xxxxxxxxxxxxx> wrote:
>> So once again. We have created a RAID unit level 6. On the top of
>> the unit there is an LVM architecture, I mean a volume group that
>> contains logical volumes. The logical volume is formatted with XFS
>> and it contains one big file that takes almost all of the space on
>> the LV. There is some free space left in order to be able expand the
>> LV and FS in the future. The LV is mounted and the file is served as
>> iSCSI target. The iSCSI Initiator (MS Initiator from Windows 2k3)
>> connects to iSCSI target. The iSCSI disk is formatted with the NTFS.
> ok, so we have:
> Linux Server
> +----------------------+
> |   hardware raid 6    |
> +----------------------+
> | lvm2 - linear volume |
> +----------------------+
> |          XFS         |
> +----------------------+
> |    iSCSI target      |
> +----------------------+
> Windows client:
> +----------------------+
> |    iSCSI initiator   |
> +----------------------+
> |        NTFS          |
> +----------------------+
>> But we believe the problem is with the XFS. With unknown reason we
>> are not able to mount the LV and after running xfs_repair the file
>> is missing from the LV. Do you have any ideas how we can try to fix
>> the broken XFS?
> This does not sound like a plain XFS issue to me, but an interaction
> between components going completely wrong.  Normal I/O to a file
> should never corrupt the filesystem around it to the point where
> it's unusable, and so far I never heard reports about that.  The hint
> that this doesn't happen with another purely userspace target is
> interesting.  I wonder if SCST that you use does any sort of in-kernel
> block I/O after using bmap or similar?  I've not seen that for iscsi
> targets yet but for other kernel modules, and that kind of I/O
> can cause massive corruption on a filesystem with delayed allocation
> and unwritten extents.
> Can any of the SCST experts on the list here track down how I/O for this
> configuration will be issued?
> What does happen if you try the same setup with say jfs or ext4 instead
> of xfs?

I saw references to vdisk fileio in there and wondered why this was
being done rather than simply exporting the hardware RAID 6 device
directly. I.e., why are all those other layers in there?

fileio uses submit_bio to submit the data, and it defaults to
WRITE_THROUGH, NV_CACHE and DIRECT_IO (at least in the trunk, though I
suspect this has been the case for a long while). However, the person
making the complaint might have switched off WRITE_THROUGH in
pursuit of performance, in which case a crash could corrupt things
badly. Whether it does would depend on whether clearing WRITE_THROUGH
also clears NV_CACHE, and on what the code that assembles the caching
mode page does. (I have only had a cursory glance at the vdisk code.)
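
To make the write-through concern concrete, here is a minimal sketch of
the general idea on a file-backed "disk". It is not SCST code: the
helper names write_block and flush are hypothetical, and it assumes a
POSIX system where os.pwrite and O_DSYNC (or O_SYNC) are available. With
write-through, a write returns only after the OS reports the data as
stable; without it, the data may sit in the page cache until a later
flush, and a crash in that window can lose it.

```python
import os


def write_block(path, offset, data, write_through):
    """Write `data` at `offset` into a file-backed disk image.

    Hypothetical helper for illustration only, not part of SCST.
    """
    flags = os.O_WRONLY | os.O_CREAT
    if write_through:
        # O_DSYNC: each write completes only once the data is reported
        # stable. Fall back to O_SYNC where O_DSYNC is not defined.
        flags |= getattr(os, "O_DSYNC", os.O_SYNC)
    fd = os.open(path, flags, 0o600)
    try:
        return os.pwrite(fd, data, offset)
    finally:
        os.close(fd)


def flush(path):
    """Explicitly push cached writes for `path` to stable storage."""
    fd = os.open(path, os.O_WRONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)
```

In the write-back case (write_through=False), nothing is guaranteed to
be on disk until flush() is called, which is roughly the exposure that
switching off a write-through setting would create.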

What is needed here are the parameters used to configure the vdisk
and the version of SCST in use.

Richard Sharpe
