Problem with file system on iSCSI FileIO
Richard Sharpe
realrichardsharpe at gmail.com
Sat Sep 25 11:54:46 CDT 2010
On Sat, Sep 25, 2010 at 8:56 AM, Christoph Hellwig <hch at infradead.org> wrote:
>> So once again: we have created a RAID level 6 unit. On top of the
>> unit there is an LVM layer, i.e. a volume group that contains
>> logical volumes. The logical volume is formatted with XFS and it
>> contains one big file that takes up almost all of the space on the
>> LV. Some free space is left in order to be able to expand the LV
>> and the FS in the future. The LV is mounted and the file is served
>> as an iSCSI target. The iSCSI initiator (the MS initiator from
>> Windows 2k3) connects to the iSCSI target. The iSCSI disk is
>> formatted with NTFS.
>
> ok, so we have:
>
> Linux Server
>
> +----------------------+
> | hardware raid 6      |
> +----------------------+
> | lvm2 - linear volume |
> +----------------------+
> | XFS                  |
> +----------------------+
> | iSCSI target         |
> +----------------------+
>
> Windows client:
>
>
> +----------------------+
> | iSCSI initiator      |
> +----------------------+
> | NTFS                 |
> +----------------------+
>
>> But we believe the problem is with the XFS. For an unknown reason
>> we are not able to mount the LV, and after running xfs_repair the
>> file is missing from the LV. Do you have any ideas on how we can
>> try to fix the broken XFS?
>
> This does not sound like a plain XFS issue to me, but like an
> interaction between components going completely wrong. Normal I/O
> to a file should never corrupt the filesystem around it to the
> point where it's unusable, and so far I have never heard reports of
> that. The hint that this doesn't happen with another, purely
> userspace target is interesting. I wonder if the SCST that you use
> does any sort of in-kernel block I/O after using bmap or similar?
> I've not seen that in iSCSI targets yet, but I have in other kernel
> modules, and that kind of I/O can cause massive corruption on a
> filesystem with delayed allocation and unwritten extents.
>
> Can any of the SCST experts on the list here track down how I/O for this
> configuration will be issued?
>
> What happens if you try the same setup with, say, jfs or ext4
> instead of xfs?
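
Before getting to my own question: the in-kernel pattern Christoph is
describing looks roughly like the sketch below. This is a simplified
illustration, not actual SCST code (the function and its arguments are
made up for the example, and the kernel APIs are roughly as of
2.6.3x); it maps a file block to a disk block with bmap() and then
drives the block device directly, behind the filesystem's back:

    #include <linux/fs.h>
    #include <linux/bio.h>
    #include <linux/mm.h>

    /*
     * Simplified sketch of the "bmap then raw block I/O" anti-pattern,
     * NOT actual SCST code.  Error handling and I/O completion are
     * elided.
     */
    static void risky_write_page(struct inode *inode, struct page *page,
                                 sector_t file_block)
    {
        struct bio *bio;
        sector_t disk_block;

        /*
         * Ask the filesystem where this file block lives on disk.
         * For a delayed-allocation extent there is no disk block
         * yet, so bmap() returns 0 -- and a write aimed at block 0
         * lands on filesystem metadata.
         */
        disk_block = bmap(inode, file_block);

        bio = bio_alloc(GFP_NOIO, 1);
        bio->bi_bdev = inode->i_sb->s_bdev;
        bio->bi_sector = disk_block << (inode->i_blkbits - 9);
        bio_add_page(bio, page, PAGE_SIZE, 0);

        /*
         * The write goes straight to the device: the filesystem
         * never sees it, so unwritten extents are never converted
         * and delalloc blocks are never allocated.
         */
        submit_bio(WRITE, bio);
    }

That is exactly the kind of I/O that corrupts a filesystem using
delayed allocation and unwritten extents, as Christoph says.
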
I saw references to vdisk fileio in there and wondered why this was
being done rather than simply exporting the hardware RAID 6 device.
That is, why are all those other layers in there?
fileio uses submit_bio to submit the data, and it defaults to
WRITE_THROUGH, NV_CACHE and DIRECT_IO (at least in the trunk, but I
suspect this has been the case for a long while). However, the person
making the complaint might have switched off WRITE_THROUGH in pursuit
of performance, in which case a crash could corrupt things badly. How
badly would depend on whether or not clearing WRITE_THROUGH also
clears NV_CACHE, and on what the code assembling the caching mode
page does (and I have only had a cursory glance at the vdisk code).
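
For reference, the mode-page assembly I mean boils down to something
like the sketch below. The field offsets are from the SBC caching
mode page (08h); the write_through/nv_cache parameters are
placeholders for whatever vdisk actually keeps, not SCST's real
variables:

    #include <stdint.h>
    #include <string.h>

    /*
     * Simplified sketch of building the SCSI caching mode page (08h)
     * from a target's cache settings.  Offsets are per SBC; this is
     * not SCST's actual code.
     */
    static int build_caching_mode_page(uint8_t *buf, int write_through,
                                       int nv_cache)
    {
        memset(buf, 0, 20);
        buf[0] = 0x08;      /* page code: caching mode page */
        buf[1] = 0x12;      /* page length: 18 bytes follow */

        /*
         * WCE (byte 2, bit 2) tells the initiator that writes may
         * be cached; it should be set only when write-through is
         * off.
         */
        if (!write_through)
            buf[2] |= 1 << 2;

        /*
         * NV_DIS (byte 12, bit 0) set means the device has NO
         * non-volatile cache.  A target claiming NV_CACHE leaves it
         * clear -- which a page-cache-backed fileio device with
         * write-through off cannot honestly do.
         */
        if (!nv_cache)
            buf[12] |= 1 << 0;

        return 20;          /* total page size in bytes */
    }

The interaction to check is right there: if clearing WRITE_THROUGH
sets WCE but leaves the non-volatile-cache claim in place, the
initiator has no reason to flush, and a crash loses whatever was
dirty in the server's page cache.
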
What we need here are the parameters used to configure the vdisk and
the version of SCST in use.
--
Regards,
Richard Sharpe