Re: separate log and structure from user data device?

To: Jan Wagner <jwagner@xxxxxxxxxxx>
Subject: Re: separate log and structure from user data device?
From: Keith Owens <kaos@xxxxxxx>
Date: Tue, 06 Jun 2006 00:19:08 +1000
Cc: Nathan Scott <nathans@xxxxxxx>, xfs@xxxxxxxxxxx
In-reply-to: Your message of "Mon, 05 Jun 2006 14:22:35 +0300." <Pine.LNX.4.58.0606051402410.18047@kurp.hut.fi>
Sender: xfs-bounce@xxxxxxxxxxx
Jan Wagner (on Mon, 5 Jun 2006 14:22:35 +0300 (EEST)) wrote:
>
>On Mon, 5 Jun 2006, Nathan Scott wrote:
>> On Mon, Jun 05, 2006 at 09:49:10AM +0300, Jan Wagner wrote:
>> > I saw with the logdev parameter it is possible to specify an external log
>> > device on a separate disk and partition.
>> >
>> > What would be even more interesting for my special purposes is whether
>> > even the file system structure (inodes etc) could be placed on a
>> > different disk?
>>
>> The realtime subvolume will indeed give you this split.  See xfs(5)
>> and mkfs.xfs(8) where most doco resides on this.
>
>Thanks for clarifying. That had been a bit unclear to me from the docs.
>
>Also realized only now that xfsctl has to be used to set an empty file's
>realtime bit to really use the rt subvolume, not quite as straightforward
>as I thought. Will have to "correct" my progs a bit.
>
>> > Rationale being, when one wants to build a data recorder like a
>> > Linux personal video recorder that is using A/V harddisks (ATA Streaming
>> > Feature Set, or SmoothStream), one could use the A/V disk with A/V
>> > streaming enabled to unreliably(!) write or read all the user data, and a
>> > second disk to reliably store the actual log and file system structure on.
>> >
>> > Yes, there is already the realtime subvolume in XFS, but can it tolerate
>> > unreliable A/V read/write? (i.e., where drive has been told to disable
>> > and skip all read error correction and write verification?)
>>
>> Not 100% sure what unreliable means here from a software POV... would
>> we be seeing errors at the filesystem layer on IOs to/from the driver?
>
>With A/V feature enabled there are no data I/O errors from the disk, thus
>very likely no errors from the driver either.
>
>The only thing is that the data read from or written to that disk has no
>guarantees to be correct (thus keeping all filesystem structure related
>stuff on an entirely different disk is kind of essential :-))
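On the realtime bit mentioned in the quoted text above: a minimal sketch of what setting it on an empty file might look like, assuming Linux, where xfsctl(3) is a thin wrapper around ioctl(2).  The struct layout and ioctl numbers are taken from <linux/fs.h> and assume the generic _IOC encoding (x86, ARM); the names with_realtime and mark_realtime are hypothetical helpers, not part of any XFS tool.  The set call is only expected to succeed on an XFS filesystem that has a realtime subvolume.

```python
import fcntl
import struct

# struct fsxattr: xflags, extsize, nextents, projid, cowextsize (all 32-bit)
# followed by 8 bytes of padding, 28 bytes in total.
FSXATTR = struct.Struct("=IIIII8s")


def _ioc(direction, nr):
    # Assumption: generic Linux _IOC layout; some architectures differ.
    return (direction << 30) | (FSXATTR.size << 16) | (ord("X") << 8) | nr


FS_IOC_FSGETXATTR = _ioc(2, 31)  # _IOR('X', 31, struct fsxattr)
FS_IOC_FSSETXATTR = _ioc(1, 32)  # _IOW('X', 32, struct fsxattr)
XFLAG_REALTIME = 0x00000001      # data lives in the realtime subvolume


def with_realtime(raw):
    """Return a copy of a packed fsxattr with the realtime flag set."""
    fields = list(FSXATTR.unpack(raw))
    fields[0] |= XFLAG_REALTIME
    return FSXATTR.pack(*fields)


def mark_realtime(path):
    """Set the realtime flag on a file that has no data written yet."""
    with open(path, "r+b") as f:
        buf = bytearray(FSXATTR.size)
        fcntl.ioctl(f, FS_IOC_FSGETXATTR, buf)  # fills buf in place
        fcntl.ioctl(f, FS_IOC_FSSETXATTR, with_realtime(bytes(buf)))
```

The get-modify-set sequence keeps whatever other flags the file already has; the call has to happen before any data is written to the file.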

The ATA Streaming Feature Set defines the Handle Stream Error (HSE) bit
to mark data which is critical, and therefore needs full error
recovery.  That leaves all other data to be handled as best effort,
returning no data instead of taking too long.  Why not use HSE to mark
the filesystem metadata and journals?  Then you do not need to separate
metadata from normal data at the disk level.

Of course that requires a change to the VFS layer to pass a flag saying
"this data is critical", plus support in the I/O path for setting HSE.
Until the kernel is changed, your only option is to use a filesystem
that lets you manually separate the two classes of data.  Support for
the Streaming Feature Set is reported in the cfsse field (word 84 of
the IDENTIFY DEVICE data) and nothing in the kernel uses cfsse in any
significant way, so it will probably be a while before Linux supports
AV mode.
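
To make the cfsse remark concrete, here is a rough sketch of checking whether a drive advertises the Streaming Feature Set.  It assumes a raw 512-byte IDENTIFY DEVICE block has already been obtained by some other means, that cfsse is word 84 of that block, and that bit 4 is the streaming bit, as I read ATA/ATAPI-7; verify against the spec before relying on it.  The function name supports_streaming is hypothetical.

```python
import struct

CFSSE_WORD = 84      # assumed position of cfsse in the IDENTIFY data
STREAMING_BIT = 4    # assumed "Streaming feature set supported" bit


def supports_streaming(identify):
    """Report whether raw IDENTIFY DEVICE data advertises streaming."""
    if len(identify) != 512:
        raise ValueError("IDENTIFY DEVICE data must be 512 bytes")
    # The block is 256 little-endian 16-bit words.
    (word,) = struct.unpack_from("<H", identify, CFSSE_WORD * 2)
    # The word is only valid when bit 14 is one and bit 15 is zero.
    if word & 0xC000 != 0x4000:
        return False
    return bool(word & (1 << STREAMING_BIT))
```

This only reads the capability bit; it does not enable streaming or issue any READ STREAM or WRITE STREAM command.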

