file preallocation without unwritten flag being set
p v
pvlogin at yahoo.com
Wed May 13 16:05:16 CDT 2009
It doesn't seem to work. I tried to clear the extflg in the versionnum of the superblock (and in every copy of it as well), but the flag is still set on all extents.
xfs_db> version
versionnum [0xb4a4+0x8] = V4,NLINK,ALIGN,DIRV2,LOGV2,EXTFLG,MOREBITS,ATTR2
xfs_db> version 0xa4a4 0x8
versionnum [0xa4a4+0x8] = V4,NLINK,ALIGN,DIRV2,LOGV2,MOREBITS,ATTR2
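# Read the number of allocation groups from the primary superblock
typeset -i agcount=$(xfs_db -c "sb 0" -c "print agcount" /dev/sda | awk '{print $3}')
typeset -i i=0
# Rewrite versionnum (without EXTFLG) in every superblock copy
while (( i < agcount ))
do
    xfs_db -x -c "sb $i" -c "write versionnum 0xa4a4" /dev/sda
    i=i+1
done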
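To confirm what the loop above left in each superblock, the EXTFLG bit can be tested directly on a versionnum value. This is only a sketch for bash or ksh: the helper name extflg_set is made up for illustration, and the commented usage line assumes that "print versionnum" prints a line of the form "versionnum = 0x...".

```shell
# Succeeds if the EXTFLG bit (0x1000) is set in the given versionnum value.
# Hypothetical helper; it only does arithmetic on its argument.
extflg_set() {
    [ $(( $1 & 0x1000 )) -ne 0 ]
}

# Values taken from the xfs_db session above
extflg_set 0xb4a4 && echo "0xb4a4: EXTFLG set"
extflg_set 0xa4a4 || echo "0xa4a4: EXTFLG clear"

# Possible use per allocation group (assumed output format, not run here):
#   v=$(xfs_db -c "sb $i" -c "print versionnum" /dev/sda | awk '{print $3}')
#   extflg_set "$v" && echo "sb $i still has EXTFLG"
```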
And once I create the file, xfs_repair complains and resets the sb flag. My guess is that the extent allocation path is hardcoded for version 4: any extent allocated beyond the file size will get the flag ...
Also - 2 questions -
1) What is inode64, and where can I find all of the undocumented mkfs/mount options? (It's unfortunate that such a good fs doesn't have correspondingly good documentation.)
2) Why is the largest extent size limited to xxx blocks? (I can't find out the number. When does the inode finally get flushed? ls -i reports 19 as the inode number, but even after unmounting, inode 19 in xfs_db still shows as a free inode. Is it still only in the log?) I assumed that xfs_bmap gives me the correct number of extents, but looking at the inode with xfs_db, it's obvious that xfs_bmap reports contiguous ranges rather than the actual extents in the blockmap tree.
thx
Peter Vajgel
----- Original Message ----
From: Eric Sandeen <sandeen at sandeen.net>
To: p v <pvlogin at yahoo.com>
Cc: xfs at oss.sgi.com
Sent: Tuesday, May 12, 2009 10:08:48 PM
Subject: Re: file preallocation without unwritten flag being set
p v wrote:
>
>
> I want to avoid any metadata modifications while doing O_DIRECT reads
> (the fs is mounted with noatime). Right now I am doing it mostly for
> testing - I am seeing a performance degradation going from raw to xfs
> on a 10TB filesystem - probably due to my application but I am trying
> to narrow it down so I am starting with running randomio benchmark on
> raw - then 10TB file, then 10 1TB files, then 100 100GB files, ...
you may want to try the inode64 mount option so the allocator is free to
roam your whole 10T ...
> But in general certain applications can definitely take care of the
> preallocated space (db, FB haystack, ...).
Ok, so it sounds like you do understand the implications and you want to
be able to write into prealloc space without any metadata updates as
they are converted to initialized extents... :)
> What they require is
> minimal fragmentation so they would prefer to preallocate the space
> (fill the whole fs with contiguous files) and then maintain in-files
> app specific metadata (such as valid offsets of initialized data,
> ...). What I would really like is to have vxfs equivalent of setext
> options -
>
> setext -r <reservation> -f chgsize
>
> And on top of that, what I would really love to have is the vxfs equivalent of
> "nomtime" mount option. Then with O_DIRECT I have raw-like
> performance.
>
> With the unwritten mkfs option I could get the setext semantics. So
> what's the trick (before I dive into the xfs layout)? I am guessing
> that there is no equivalent for nomtime option?
well, the unwritten=0 option did get removed:
http://git.kernel.org/?p=fs/xfs/xfsprogs-dev.git;a=commitdiff;h=8d537733f52a642d471f6781f32f306241dd4308
TBH I'm not entirely sure why.
The unwritten flag is per-filesystem not per-file; you can still clear
that feature bit:
#define XFS_SB_VERSION_EXTFLGBIT 0x1000
by using xfs_db in -x expert mode to rewrite every superblock's
"versionnum" without that bit set.
The xfs_db "version" command will give you a more textual representation
of what is actually set before & after.
You could script the sb rewrites...
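A minimal sketch of the bit arithmetic behind such a script, assuming bash or ksh. The helper name clear_extflg is hypothetical; it only computes the new value to pass to the xfs_db "write versionnum" command and does not touch any device.

```shell
# XFS_SB_VERSION_EXTFLGBIT, as quoted above
EXTFLGBIT=0x1000

# Print the given versionnum with the EXTFLG bit cleared (hypothetical helper).
clear_extflg() {
    printf '0x%x\n' $(( $1 & ~$EXTFLGBIT ))
}

clear_extflg 0xb4a4    # prints 0xa4a4
```

The result is the value that would then be written to every superblock copy in expert mode.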
For what it's worth, your xfs_db tricks below to preallocate seem a bit
... tricky.
This should suffice:
xfs_io -f /hay/foo
xfs_io> resvsp 0 1024g
xfs_io> truncate 1024g
xfs_io> quit
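The same two commands can also be given non-interactively with the xfs_io -c option, which is handy when preallocating many files from a script. The sketch below wraps that in a function; the name prealloc and the XFS_IO override variable are assumptions made here for illustration, not part of xfs_io.

```shell
# Reserve space for a file and then set its size, in one xfs_io invocation.
# $1 = file path, $2 = size (for example 1024g).
# XFS_IO is a hypothetical override that lets the command be swapped out
# (for example with "echo") to preview what would be run.
prealloc() {
    "${XFS_IO:-xfs_io}" -f -c "resvsp 0 $2" -c "truncate $2" "$1"
}

# Example (not run here; needs an XFS filesystem mounted at /hay):
#   prealloc /hay/foo 1024g
```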
Oh and you're right, there's no "nomtime" option AFAIK.
-Eric
> Thanks
>
> Peter Vajgel