Insane file system overhead on large volume
Martin Steigerwald
Martin at lichtvoll.de
Sat Jan 28 08:55:21 CST 2012
On Friday, 27 January 2012, Eric Sandeen wrote:
> On 1/27/12 1:50 AM, Manny wrote:
> > Hi there,
> >
> > I'm not sure if this is intended behavior, but I was a bit stumped
> > when I formatted a 30TB volume (12x3TB minus 2x3TB for parity in RAID
> > 6) with XFS and noticed that there were only 22 TB left. I just
> > called mkfs.xfs with default parameters - except for swidth and sunit
> > which match the RAID setup.
> >
> > Is it normal that I lost 8TB just for the file system? That's almost
> > 30% of the volume. Should I set the block size higher? Or should I
> > increase the number of allocation groups? Would that make a
> > difference? What's the preferred method for handling such large
> > volumes?
>
> If it was 12x3TB I imagine you're confusing TB with TiB, so
> perhaps your 30T is really only 27TiB to start with.
>
> Anyway, fs metadata should not eat much space:
>
> # mkfs.xfs -dfile,name=fsfile,size=30t
> # ls -lh fsfile
> -rw-r--r-- 1 root root 30T Jan 27 12:18 fsfile
> # mount -o loop fsfile mnt/
> # df -h mnt
> Filesystem Size Used Avail Use% Mounted on
> /tmp/fsfile 30T 5.0M 30T 1% /tmp/mnt
>
> So Christoph's question was a good one; where are you getting
> your sizes?
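Side note on the TB versus TiB arithmetic Eric mentions: 10 data disks
of 3 TB give 30 * 10^12 bytes, which is only about 27.3 TiB, since one
TiB is 2^40 bytes:

merkaba:/tmp> echo 'scale=2; 30 * 10^12 / 2^40' | bc
27.28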
An academic question:
Why is it that I get
merkaba:/tmp> mkfs.xfs -dfile,name=fsfile,size=30t
meta-data=fsfile                 isize=256    agcount=30, agsize=268435455 blks
         =                       sectsz=512   attr=2, projid32bit=0
data     =                       bsize=4096   blocks=8053063650, imaxpct=5
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal log           bsize=4096   blocks=521728, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
merkaba:/tmp> mount -o loop fsfile /mnt/zeit
merkaba:/tmp> LANG=C df -hT /mnt/zeit
Filesystem     Type  Size  Used Avail Use% Mounted on
/dev/loop0     xfs    30T   33M   30T   1% /mnt/zeit
33 MiB used on the first mount instead of the 5 MiB you saw?
merkaba:/tmp> cat /proc/version
Linux version 3.2.0-1-amd64 (Debian 3.2.1-2) ([…]) (gcc version 4.6.2
(Debian 4.6.2-12) ) #1 SMP Tue Jan 24 05:01:45 UTC 2012
merkaba:/tmp> mkfs.xfs -V
mkfs.xfs Version 3.1.7
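One could probably check with xfs_db where those blocks went - just a
sketch, I have not looked at the counters in detail:

merkaba:/tmp> xfs_db -r -c 'sb 0' -c 'print fdblocks' /dev/loop0
merkaba:/tmp> xfs_db -r -c 'freesp -s' /dev/loop0

The superblock's fdblocks counter should show the free data blocks, and
freesp -s prints a free space summary.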
Maybe it's because I am using a tmpfs for /tmp:
merkaba:/tmp> LANG=C df -hT .
Filesystem Type Size Used Avail Use% Mounted on
tmpfs tmpfs 2.0G 2.0G 6.6M 100% /tmp
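If I understand sparse files correctly, the 30T image only fits on a
2 GiB tmpfs because almost none of it is actually allocated - du should
show the difference (a sketch, I did not measure these numbers):

merkaba:/tmp> du -h --apparent-size fsfile   # nominal file size: 30T
merkaba:/tmp> du -h fsfile                   # blocks actually written - far less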
Hmmm, but creating the file on Ext4 does not work:
merkaba:/home> LANG=C df -hT .
Filesystem Type Size Used Avail Use% Mounted on
/dev/mapper/merkaba-home ext4 224G 202G 20G 92% /home
merkaba:/home> LANG=C mkfs.xfs -dfile,name=fsfile,size=30t
meta-data=fsfile                 isize=256    agcount=30, agsize=268435455 blks
         =                       sectsz=512   attr=2, projid32bit=0
data     =                       bsize=4096   blocks=8053063650, imaxpct=5
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal log           bsize=4096   blocks=521728, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
mkfs.xfs: Growing the data section failed
Does mkfs.xfs use fallocate instead of creating a sparse file?
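For comparison, this is the difference I mean - a sketch, I did not
verify which call mkfs.xfs actually makes:

merkaba:/home> truncate -s 30T sparse.img    # sparse: sets the size, allocates nothing
merkaba:/home> fallocate -l 30T prealloc.img # preallocation: must reserve real blocks,
                                             # should fail with ENOSPC on my 224G /home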
And on BTRFS as well as on XFS it appears to try to create the 30T file
for real, i.e. by actually writing data - I stopped it before it could
do too much harm.
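To answer my own question, one could probably watch which syscalls
mkfs.xfs makes - an untested sketch:

merkaba:/home> strace -e trace=ftruncate,fallocate,pwrite64 \
        mkfs.xfs -dfile,name=fsfile,size=30t 2>&1 | head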
Where did you create that huge XFS file?
Ciao,
--
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA B82F 991B EAAC A599 84C7