
Re: Strange problems with xfs and SLES11 SP2

To: "Hammer, Marcus" <Marcus.Hammer@xxxxxxxx>
Subject: Re: Strange problems with xfs and SLES11 SP2
From: Eric Sandeen <sandeen@xxxxxxxxxxx>
Date: Fri, 11 May 2012 23:09:06 -0500
Cc: "xfs@xxxxxxxxxxx" <xfs@xxxxxxxxxxx>
In-reply-to: <CBD2D5A5.7C6B%Marcus.Hammer@xxxxxxxx>
References: <CBD2D5A5.7C6B%Marcus.Hammer@xxxxxxxx>
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:12.0) Gecko/20120428 Thunderbird/12.0.1
On 5/11/12 7:41 AM, Hammer, Marcus wrote:
> Hello,
> 
> We have upgraded from SLES11 SP1 to SLES11 SP2. We use an exotic ERP
> system which stores its data in CISAM files; we keep these on
> several mounted xfs filesystems (/disk2, /disk3, /disk4, /disk5 and
> /disk6).
> The machine is a DELL R910 with 256 GB RAM running SLES11 SP2
> (before, we used SLES11 SP1), so we also got the new 3.0 kernel with
> the upgrade. The xfs mounts are LUNs on a NetApp storage system,
> mapped via fibre channel to the Linux host. We also use multipathd
> to have several paths to the NetApp LUNs.
> 
> Now after the upgrade to SLES11 SP2 we encountered a strange change on the 
> xfs filesystem /disk5:
> 
> /disk5 is an xfs filesystem frequently accessed by the ERP system.
> Its disk usage increased from 53% to 76-78%.

as measured by df?  This probably is the somewhat aggressive preallocation, as
Stefan suggested in another email.
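If you want to test that theory directly, one quick sketch (assuming GNU coreutils; /disk5 is the mount point from the report below, substitute your own) is to compare what df reports against both the apparent and the allocated sizes of the files:

```shell
# Compare filesystem-level usage against file sizes to spot speculative
# preallocation. /disk5 is taken from the report below; adjust as needed.
mnt=/disk5
df -h "$mnt"                      # filesystem-level usage (where the 76-78% comes from)
du -sh --apparent-size "$mnt"     # sum of file sizes as applications see them
du -sh "$mnt"                     # blocks actually allocated, incl. preallocation
```

A large gap between the last two numbers (allocated well above apparent) points at preallocated-but-unwritten space rather than real file growth.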

> But only the disk usage increased;
> the sizes of the files are completely the same. The fragmentation
> increased to 96%:
> 
> linuxsrv1:/disk4/ifax/0000 # xfs_db -c frag -r /dev/mapper/360a98000486e59384b34497248694170
> actual 56156, ideal 2014, fragmentation factor 96.41%

so on average, about 28 extents per file.  And what was it before?

See also 
http://xfs.org/index.php/XFS_FAQ#Q:_The_xfs_db_.22frag.22_command_says_I.27m_over_50.25.__Is_that_bad.3F
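For reference, the frag output reduces to two simple ratios; recomputing them from the numbers above (actual=56156 extents, ideal=2014):

```shell
# Recompute the figures from the xfs_db "frag" output above:
#   extents per file      = actual / ideal
#   fragmentation factor  = (1 - ideal/actual) * 100
awk 'BEGIN {
    actual = 56156; ideal = 2014
    printf "extents per file:     %.1f\n", actual / ideal          # 27.9
    printf "fragmentation factor: %.2f%%\n", (1 - ideal / actual) * 100  # 96.41%
}'
```

which matches the 96.41% xfs_db reported, and explains why the factor saturates near 100% once files average more than a handful of extents.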

> linuxsrv1:/disk4/ifax/0000 # xfs_info /dev/mapper/360a98000486e59384b34497248694170
> meta-data=/dev/mapper/360a98000486e59384b34497248694170 isize=256    agcount=21, agsize=3276800 blks
>          =                       sectsz=512   attr=0
> data     =                       bsize=4096   blocks=68157440, imaxpct=25
>          =                       sunit=0      swidth=0 blks
> naming   =version 2              bsize=4096   ascii-ci=0
> log      =internal               bsize=4096   blocks=25600, version=1
>          =                       sectsz=512   sunit=0 blks, lazy-count=0
> realtime =none                   extsz=4096   blocks=0, rtextents=0

Ben, with logv1 and 21 AGs it must be an older, migrated fs :)

> 
> The fstab entries for the xfs mounts are without any special options or
> optimizations; here is the snippet from /etc/fstab:
> 
> /dev/mapper/360a98000486e59384b3449714a47336c   /disk2  xfs     defaults        0 2
> /dev/mapper/360a98000486e59384b34497247514a56   /disk3  xfs     defaults        0 2
> /dev/mapper/360a98000486e59384b34497248694170   /disk4  xfs     defaults        0 2
> /dev/mapper/360a98000486e59384b344972486f6d4e   /disk5  xfs     defaults        0 2
> /dev/mapper/360a98000486e59384b3449724e6f4266   /disk6  xfs     defaults        0 2
> /dev/mapper/360a98000486e59384b3449724f326662   /opt/usr        xfs     defaults        0 2
> 
> But something must have changed in xfs, because the metadata usage
> has now increased so massively; we never had this before with SLES11 SP1.

How are you measuring "the metadata increase"?  I'm not sure what you mean
by this.

> I did a defragmentation with xfs_fsr, and the metadata and usage
> decreased to 53%. But after 1 hour in production we are again at
> 76-78% disk usage and the same fragmentation.
> 
> So my question is: what has changed from the 2.6 kernels to the 3.0
> kernels that could explain this massive increase of metadata? (When I
> did the defrag, we sometimes had over 140,000 extents on one inode.)

How are the files being written?  Do they grow, are they sparse, direct IO
or buffered, etc?

> I am completely confused and do not know how to handle this. Perhaps
> somebody can help me to fix this problem or to understand what
> happens here…
> I also talked with some NetApp engineers and they said I should ask
> at xfs.org.
> 
> On the filesystem there are about 727 CISAM files (IDX -> index files
> and DAT -> data files). There are ten 15 GB files in which small
> pieces of content are often changed by the ERP system. The rest of
> the files are smaller than 400 MB.
> We have encountered this problem since the upgrade to SLES11 SP2 and
> the new kernel 3.0. (By the way, we had to disable transparent
> hugepage support in kernel 3.0 because of kernel crashes ;) - but
> this is a different story…)

You can defeat the speculative preallocation by mounting with the
allocsize option, if you want to test that theory.
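A minimal sketch of what that could look like; the device and mount point are taken from the fstab snippet above, and 64k is only an example value (the old fixed default), so tune it to the workload:

```shell
# Example only: cap speculative preallocation at 64 KiB per file.
# Remount the affected filesystem by hand to test the theory:
mount -o remount,allocsize=64k /disk5

# ...and, if it helps, make it persistent in /etc/fstab:
# /dev/mapper/360a98000486e59384b344972486f6d4e  /disk5  xfs  allocsize=64k  0 2
```

Note that a small fixed allocsize trades the preallocated space back for potentially more fragmentation on append-heavy files, so compare xfs_db frag numbers before and after.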

-Eric

> --
> Mit freundlichen Grüßen/Kind regards
> 
> M.  Hammer
> System administration
> Information Technology
> 
> AUMA Riester GmbH & Co. KG
> Aumastr. 1 • 79379 Muellheim/Germany
> Tel/Phone +49 7631 809-1620 • Fax +49 7631 809-71620
> HammerM@xxxxxxxx<mailto:hammerm@xxxxxxxx> • www.auma.com<http://www.auma.com/>
> 
> Sitz: Müllheim, Registergericht Freiburg HRA 300276
> phG: AUMA Riester Verwaltungsgesellschaft mbH, Sitz: Müllheim, 
> Registergericht Freiburg HRB 300424
> Geschäftsführer: Matthias Dinse, Henrik Newerla
> 
> Registered Office: Muellheim, court of registration: Freiburg HRA 300276
> phG: Riester Verwaltungsgesellschaft mbH Registered Office: Muellheim, court 
> of registration: Freiburg HRB 300424
> Managing Directors: Matthias Dinse, Henrik Newerla
> 
> 
> 
> 
> _______________________________________________
> xfs mailing list
> xfs@xxxxxxxxxxx
> http://oss.sgi.com/mailman/listinfo/xfs
> 
