Re: mount: Structure needs cleaning

To: xfs@xxxxxxxxxxx
Subject: Re: mount: Structure needs cleaning
From: Stan Hoeppner <stan@xxxxxxxxxxxxxxxxx>
Date: Mon, 27 Feb 2012 00:28:17 -0600
In-reply-to: <33397518.post@xxxxxxxxxxxxxxx>
References: <33393100.post@xxxxxxxxxxxxxxx> <4F49B693.4080309@xxxxxxxxxxxxxxxxx> <33393429.post@xxxxxxxxxxxxxxx> <20120227004902.GQ3592@dastard> <33397518.post@xxxxxxxxxxxxxxx>
Reply-to: stan@xxxxxxxxxxxxxxxxx
User-agent: Mozilla/5.0 (Windows NT 5.1; rv:10.0.2) Gecko/20120216 Thunderbird/10.0.2
On 2/26/2012 9:11 PM, MikeJeezy wrote:

>> On 02/26/2012 11:07am, Stan Hoeppner wrote: 
>> Are those other VMs using XFS filesystems? 
>> What kernel version are you running?
> 2.6.18-274.18.1.el5

I'm not familiar enough with Red Hat kernel revs to know if all relevant
patches are included in this kernel.  There are a few Red Hat devs here
who should have more insight on this.

>> Are you using LVM under XFS?  
> No
>> What fstab mount options?  
> /dev/sdd1               /mnt/ob1               xfs     defaults        0 0
> /dev/sde1               /mnt/ob2               xfs     defaults        0 0
>> Does your SAN array have battery backed write cache?  
> This one does not currently, but I have ordered BBWC for it.

Good.  I suggest disabling the SAN controller's write caching until the
BBWC is installed and verified to be functioning correctly.

>> Are the individual drive caches in the underlying array disabled? 
> Write cache: enabled
> Read ahead: enabled

In the case of a SAN array or PCIe RAID controller, this dmesg output is
telling you about the state of the controller's cache, not the
individual drive caches.  Enabling or disabling the drive caches should
be an option in the controller's firmware interface.  You want the
individual drive write caches disabled.  Leaving their read caches
enabled is fine.
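
If you want to see what the kernel itself believes about each SCSI
disk, the rough Python sketch below reads the sd driver's cache_type
attribute from sysfs.  Treat it as an illustration under assumptions,
not a verified procedure for your setup: the function names are made up
for this example, the /sys/class/scsi_disk/*/cache_type attribute may
not exist on an older kernel such as 2.6.18, and behind a SAN or RAID
controller it reports what the logical unit advertises, not the state
of the physical drive caches.

```python
import glob
import os


def write_cache_enabled(cache_type):
    """Return True if a scsi_disk cache_type string means write-back caching."""
    # The sd driver reports strings such as "write back", "write through",
    # "none", or "write back, no read (daft)".
    return cache_type.strip().startswith("write back")


def scan_cache_types(root="/sys/class/scsi_disk"):
    """Map each SCSI address found under root to its cache_type string.

    ASSUMPTION: the cache_type attribute exists on the running kernel.
    If it does not, the glob matches nothing and the result is empty.
    """
    result = {}
    for path in sorted(glob.glob(os.path.join(root, "*", "cache_type"))):
        addr = os.path.basename(os.path.dirname(path))
        with open(path) as f:
            result[addr] = f.read().strip()
    return result


if __name__ == "__main__":
    for addr, ctype in scan_cache_types().items():
        state = "write-back" if write_cache_enabled(ctype) else "not write-back"
        print(f"{addr}: {ctype} ({state})")
```

An empty result only means the attribute was not found.  It says nothing
about the drive caches, which still have to be checked in the controller
firmware interface.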

The reason is that a power drop, kernel panic, or hardware lockup
(thermal, etc.) can clear the drive write caches before the blocks are
written to the platters.  It is suspected that many, if not most, of
these free space btree corruptions, such as yours here, are caused by
cached data never being flushed to the platters.  SAN/RAID controllers
with BBWC usually guarantee that data in the write cache gets properly
flushed to the platters when the system comes back up.

So, way back when, you may have had a system (VM) crash of one kind or
another, or an improper shutdown (VM power-off), then rebooted, and
everything seemed fine.  Months later, you discover a corrupted free
space btree, caused by that long-ago crash which everyone forgot about
and nobody documented.

