Re: XFS File System Monitor

To: Rotem Ben Arye <rotem.benarye@xxxxxxxxx>, support@xxxxxxx
Subject: Re: XFS File System Monitor
From: Stan Hoeppner <stan@xxxxxxxxxxxxxxxxx>
Date: Thu, 02 Jan 2014 09:07:01 -0600
Cc: xfs@xxxxxxxxxxx
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <CA+apj_hV4HvZxfASF7JJ1k6mmvio3cRWAHj-S1V=Vm8X_RWA=Q@xxxxxxxxxxxxxx>
References: <CA+apj_hV4HvZxfASF7JJ1k6mmvio3cRWAHj-S1V=Vm8X_RWA=Q@xxxxxxxxxxxxxx>
Reply-to: stan@xxxxxxxxxxxxxxxxx
User-agent: Mozilla/5.0 (Windows NT 5.1; rv:24.0) Gecko/20100101 Thunderbird/24.2.0
On 1/2/2014 6:16 AM, Rotem Ben Arye wrote:
> Hi, SGI Support Team.
> My name is Rotem. I am a Linux/Unix system administrator at a web company in
> Israel.
> I have a question and would like your advice.
> Last weekend we had a crisis on one of the company's production servers.
> The integrators defined the problem as "xfs file system corrupted".
> My question is: what open source tools can we use at runtime in a
> production environment to monitor and sample a mounted XFS file system
> and get an indication that something is not healthy and could lead to a
> problem?
> We are working in a Linux environment on CentOS servers.

So in a nutshell you're looking for a monitor application that will in
essence give you a green, yellow, or red light informing you of the
filesystem's health.  Or some kind of SNMP logging that suggests a
corruption is imminent.

There is no such tool, and never will be.  Nearly all XFS corruption
events are caused by software bugs in the XFS code or elsewhere in the
Linux kernel, by transient or permanent hardware failures, or by power
failures, at some layer in the storage stack.  It is not feasible to
predict such events.

When an XFS corruption occurs, one should report all related log
information and errors to this list so that the problem may be analyzed
and the root cause identified.  Then the proper corrective action can be
identified and implemented to fix the problem and hopefully prevent it
from recurring.
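
As a rough illustration of gathering that log information, here is a
minimal Python sketch.  It assumes the kernel messages can be read with
`dmesg` and that the relevant lines mention "XFS" somewhere in their
text; both are assumptions for illustration, and the function names
`xfs_log_lines` and `read_kernel_log` are hypothetical.  It only collects
lines for a report and does not detect or predict anything.

```python
import subprocess


def xfs_log_lines(log_text):
    """Return the lines of log_text that mention XFS (case-insensitive)."""
    return [line for line in log_text.splitlines() if "xfs" in line.lower()]


def read_kernel_log():
    """Return the kernel ring buffer as text.

    Assumption: the `dmesg` command is installed and the caller has
    permission to read the kernel log.
    """
    result = subprocess.run(["dmesg"], capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError("dmesg failed: " + result.stderr.strip())
    return result.stdout
```

Calling `print("\n".join(xfs_log_lines(read_kernel_log())))` on the
affected host would print the matching lines so they can be pasted into
a report.  A substring match is deliberately crude and may miss related
errors from other layers of the storage stack, so the surrounding log
context is worth including as well.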

