| To: | Julien FERRERO <jferrero06@xxxxxxxxx> |
|---|---|
| Subject: | Re: XFS filesystem corruption |
| From: | Ric Wheeler <rwheeler@xxxxxxxxxx> |
| Date: | Wed, 06 Mar 2013 11:47:39 -0500 |
| Cc: | Emmanuel Florac <eflorac@xxxxxxxxxxxxxx>, xfs@xxxxxxxxxxx |
| Delivered-to: | xfs@xxxxxxxxxxx |
| In-reply-to: | <CAPcwv6wqv0b_CPqDpBfOwVDg23uBi=tpGQSy9XuH2uWS5oVMWQ@xxxxxxxxxxxxxx> |
| References: | <CAPcwv6wZJSBtgF-L6KNSn6N6Y+wUZJFXdbcg+zYRwoaB2sDdjw@xxxxxxxxxxxxxx> <20130306161519.2c28d911@xxxxxxxxxxxxxx> <CAPcwv6wqv0b_CPqDpBfOwVDg23uBi=tpGQSy9XuH2uWS5oVMWQ@xxxxxxxxxxxxxx> |
| User-agent: | Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130219 Thunderbird/17.0.3 |

On 03/06/2013 11:16 AM, Julien FERRERO wrote:
> Hi Emmanuel
>
> 2013/3/6 Emmanuel Florac <eflorac@xxxxxxxxxxxxxx>:
>> On Wed, 6 Mar 2013 16:08:59 +0100, you wrote:
>>
>>> I am totally stuck and I really don't know how to reproduce the
>>> corruption. I only know that the units are routinely power cycled by
>>> the operator while the fs is still mounted (no proper shutdown /
>>> reboot). My guess is that the fs journal should handle this case and
>>> avoid such corruption.
>>
>> Wrong guess. It may work or not, depending upon a long list of
>> parameters, but basically not turning it off properly is asking for
>> problems and corruption. The problem will be tragically aggravated if
>> your hardware RAID doesn't have a battery backed-up cache.
>
> OK, but our server spends 95% of the time reading data and 5% of the
> time writing data. We have a case of a server that did not write
> anything at the time of failure (or during the whole uptime session).
> Moreover, the failure hits files that were opened read-only or weren't
> accessed at all at the time of failure.
>
> I don't think the H/W RAID is the issue, since we see the same
> corruption on other setups without H/W RAID.
>
> Does the "ls" output with "???" look like fs corruption?

Caching can hold dirty data in volatile cache for a very long time. Even if you open a file in "read-only" mode, you still do a fair amount of writes to storage. You can use blktrace or a similar tool to see just how much data is written.

As mentioned earlier, you must always unmount cleanly as a best practice. An operator who powers off with mounted file systems needs to be educated or let go :)

Ric
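
For a rough check that a "read-only" workload still writes to storage, something coarser than blktrace can be enough. The sketch below is not from the thread: it assumes a Linux host and compares two snapshots of the per-device write counters in /proc/diskstats. The device name `sda` and the 10-second interval are placeholders, and the helper names are made up for illustration.

```python
import sys
import time

SECTOR_BYTES = 512  # /proc/diskstats counts sectors in 512-byte units


def parse_diskstats(text):
    """Map device name -> (writes_completed, sectors_written)."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 11:
            continue  # skip blank or truncated lines
        # fields: major minor name reads ... [7]=writes completed,
        # [9]=sectors written
        stats[fields[2]] = (int(fields[7]), int(fields[9]))
    return stats


def write_delta(before, after, device):
    """Return (write requests, bytes written) between two snapshots."""
    w0, s0 = parse_diskstats(before)[device]
    w1, s1 = parse_diskstats(after)[device]
    return w1 - w0, (s1 - s0) * SECTOR_BYTES


def read_snapshot(path="/proc/diskstats"):
    with open(path) as f:
        return f.read()


if __name__ == "__main__":
    # Placeholder device name; pass the real one as the first argument.
    device = sys.argv[1] if len(sys.argv) > 1 else "sda"
    first = read_snapshot()
    time.sleep(10)  # run the supposedly read-only workload meanwhile
    second = read_snapshot()
    requests, nbytes = write_delta(first, second, device)
    print(f"{device}: {requests} write requests, {nbytes} bytes written")
```

These counters only show what was submitted to the block layer during the interval. They say nothing about data still sitting in a volatile drive or controller cache, which is the case that bites on power loss.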