
Re: XFS: Internal error XFS_WANT_CORRUPTED_RETURN

To: Dave Jones <davej@xxxxxxxxxx>
Subject: Re: XFS: Internal error XFS_WANT_CORRUPTED_RETURN
From: Chris Murphy <lists@xxxxxxxxxxxxxxxxx>
Date: Wed, 11 Dec 2013 17:19:52 -0700
Cc: xfs@xxxxxxxxxxx
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <20131211185746.GA11861@xxxxxxxxxx>
References: <20131211172725.GA4606@xxxxxxxxxx> <68DD7157-6ACE-4548-A466-C1EBD31B6DEB@xxxxxxxxxxxxxxxxx> <20131211185746.GA11861@xxxxxxxxxx>
On Dec 11, 2013, at 11:57 AM, Dave Jones <davej@xxxxxxxxxx> wrote:

> On Wed, Dec 11, 2013 at 11:52:51AM -0700, Chris Murphy wrote:
>> 
>> On Dec 11, 2013, at 10:27 AM, Dave Jones <davej@xxxxxxxxxx> wrote:
>>> 
>>> Thoughts ? Could sda be dying ? (It is a fairly old crappy ssd)
>> 
>> It may reveal nothing useful, but please report the results from 'smartctl 
>> -x /dev/sda' and if not found install smartmontools package.
> 
> 
> I meant it when I said 'old' and 'crappy'.
> It doesn't even support the interesting SMART commands.


Oh well, it was worth a shot. The Available_Reservd_Space and 
Media_Wearout_Indicator attributes could be useful, but I don't know how 
trustworthy they are when both report a normalized value of 100, which is 
normally where these values start, yet have raw values that are high and, 
without a reference, meaningless. Available_Reservd_Space is currently 100 but 
its worst value was 48, so it dipped at some point, which is sort of 
interesting. That seems to imply the drive gave up some reserved sectors. I'd 
expect that once reserve is consumed it stays consumed, so this value should 
only ever go down, not recover to 100.
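
For anyone who wants to spot this pattern mechanically, here is a rough sketch that scans the attribute table printed by 'smartctl -A' and lists attributes whose worst normalized value is below the current one. The helper names and the sample table are made up for illustration; only the 100/48 pair for Available_Reservd_Space comes from this thread, and the flags, thresholds and raw values are placeholders, not Dave's actual output.

```python
def parse_attribute_line(line):
    """Parse one row of the smartctl -A attribute table, or return None."""
    fields = line.split(None, 9)
    # Data rows start with a numeric attribute ID; the header row does not.
    if len(fields) < 10 or not fields[0].isdigit():
        return None
    try:
        return {
            "id": int(fields[0]),
            "name": fields[1],
            "value": int(fields[3]),   # current normalized value
            "worst": int(fields[4]),   # lowest normalized value recorded
            "raw": fields[9],
        }
    except ValueError:
        # Some drives print non-numeric placeholders; skip those rows.
        return None


def recovered_attributes(lines):
    """Return attributes whose worst value is below the current value."""
    found = []
    for line in lines:
        attr = parse_attribute_line(line)
        if attr is not None and attr["worst"] < attr["value"]:
            found.append(attr)
    return found


# Hypothetical sample table (placeholder values, see note above).
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
232 Available_Reservd_Space 0x0033   100   048   010    Pre-fail  Always       -       0
233 Media_Wearout_Indicator 0x0032   100   100   000    Old_age   Always       -       0
"""

if __name__ == "__main__":
    for attr in recovered_attributes(SAMPLE.splitlines()):
        print(attr["name"], "value", attr["value"], "worst", attr["worst"])
```

Whether a value that dipped and came back means anything is vendor specific, so treat a hit as a prompt to look closer rather than a diagnosis.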

I suspect we've only just begun to see the myriad ways in which SSDs could 
fail. I ran across this article earlier today:
http://techreport.com/review/25681/the-ssd-endurance-experiment-testing-data-retention-at-300tb

What I found eye-opening was a hashed file failing verification multiple times 
in a row with *different* hash values, then passing after the drive was left 
unpowered for five days. Eeek. That kind of reversal is a great setup for a lot 
of weird transient problems. What I can't tell is whether read errors were 
reported to the SATA driver, or whether (different) bad data from a particular 
page was sent to the driver.
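
A crude way to look for the same symptom on a suspect drive is to hash one file several times and see whether the digests agree. This is only a sketch, not the article's test procedure: the function names and the choice of SHA-256 are mine, and a real test would need to bypass the page cache (for example by dropping caches between passes), otherwise the re-reads may be served from RAM and never touch the flash.

```python
import hashlib
from collections import Counter


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def repeated_digests(path, passes=5):
    """Hash the same file several times and count each distinct digest."""
    return Counter(file_digest(path) for _ in range(passes))


if __name__ == "__main__":
    import sys

    counts = repeated_digests(sys.argv[1])
    for digest, n in counts.items():
        print(n, digest)
    if len(counts) > 1:
        print("digests differ between passes: reads are not stable")
```

If the digests differ and the kernel log shows no I/O errors for those reads, that would point at the second case: bad data handed back without an error.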

Chris Murphy