
RE: segmentation fault during mount

To: "'Eric Sandeen'" <sandeen@xxxxxxxxxxx>
Subject: RE: segmentation fault during mount
From: Ryan Roh <unisist.roh@xxxxxxxxxxx>
Date: Tue, 08 Feb 2011 15:06:27 +0900
Cc: xfs@xxxxxxxxxxx
In-reply-to: <4D50D748.1090406@xxxxxxxxxxx>
References: <02b401cbc6b6$5b7ac0b0$12704210$@samsung.com> <4D50CA45.8070807@xxxxxxxxxxx> <030701cbc74e$b7f25680$27d70380$@samsung.com> <4D50D748.1090406@xxxxxxxxxxx>
Thread-index: AQFP38Q1HCdmiHcoKY6r42jo2gLBIQG9KPkDAp0KbkMBlUHA9ZS/bVRQ
Dear Eric,

Thank you for your kind reply.

The XFS partition can be mounted after running xfs_repair with the -L option.
Actually, the debug option was turned off, so the ASSERT is never evaluated.
In any case, the level was equal to the root level of the b-tree. If I change
the code as shown below, the XFS mount prints a message asking me to repair
the partition with xfs_repair.

    /*
     * If we went off the root then we are seriously confused.
     */
    if (lev >= cur->bc_nlevels)
        return EFSCORRUPTED;
    //ASSERT(lev < cur->bc_nlevels);
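
To make the intent of that check explicit, here is a minimal standalone
sketch. It is not the real kernel code: the struct, the function name
check_level() and the numeric error value are placeholders I made up for
illustration. The idea is only that the walk is rejected with an error as soon
as lev reaches bc_nlevels, instead of relying on an ASSERT that is compiled
out.

```c
/* Placeholder error value for illustration; not taken from the kernel headers. */
#define EFSCORRUPTED 990

/* Hypothetical stand-in for the btree cursor; only the field used here. */
struct btree_cursor {
    int bc_nlevels;   /* number of levels in the btree */
};

/*
 * Return 0 while lev is still inside the tree, or EFSCORRUPTED once the
 * walk has gone past the root (lev >= bc_nlevels).
 */
int check_level(const struct btree_cursor *cur, int lev)
{
    if (lev >= cur->bc_nlevels)
        return EFSCORRUPTED;
    return 0;
}
```

With bc_nlevels == 2, levels 0 and 1 pass and level 2 (the case I am seeing)
returns the error.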


The kernel oops can be reproduced with a metadump and the restored image.
However, when I tested it with 2.6.33.6 (FC13), the mount failed with the
message "mount: Structure needs cleaning".
Could you please let me know how I can share the metadump file with others?
It is too big to send by e-mail. Can I use an FTP server to share it?

I also got a hint from Dave Chinner about the patch for the vmap cache
aliasing issue, and I am trying to apply it:
"[GIT PATCH] Fix XFS to work with Virtually indexed architectures":
http://linux.derkeiler.com/Mailing-Lists/Kernel/2010-02/msg10227.html

Thanks,
Ryan.


-----Original Message-----
From: Eric Sandeen [mailto:sandeen@xxxxxxxxxxx] 
Sent: Tuesday, February 08, 2011 2:40 PM
To: Ryan Roh
Cc: xfs@xxxxxxxxxxx
Subject: Re: segmentation fault during mount

On 2/7/11 11:12 PM, Ryan Roh wrote:
> Dear Eric,
> 
> I don't know the correct way to reply to this thread because I'm new
> here. Sorry.
> 
> Anyway, this issue occurred on an HDD returned by a customer, which had
> been used in our PVR STB. Our STB has a toggle power switch, so I think
> the user turned off the power while recording something.

Ok, so you're not sure what happened to the hard drive before this, then.

Other Samsung folks have reported problems after intentionally testing the
filesystem under harsh conditions such as poweroff or USB unplugs, so I just
wondered...

It seems plausible to me that this could be corruption from lack of proper
barrier support, and a poweroff or usb unplug (without barrier support)
could cause that.

Mounting a corrupted filesystem should never oops the kernel though, so that
is a bug.  If you can provide an xfs_metadump image of the filesystem,
someone might be able to investigate further.

Does the mount failure persist after an xfs_repair (without using -n)?

If you wish to keep the original filesystem intact, you can make an
xfs_metadump image of the filesystem, run xfs_mdrestore to create a new
metadata image from that dump, run xfs_repair against that, and try to mount
the result.

Does Samsung run with CONFIG_XFS_DEBUG enabled?  Otherwise, this:

    /*
     * If we went off the root then we are seriously confused.
     */

    ASSERT(lev < cur->bc_nlevels);

would be a no-op:

#ifndef DEBUG
#define ASSERT(expr)    ((void)0)
...

(As a side note, running with CONFIG_XFS_DEBUG in production is not
recommended.)
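
To illustrate the no-op behaviour, here is a small standalone sketch. It
mirrors the shape of the macro quoted above but is not the kernel header; the
DEBUG branch, the counter and the helper names are made up for the example.
When DEBUG is not defined, the expression passed to ASSERT is never evaluated,
so a failing condition goes unnoticed.

```c
/* Sketch only: same shape as the quoted macro, not the actual XFS header. */
#ifdef DEBUG
#include <stdlib.h>
#define ASSERT(expr)    ((expr) ? (void)0 : abort())
#else
#define ASSERT(expr)    ((void)0)
#endif

int evaluations;    /* counts how often the checked expression runs */

/* Hypothetical check: is lev still below the number of levels? */
int level_ok(int lev, int nlevels)
{
    evaluations++;
    return lev < nlevels;
}

/* Returns how many times ASSERT actually evaluated its condition. */
int walk_past_root(void)
{
    evaluations = 0;
    ASSERT(level_ok(2, 2));    /* lev == nlevels, so the check would fail */
    return evaluations;
}
```

Built without -DDEBUG, walk_past_root() returns 0 because the condition is
compiled away; built with -DDEBUG, this sketch would abort instead.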

However, I'm not quite sure that's what you are hitting; if you had tripped an
ASSERT you should have seen "Assertion failed" in the messages.  This
appears to be a null pointer dereference in xfs_free_ag_extent().

-Eric


> Thanks,
> Ryan.
>   
> 
> -----Original Message-----
> From: Eric Sandeen [mailto:sandeen@xxxxxxxxxxx]
> Sent: Tuesday, February 08, 2011 1:45 PM
> To: Ryan Roh
> Cc: xfs@xxxxxxxxxxx
> Subject: Re: segmentation fault during mount
> 
> On 2/7/11 5:01 AM, Ryan Roh wrote:
>> Dear Members,
>>
>> I'm using XFS on an STMicro SH4-based chip (STi7105),
>>
>> and I have an issue with XFS log mounting.
>>
> 
> Were the errors after any sort of harsh testing of the filesystem, 
> such as usb disconnects or power off?
> 
> Or was this after a clean unmount?
> 
> -Eric
> 
>>
>> 1. chip : sh4 STi7105
>> 2. HDD : 320GB USB HDD USB 2.0 port.
>> 3. OS : Linux 2.6.23.17 + patch for fixing cache aliasing issue.
>> 4. XFSProgs version : 3.1.1
>>
>> mounting and repairing log in the below. This segmentation fault is
>> caused by the assert in xfs_alloc_increment function of
>> xfs_alloc_btree.c file. The btree level is equal to root level in the
>> below code.
>>
>>     /*
>>      * If we went off the root then we are seriously confused.
>>      */
>>     ASSERT(lev < cur->bc_nlevels);
>>
>>  
> 
> ...
> 
