
[Bug 906] Log corruption during mount.

To: xfs-masters@xxxxxxxxxxx
Subject: [Bug 906] Log corruption during mount.
From: bugzilla-daemon@xxxxxxxxxxx
Date: Mon, 7 Feb 2011 14:55:40 -0600
Auto-submitted: auto-generated
In-reply-to: <bug-906-113@xxxxxxxxxxxxxxxx/bugzilla/>
References: <bug-906-113@xxxxxxxxxxxxxxxx/bugzilla/>
http://oss.sgi.com/bugzilla/show_bug.cgi?id=906





--- Comment #3 from Dave Chinner <david@xxxxxxxxxxxxx>  2011-02-07 14:55:38 CST ---
(In reply to comment #2)
> Hi Dave,
> 
> Thank you for your reply.
> 
> Yes, we have to accept this unclean shutdown because this is used for PVR STB
> for CE. Unfortunately, we can’t make it clean shutdown because user can turn
> off the power and we can’t detect it.

That's not what I asked. What I asked is how did _this specific_ unclean
shutdown occur?

> But I think any kind of unclean shutdown
> should not cause this issue unless the journaling feature is used. So I try to
> find out reason of this issue.
> 
> Do I need to add “barrier” in option during mounting? Isn’t it default?

Yes, it's the default option, but it gets turned off if your hardware doesn't
support it. I don't recall exactly where it is tested in the mount path, so
can you confirm if there are any messages in your syslog indicating barriers
have been disabled during a successful mount?
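
One way to do that check, as a rough sketch only: save the kernel log from a
successful mount and filter it for anything that mentions barriers. The C below
matches any line containing "barrier" (ignoring case) rather than a specific
message, because the exact wording depends on the kernel version. The function
names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/* Return 1 if `needle` occurs in `haystack`, ignoring ASCII case. */
static int contains_nocase(const char *haystack, const char *needle)
{
    assert(haystack != NULL && needle != NULL);
    if (*needle == '\0')
        return 1;
    for (; *haystack != '\0'; haystack++) {
        size_t i = 0;
        /* Compare character by character until a mismatch or end of string. */
        while (needle[i] != '\0' && haystack[i] != '\0' &&
               tolower((unsigned char)haystack[i]) ==
               tolower((unsigned char)needle[i]))
            i++;
        if (needle[i] == '\0')
            return 1;
    }
    return 0;
}

/* Hypothetical helper: does this log line say anything about barriers? */
static int mentions_barrier(const char *line)
{
    return contains_nocase(line, "barrier");
}

/*
 * Copy every matching line from an already-open log stream to `out`
 * and return how many were found. Lines longer than the buffer are
 * read in pieces, which is good enough for a quick look.
 */
static int report_barrier_lines(FILE *log, FILE *out)
{
    char line[1024];
    int hits = 0;

    while (fgets(line, sizeof line, log) != NULL) {
        if (mentions_barrier(line)) {
            fputs(line, out);
            hits++;
        }
    }
    return hits;
}
```

Calling report_barrier_lines(stdin, stdout) from a small main() and piping the
dmesg output into it would list the candidate lines. A count of zero only means
nothing matched in the log you supplied, not that barriers are definitely
active.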

> I tried to mount it with latest kernel(2.6.36) and this HDD was mounted
> properly. But I can’t upgrade the kernel for this issue because lot of things
> have dependency with kernel.

Which, to me, indicates that 2.6.36 handles the vmap cache aliasing correctly
and that the problem is specific to your kernel. These fixes:

73c77e2 xfs: fix xfs to work with Virtually Indexed architectures
c9334f6 sh: add mm API for DMA to vmalloc/vmap areas
252a9af arm: add mm API for DMA to vmalloc/vmap areas
ef7cc35 parisc: add mm API for DMA to vmalloc/vmap areas
9df5f741 mm: add coherence API for DMA to vmalloc/vmap areas

went into 2.6.33-rc3 to fix such problems.....

> Anyway, Can I make this code to return “EFSCORRUPT” instead of assert?
> 
>     /*
>      * If we went off the root then we are seriously confused.
>      */
>     return EFSCORRUPT;
>     //ASSERT(lev < cur->bc_nlevels);

Are you compiling with CONFIG_XFS_DEBUG=y? I don't think you are, because the
dmesg output you posted does not have an assertion failure output in it.

Anyway, I think that would simply be putting a band-aid over the specific
issue, not solving the root cause of your problem. Hence you'll just get
problems somewhere else during recovery....
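
To illustrate both points, here is a minimal userspace sketch. It assumes that
ASSERT compiles to nothing unless a debug define is set, which is the behaviour
the CONFIG_XFS_DEBUG question is getting at. The struct, the function names and
the DEMO_EFSCORRUPT value are all made up for the example and are not the real
XFS definitions.

```c
#include <assert.h>

/* Placeholder error code for the demo; not the value XFS uses. */
#define DEMO_EFSCORRUPT 1

/* Assumed behaviour: the check only exists when DEBUG is defined. */
#ifdef DEBUG
#define ASSERT(expr) assert(expr)
#else
#define ASSERT(expr) ((void)0)
#endif

/* Hypothetical stand-in for a btree cursor; only the level count matters. */
struct demo_cursor {
    int nlevels;
};

/* Variant A: a non-debug build silently carries on past a bad level. */
static int walk_up_assert(const struct demo_cursor *cur, int lev)
{
    ASSERT(lev < cur->nlevels);
    (void)cur;
    (void)lev;
    return 0;
}

/* Variant B: the proposed change, reporting the bad level to the caller. */
static int walk_up_checked(const struct demo_cursor *cur, int lev)
{
    if (lev >= cur->nlevels)
        return DEMO_EFSCORRUPT;
    return 0;
}
```

Without DEBUG, variant A never fires, so an assertion cannot be what stopped
the mount. Variant B does report the bad level, but only after the tree has
already been read incorrectly, so the next structure recovery touches is just
as likely to be wrong.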

Cheers,

Dave.
