
Re: Corruption of root fs during git bisect of drm system hang

To: Dave Chinner <david@xxxxxxxxxxxxx>
Subject: Re: Corruption of root fs during git bisect of drm system hang
From: Markus Trippelsdorf <markus@xxxxxxxxxxxxxxx>
Date: Fri, 12 Jul 2013 09:07:21 +0200
Cc: Stan Hoeppner <stan@xxxxxxxxxxxxxxxxx>, xfs@xxxxxxxxxxx
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <20130712021737.GA5228@dastard>
References: <20130710090634.GA356@x4> <20130711003122.GR3438@dastard> <20130711033621.GB362@x4> <20130711035827.GA3438@dastard> <51DE30BC.1050905@xxxxxxxxxxxxxxxxx> <20130711090755.GA363@x4> <20130712021737.GA5228@dastard>
On 2013.07.12 at 12:17 +1000, Dave Chinner wrote:
> On Thu, Jul 11, 2013 at 11:07:55AM +0200, Markus Trippelsdorf wrote:
> > On 2013.07.10 at 23:12 -0500, Stan Hoeppner wrote:
> > > On 7/10/2013 10:58 PM, Dave Chinner wrote:
> > > > On Thu, Jul 11, 2013 at 05:36:21AM +0200, Markus Trippelsdorf wrote:
> > > 
> > > >> I was losing my KDE settings bit by bit with every reboot during the
> > > >> bisection. First my window-rules disappeared, then my desktop background
> > > >> changed to default, then my taskbar moved from top to the bottom, etc.
> > > >> In the end I had to restore all my .files from backup.
> > > > 
> > > > That's not filesystem corruption. That sounds more like someone not
> > > > using fsync in the appropriate place when overwriting a file....
> > > 
> > > From Sandeen's blog, March 2009:
> > > 
> > > "I dunno how to resolve this right now.  I talked to some nice KDE folks
> > > on irc; they basically want atomic writes, either you get your old file
> > > or your new file post-crash; and tempfile/sync/rename does this - but
> > > the fsync hurts on 78% of the Linux filesystems out there.  So their
> > > KSaveFile class doesn't fsync.  So what to do, what to do.."
> > > 
> > > That's 4 years ago.  Is it possible the KDE devs are still not using
> > > fsync?  Sure seems likely given Markus' problem.
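
The tempfile/sync/rename sequence mentioned in the quoted blog post can be sketched roughly as follows. This is a minimal Python illustration of the general pattern, not KSaveFile's actual code; `atomic_write` and the file names are hypothetical, and the directory fsync assumes a POSIX system such as Linux.

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace `path` with `data` (bytes) via tempfile + fsync + rename.

    Hypothetical helper for illustration only. After a crash, a reader
    should see either the complete old file or the complete new file.
    """
    directory = os.path.dirname(os.path.abspath(path))

    # 1. Write the new content to a temporary file in the same directory,
    #    so the final rename stays within one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            # 2. Force the file data to stable storage before renaming.
            os.fsync(tmp.fileno())
        # 3. Atomically replace the old file with the new one.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise

    # 4. fsync the containing directory so the rename itself is durable
    #    (assumes a POSIX system where directories can be opened and synced).
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Skipping step 2 is what makes the pattern cheap, and it is also what can leave a renamed but empty or partial file behind after a crash.
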
> > 
> > Looking at the source:
> > http://api.kde.org/4.10-api/kdelibs-apidocs/kdecore/html/ksavefile_8cpp_source.html#l00219
> > it appears that one can set an environment variable KDE_EXTRA_FSYNC to
> > address this issue.
> > 
> > However in my case it doesn't help. Even with KDE_EXTRA_FSYNC=1 I still
> > lose my KDE settings in case of a crash. So the whole fsync thing might
> > be a red herring.
> > 
> > What's more, this time I ended up with undeletable files in /tmp (for
> > example .X0-lock) after the crash:
> > 
> > (/dev/sdb was mounted and unmounted normally before I ran xfs_repair)
> > 
> > t@ubunt:~# xfs_repair /dev/sdb
> > Phase 1 - find and verify superblock...
> > Phase 2 - using internal log
> >         - zero log...
> >         - scan filesystem freespace and inode maps...
> > agi unlinked bucket 0 is 683435008 in ag 2 (inode=4978402304)
> > agi unlinked bucket 1 is 683435009 in ag 2 (inode=4978402305)
> >         - found root inode chunk
> 
> Again, these are signs that log recovery has not completed
> successfully or that for some reason it thought the log was clean.
> Can you please post the dmesg output after the crash when you go
> through the mount/unmount process before you run xfs_repair?

Sure.
First boot after crash:
 XFS (sdb2): Mounting Filesystem
 XFS (sdb2): Starting recovery (logdev: internal)
 XFS (sdb2): Ending recovery (logdev: internal)

Second boot after crash:
 XFS (sdb2): Mounting Filesystem
 XFS (sdb2): Ending clean mount 

I then booted Ubuntu from another disk to run xfs_repair.

And looking through my logs I see this WARNING:

------------[ cut here ]------------
WARNING: CPU: 0 PID: 439 at fs/inode.c:280 drop_nlink+0x33/0x40()
CPU: 0 PID: 439 Comm: gconfd-2 Not tainted 3.10.0-08982-g6d128e1-dirty #42
Hardware name: System manufacturer System Product Name/M4A78T-E, BIOS 3503 04/13/2011
 0000000000000009 ffffffff8157d030 0000000000000000 ffffffff81060788
 ffff8801f8608cc8 ffff880205998230 ffff8801f7bede58 0000000000000000
 ffff8801f86083c0 ffffffff8110ce93 ffff8801f8608b40 ffffffff811b7104
Call Trace:
 [<ffffffff8157d030>] ? dump_stack+0x41/0x51
 [<ffffffff81060788>] ? warn_slowpath_common+0x68/0x80
 [<ffffffff8110ce93>] ? drop_nlink+0x33/0x40
 [<ffffffff811b7104>] ? xfs_droplink+0x24/0x60
 [<ffffffff811b84ed>] ? xfs_remove+0x24d/0x380
 [<ffffffff811b1657>] ? xfs_vn_unlink+0x37/0x80
 [<ffffffff8110414e>] ? vfs_unlink+0x6e/0xe0
 [<ffffffff8110432a>] ? do_unlinkat+0x16a/0x220
 [<ffffffff810f4fa9>] ? SyS_faccessat+0x149/0x200
 [<ffffffff81583292>] ? system_call_fastpath+0x16/0x1b
---[ end trace de5865b7c20ab8e4 ]---

-- 
Markus
