
Re: Corruption of root fs during git bisect of drm system hang

To: Stan Hoeppner <stan@xxxxxxxxxxxxxxxxx>
Subject: Re: Corruption of root fs during git bisect of drm system hang
From: Markus Trippelsdorf <markus@xxxxxxxxxxxxxxx>
Date: Thu, 11 Jul 2013 13:28:26 +0200
Cc: Dave Chinner <david@xxxxxxxxxxxxx>, xfs@xxxxxxxxxxx
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <20130711090755.GA363@x4>
References: <20130710090634.GA356@x4> <20130711003122.GR3438@dastard> <20130711033621.GB362@x4> <20130711035827.GA3438@dastard> <51DE30BC.1050905@xxxxxxxxxxxxxxxxx> <20130711090755.GA363@x4>
On 2013.07.11 at 11:07 +0200, Markus Trippelsdorf wrote:
> On 2013.07.10 at 23:12 -0500, Stan Hoeppner wrote:
> > On 7/10/2013 10:58 PM, Dave Chinner wrote:
> > > On Thu, Jul 11, 2013 at 05:36:21AM +0200, Markus Trippelsdorf wrote:
> > 
> > >> I was losing my KDE settings bit by bit with every reboot during the
> > >> bisection. First my window-rules disappeared, then my desktop background
> > >> changed to default, then my taskbar moved from top to the bottom, etc.
> > >> In the end I had to restore all my .files from backup. 
> > > 
> > > That's not filesystem corruption. That sounds more like someone not
> > > using fsync in the appropriate place when overwriting a file....
> > 
> > From Sandeen's blog, March 2009:
> > 
> > "I dunno how to resolve this right now.  I talked to some nice KDE folks
> > on irc; they basically want atomic writes, either you get your old file
> > or your new file post-crash; and tempfile/sync/rename does this - but
> > the fsync hurts on 78% of the Linux filesystems out there.  So their
> > KSaveFile class doesn't fsync.  So what to do, what to do.."
> > 
> > That's 4 years ago.  Is it possible the KDE devs are still not using
> > fsync?  Sure seems likely given Markus' problem.
> 
> Looking at the source:
> http://api.kde.org/4.10-api/kdelibs-apidocs/kdecore/html/ksavefile_8cpp_source.html#l00219
> it appears that one can set an environment variable KDE_EXTRA_FSYNC to
> address this issue.
> 
> However in my case it doesn't help. Even with KDE_EXTRA_FSYNC=1 I still
> lose my KDE settings in case of a crash. So the whole fsync thing might
> be a red herring.

It turned out that the KDE_EXTRA_FSYNC variable doesn't affect KDE
config file handling at all.
So I've added an fsync in kconfigini.cpp (KConfigIniBackend::writeConfig)
and now I don't lose my settings anymore during kernel crash testing.

That is, until xfs eats my KDE config files (kwinrulesrc in this case):

root@ubunt:/home/markus# xfs_repair /dev/sdc2
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
agi unlinked bucket 55 is 406711 in ag 3 (inode=101070007)
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
imap claims a free inode 858183 is in use, correcting imap and clearing inode
cleared inode 858183
        - agno = 1
imap claims a free inode 40112137 is in use, correcting imap and clearing inode
cleared inode 40112137
imap claims a free inode 40112354 is in use, correcting imap and clearing inode
cleared inode 40112354
        - agno = 2
imap claims a free inode 68162927 is in use, correcting imap and clearing inode
cleared inode 68162927
7f336f1b6700: Badness in key lookup (length)
bp=(bno 47086672, len 16384 bytes) key=(bno 47086672, len 8192 bytes)
        - agno = 3
imap claims a free inode 100865109 is in use, correcting imap and clearing inode
cleared inode 100865109
imap claims a free inode 101069993 is in use, correcting imap and clearing inode
cleared inode 101069993
imap claims a free inode 101070010 is in use, correcting imap and clearing inode
cleared inode 101070010
imap claims a free inode 101070015 is in use, correcting imap and clearing inode
cleared inode 101070015
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 2
        - agno = 1
        - agno = 3
entry "mytexts.bau" in shortform directory 67333623 references free inode 68162927
junking entry "mytexts.bau" in directory inode 67333623
entry "dialog.xlc" in shortform directory 252098 references free inode 858183
junking entry "dialog.xlc" in directory inode 252098
entry "evolocal.odb" in shortform directory 100870253 references free inode 100865109
junking entry "evolocal.odb" in directory inode 100870253
entry "kwinrulesrc" at block 0 offset 2552 in directory inode 103698564 references free inode 101070010
        clearing inode number in entry at offset 2552...
entry "kwinrulesrcbhc578.new" at block 0 offset 3224 in directory inode 103698564 references free inode 101070015
        clearing inode number in entry at offset 3224...
entry "Module1.xba" in shortform directory 40112359 references free inode 40112354
junking entry "Module1.xba" in directory inode 40112359
entry "script.xlb" in shortform directory 40112359 references free inode 40112137
junking entry "script.xlb" in directory inode 40112359
Phase 5 - rebuild AG headers and trees...
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
bad hash table for directory inode 103698564 (no data entry): rebuilding
rebuilding directory inode 103698564
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
disconnected inode 101070007, moving to lost+found
Phase 7 - verify and correct link counts...
cache_purge: shake on cache 0x1bc6030 left 1 nodes!?
done
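
For reference, the tempfile/sync/rename sequence mentioned in the quoted
blog post can be sketched roughly as below. This is only an illustration in
Python, not the KDE code: atomic_write is a hypothetical helper name, the
temporary file naming is arbitrary, and the final fsync on the directory is
an assumption about what makes the rename itself durable on a POSIX system
(nothing in this thread says KDE does or should do that).

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace `path` with `data` (bytes) via tempfile + fsync + rename.

    Hypothetical helper for illustration; assumes a POSIX filesystem.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file lives in the same directory as the target, so the
    # rename below never crosses a filesystem boundary.
    fd, tmp_path = tempfile.mkstemp(
        dir=directory, prefix=os.path.basename(path) + "."
    )
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            # Push the new contents to stable storage before the rename.
            os.fsync(tmp.fileno())
        # Atomically swap the new file into place over the old one.
        os.replace(tmp_path, path)
    except BaseException:
        # Leave the old file untouched and drop the temp file if it remains.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
    # Assumption: also fsync the directory so the rename is on disk.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Without the fsync before the rename, a crash can leave the new name
pointing at a file whose data never reached the disk, which matches the
kind of settings loss described earlier in the thread.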

-- 
Markus
