
Re: xfs_repair of critical volume

To: Michael Monnerie <michael.monnerie@xxxxxxxxxxxxxxxxxxx>
Subject: Re: xfs_repair of critical volume
From: Eli Morris <ermorris@xxxxxxxx>
Date: Fri, 3 Dec 2010 16:43:08 -0800
Cc: xfs@xxxxxxxxxxx, Dave Chinner <david@xxxxxxxxxxxxx>
In-reply-to: <201012021233.07213@xxxxxx>
References: <75C248E3-2C99-426E-AE7D-9EC543726796@xxxxxxxx> <20101117074708.GP22876@dastard> <BA70F631-15CB-4E52-913A-3715CC9678A0@xxxxxxxx> <201012021233.07213@xxxxxx>

On Dec 2, 2010, at 3:33 AM, Michael Monnerie wrote:

On Tuesday, 30 November 2010, Eli Morris wrote:
Thanks for your help with this. I wrote the program and ran it
through, and it looks like we have been able to preserve 44 TB of valid
data, while removing the corrupted files, which is a great result,
considering the circumstances.

Eli, could you post the relevant program here so others can use it if
needed? There are requests from time to time, and it would be good if
such a program were available (I'm sure you would have been happy if it
had already existed when you needed it).

Thanks, and wow: what an amazing filesystem, that it can recover from such an event!

Kind regards,
Michael Monnerie, Ing. BSc

it-management Internet Services: Protéger
http://proteger.at [pronounced: Prot-e-schee]
Tel: +43 660 / 415 6531

// ****** Radio interview on the topic of spam ******
// http://www.it-podcast.at/archiv.html#podcast-100716
// House for sale: http://zmi.at/langegg/

Good idea, here is the program:


#!/bin/bash
#    Copyright 2010 Eli Morris, Travis O'Brien, University of California
#    remove_bad.sh is free software: you can redistribute it under the  terms
#    of the GNU General Public License as published by the Free Software
#    Foundation, either version 3 of the License, or (at your option) any later
#    version. 
#    This program is distributed in the hope that it will be useful, but
#    WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
#    or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
#    for more details. 
#    You should have received a copy of the GNU General Public License along
#    with this program.  If not, see <http://www.gnu.org/licenses/>. 
#remove_bad.sh: A script to determine whether any part of a file falls within a
#set of blocks (indicated by arguments 1 and 2).  This script was
#originally written with the intent to find files on a file system that
#exist(ed) on a corrupt section of the file system.  It generates a list of files
#that are potentially bad, so that they can be removed by another script.

#Check command line arguments; grab arguments 1 and 2
if [ $# -eq 2 ]; then
    BAD_BLOCK_BEGINNING=$1
    BAD_BLOCK_END=$2
    echo "bad block beginning $BAD_BLOCK_BEGINNING"
    echo "bad block ending $BAD_BLOCK_END"
else
    #if there aren't exactly 2 arguments then print the usage to the user
    echo "usage: remove_bad.sh beginning_block ending_block"
    exit 1
fi

# remove file from last run
if [ -e "./naughty_list.txt" ]; then
    echo "removing the previous naughty list"
    rm "./naughty_list.txt"
fi

IFS=$'\n' #set the field separator to the newline character
ALL_FILES=(`find /export/vol5 -type f`) #A list of all files on the volume, SUBSTITUTE NAME OF YOUR VOLUME
NUM_FILES=${#ALL_FILES[@]} #The number of files on the volume
echo "number of files is $NUM_FILES" #Report the number of files to the user

# for each of the files in vol5
for (( COUNT=0; COUNT < NUM_FILES; COUNT++ ))
do
    #Report which file is being worked on
    echo "file number: $COUNT is ${ALL_FILES[$COUNT]}"

    # report number of files to go
    FILES_TO_GO=$(( NUM_FILES - COUNT ))
    echo "files left: $FILES_TO_GO"

    #Run xfs_bmap to get the blocks that the file lives within
    OUTPUT=(`xfs_bmap "${ALL_FILES[$COUNT]}"`)
    # output looks like this
    # vol5dump:
    # 0: [0..1053271]: 5200578944..5201632215

    BAD_FILE=0 #Initialize the bad file flag
    NUM_LINES=${#OUTPUT[@]} #The number of lines from xfs_bmap

    # echo "number of lines for file: $NUM_LINES" #Report the number of lines to the user
    #Loop through each line, skipping line 0 (the file name)
    for (( LINE=1; LINE < NUM_LINES; LINE++ ))
    do
        # echo "line number $LINE: output: ${OUTPUT[$LINE]}" #Report the current working line

        # get the block range from the line
        BLOCKS=`echo "${OUTPUT[$LINE]}" | cut -d':' -f3`

        #Report the block range that was extracted
        # echo "blocks after cut: '$BLOCKS'"
        #Use cut to get the first and last block of the extent
        FIRST_BLOCK=`echo "$BLOCKS" | cut -d'.' -f1`
        LAST_BLOCK=`echo "$BLOCKS" | cut -d'.' -f3`

        #Report these to the user
        # echo "beginning block: $FIRST_BLOCK"
        # echo "ending block: $LAST_BLOCK"

        #TODO: I'm not sure what exactly 'hole' means, but I get the impression that it has something
        #to do with XFS's way of avoiding file fragmentation. TAO
        if [ "$BLOCKS" != " hole" ]; then  #Don't deal with lines that report 'hole'
            # compare to bad block region
            #For now, check whether the blocks for the file fall within the user-given block range
            #if any of the blocks do, then mark this file as bad.

            if (( BAD_BLOCK_BEGINNING <= FIRST_BLOCK && FIRST_BLOCK <= BAD_BLOCK_END )); then
                # echo "hit first criterion"
                BAD_FILE=1
            elif (( BAD_BLOCK_BEGINNING <= LAST_BLOCK && LAST_BLOCK <= BAD_BLOCK_END )); then
                # echo "hit second criterion"
                BAD_FILE=1
            fi
        fi
    done

    # add the file to the list of bad files
    if (( BAD_FILE == 1 )); then
        #Report to the user that the current file is bad
        echo "putting file: ${ALL_FILES[$COUNT]} on the naughty list"
        #Write the file's name to the list
        echo "${ALL_FILES[$COUNT]}" >> naughty_list.txt
    fi
done

echo "program_ended_succesfully" >> naughty_list.txt
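
A note on the range test, offered as a sketch rather than a correction of what was run: the two comparisons in the script flag an extent when its first or its last block lies inside the bad region. As far as I can tell, an extent that starts before the bad region and ends after it would pass both checks unflagged. A general interval-overlap test covers that case as well. The helper names parse_extent and ranges_overlap below are hypothetical and are not part of remove_bad.sh; the sample line is the one from the script's comments, and the bad-region numbers in the usage lines are made up for illustration.

```shell
# Sketch only: parse_extent and ranges_overlap are hypothetical helpers.

# Print "FIRST LAST" for one xfs_bmap extent line such as
#   0: [0..1053271]: 5200578944..5201632215
# Return 1 for anything else, e.g. the file-name header line or a
# "hole" line (a range of the file with no blocks allocated).
parse_extent() {
    local blocks first last
    blocks=${1##*: }                 # text after the last ": "
    case $blocks in
        ''|*[!0-9.]*) return 1 ;;    # empty, "hole", or a header line
        *..*) ;;                     # looks like FIRST..LAST
        *) return 1 ;;
    esac
    first=${blocks%%..*}
    last=${blocks##*..}
    [ -n "$first" ] && [ -n "$last" ] || return 1
    printf '%s %s\n' "$first" "$last"
}

# Succeed when the closed ranges [$1, $2] and [$3, $4] share at least
# one block. This also catches a range that fully encloses the other.
ranges_overlap() {
    [ "$1" -le "$4" ] && [ "$3" -le "$2" ]
}

# Usage with the sample line and an illustrative (made-up) bad region.
line="        0: [0..1053271]: 5200578944..5201632215"
if extent=$(parse_extent "$line"); then
    first=${extent% *}
    last=${extent#* }
    if ranges_overlap "$first" "$last" 5200000000 5200600000; then
        echo "extent $first..$last overlaps the bad region"
    fi
fi
```

With a test like this, the two if/elif comparisons collapse into a single call per extent, and hole lines are skipped by the parser instead of by a string comparison.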
