xfs_repair of critical volume

Eli Morris ermorris at ucsc.edu
Fri Dec 3 18:43:08 CST 2010


On Dec 2, 2010, at 3:33 AM, Michael Monnerie wrote:

> On Dienstag, 30. November 2010 Eli Morris wrote:
>> Thanks for your help with this. I wrote the program and ran it
>> through and it looks like we have been able to preserve 44 TB of valid
>> data, while removing the corrupted files, which is a great result,
>> considering the circumstances. 
> 
> Eli, could you post the relevant program here so others can use it if 
> needed? There are requests from time to time, and it would be good if 
> such a program were available (I'm sure you'd have been happy if it 
> had already existed when you needed it).
> 
> Thanks, and wow: what an amazing filesystem that can recover from such an event!
> 
> -- 
> Kind regards,
> Michael Monnerie, Ing. BSc
> 
> it-management Internet Services: Protéger
> http://proteger.at [pronounced: Prot-e-schee]
> Tel: +43 660 / 415 6531
> 
> // ****** Radio interview on the topic of spam ******
> // http://www.it-podcast.at/archiv.html#podcast-100716
> // 
> // House for sale: http://zmi.at/langegg/


Good idea, here is the program:

Eli

#!/bin/bash
# 
#    Copyright 2010 Eli Morris, Travis O'Brien, University of California 
# 
#    remove_bad.sh is free software: you can redistribute it under the  terms
#    of the GNU General Public License as published by the Free Software
#    Foundation, either version 3 of the License, or (at your option) any later
#    version. 
# 
#    This program is distributed in the hope that it will be useful, but
#    WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
#    or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
#    for more details. 
# 
#    You should have received a copy of the GNU General Public License along
#    with this program.  If not, see <http://www.gnu.org/licenses/>. 
#
#
#remove_bad.sh: A script to determine whether any part of a file falls within a
#range of blocks (given by arguments 1 and 2, in the 512-byte block units
#that xfs_bmap reports).  This script was originally written to find files
#on a file system that exist(ed) on a corrupt section of the file system.
#It generates a list of files that are potentially bad, so that they can be
#removed by another script.
#

#Check command line arguments; grab arguments 1 and 2
if [ $# -eq 2 ]; then
	BAD_BLOCK_BEGINNING=$1
	BAD_BLOCK_END=$2
	echo "bad block beginning $BAD_BLOCK_BEGINNING"
	echo "bad block ending $BAD_BLOCK_END"
#if there aren't exactly 2 arguments then print the usage to the user
else
	echo "usage: remove_bad.sh beginning_block ending_block"
	exit 1
fi

#remove file from last run
if ( test -e "./naughty_list.txt") 
then
	echo "removing the previous naughty list"
	rm "./naughty_list.txt"
fi

IFS=$'\n' #set the field separator to the newline character
ALL_FILES=(`find /export/vol5 -type f`) #A list of all files on the volume, SUBSTITUTE NAME OF YOUR VOLUME
NUM_FILES=${#ALL_FILES[@]} #The number of files on the volume
echo "number of files is $NUM_FILES" #Report the number of files to the user

# for each of the files in vol5
for (( COUNT=0; COUNT<$NUM_FILES; COUNT++))
do
    	#Report which file is being worked on
	echo "file number: $COUNT is ${ALL_FILES[$COUNT]}"

	# report number of files to go
	FILES_TO_GO=$((NUM_FILES-COUNT))
	echo "files left: $FILES_TO_GO" 

    	#Run xfs_bmap to get the blocks that the file lives within
	OUTPUT=(`xfs_bmap "${ALL_FILES[$COUNT]}"`) #quoted so glob characters in a file name are not expanded
	# output looks like this
	# vol5dump:
	# 0: [0..1053271]: 5200578944..5201632215

	BAD_FILE=0 #Initialize the bad file flag
	NUM_LINES=${#OUTPUT[@]} #The number of lines from xfs_bmap

	# echo "number of lines for file: $NUM_LINES" #Report the number of lines to the user
    	#Loop through each line
	for (( LINE=1; LINE < $NUM_LINES; LINE++))
	do
		# echo "line number $LINE: output: ${OUTPUT[$LINE]}" #Report the current working line

		# get the block range from the line
		BLOCKS=`echo ${OUTPUT[$LINE]} | cut -d':' -f3`

       	 	#Report the number of blocks occupied
		# echo "blocks after cut: '$BLOCKS'" 
        	#Use cut to get the first and last block for the file
		FIRST_BLOCK=`echo $BLOCKS | cut -d'.' -f1` 
		LAST_BLOCK=`echo $BLOCKS | cut -d'.' -f3`
		
        	#Report these to the user
		# echo "beginning block: $FIRST_BLOCK"
		# echo "ending block: $LAST_BLOCK"

		#Note: 'hole' marks a range of the file that has no blocks allocated on disk
		#(a sparse region), so there is nothing to compare against the bad range.
		if [ "$BLOCKS" != " hole" ]; then  #Don't deal with lines that report 'hole'
			# compare to bad block region
			#Check whether the extent overlaps the user-given block range at all.
			#A single overlap test also catches an extent that starts before the bad
			#range and ends after it, which a check of the two end points alone misses.
			#If any of the blocks overlap, mark this file as bad.

		  	if (( FIRST_BLOCK <= BAD_BLOCK_END && LAST_BLOCK >= BAD_BLOCK_BEGINNING )); then
				  # echo "extent overlaps the bad range"
				  BAD_FILE=1
				  break
		  	fi
		fi
	done
	# add the file to the list of bad files
	if (($BAD_FILE == 1)); then
                #Report to the user that the current file is bad
		echo "putting file: ${ALL_FILES[$COUNT]} on the naughty list"
                #Write the file's name to the list
		echo "${ALL_FILES[$COUNT]}" >> naughty_list.txt
	fi
done
echo "program_ended_succesfully" >> naughty_list.txt
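
For anyone adapting the script, here is a rough sketch of the two steps at its core, pulled out into functions so they can be tried without a damaged volume: getting the block range out of one line of xfs_bmap output, and testing that range against the bad region. The function names parse_extent and extents_overlap are made up for this example and are not part of remove_bad.sh as posted. The sketch assumes the default (non-verbose) xfs_bmap output format shown in the script's comments, and the bad range in the example call is invented.

```shell
#!/bin/bash
# Sketch only: parse_extent and extents_overlap are hypothetical helper names.

# Print "FIRST LAST" for an xfs_bmap extent line such as
#   0: [0..1053271]: 5200578944..5201632215
# Return 1 for anything else (the file-name header line or a 'hole' line).
parse_extent() {
	local line=$1
	# a colon, optional whitespace, then "digits..digits" at the end of the line
	local re=':[[:space:]]*([0-9]+)\.\.([0-9]+)[[:space:]]*$'
	if [[ $line =~ $re ]]; then
		echo "${BASH_REMATCH[1]} ${BASH_REMATCH[2]}"
	else
		return 1
	fi
}

# Succeed (exit status 0) when the extent first..last shares at least one block
# with bad_begin..bad_end, including when the extent spans the whole bad range.
extents_overlap() {
	local first=$1 last=$2 bad_begin=$3 bad_end=$4
	(( first <= bad_end && last >= bad_begin ))
}

# Example: the sample line from the script's comments against an invented bad range.
if range=$(parse_extent "0: [0..1053271]: 5200578944..5201632215"); then
	read -r first last <<< "$range"
	if extents_overlap "$first" "$last" 5200000000 5200600000; then
		echo "extent $first..$last overlaps the bad range"
	fi
fi
```

Matching with a regular expression instead of cut means header and 'hole' lines are rejected in one place, and the single overlap comparison covers partial overlap at either end as well as an extent that encloses the bad range entirely.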
