On Friday, 23 June 2006 09:12, you wrote:
> Hello,
>
> sorry for posting my last mail twice. I thought I had entered the wrong
> mailing list address (xfs@xxxxxxxxxxx) the first time, as I had seen a
> mailing list mail with linux-xfs@xxxxxxxxxxx instead.
Hello again,
not that I am too keen on replying to myself, but the story has a new
chapter:
I had another XFS corruption. I was running 2.6.17.1 and rebooted into
2.6.16.11 once to check whether the "dma_intr" message I mentioned in
my last mail also shows up in the log with that kernel (see
http://bugzilla.kernel.org/show_bug.cgi?id=6737). Write caches were
disabled the whole time. That kernel bug report also contains all the
relevant kernel configurations.
This time I had no kernel crash, no suspend failure, nothing at all, except:
When I rebooted into 2.6.17.1, KMail and KWallet crashed as KDE was
closing. As I had seen that before after a filesystem crash, I booted
into SUSE 10.1 and checked the Debian root partition.
It had errors again:
deepdance:~ # xfs_check /dev/hda5
bad free block nvalid/nused 4/-1 for dir ino 1747843 block 16777216
missing free index for data block 0 in dir ino 1747843
missing free index for data block 1 in dir ino 1747843
missing free index for data block 2 in dir ino 1747843
missing free index for data block 3 in dir ino 1747843
bad free block nvalid/nused 7/-1 for dir ino 5012689 block 16777216
missing free index for data block 0 in dir ino 5012689
missing free index for data block 1 in dir ino 5012689
missing free index for data block 2 in dir ino 5012689
missing free index for data block 3 in dir ino 5012689
missing free index for data block 4 in dir ino 5012689
missing free index for data block 5 in dir ino 5012689
missing free index for data block 6 in dir ino 5012689
bad free block nvalid/nused 8/-1 for dir ino 30448144 block 16777216
missing free index for data block 0 in dir ino 30448144
missing free index for data block 1 in dir ino 30448144
missing free index for data block 2 in dir ino 30448144
missing free index for data block 3 in dir ino 30448144
missing free index for data block 4 in dir ino 30448144
missing free index for data block 5 in dir ino 30448144
missing free index for data block 6 in dir ino 30448144
missing free index for data block 7 in dir ino 30448144
bad free block nvalid/nused 21/-1 for dir ino 33641428 block 16777216
missing free index for data block 0 in dir ino 33641428
missing free index for data block 1 in dir ino 33641428
missing free index for data block 2 in dir ino 33641428
missing free index for data block 3 in dir ino 33641428
missing free index for data block 4 in dir ino 33641428
missing free index for data block 5 in dir ino 33641428
missing free index for data block 6 in dir ino 33641428
missing free index for data block 7 in dir ino 33641428
missing free index for data block 8 in dir ino 33641428
missing free index for data block 9 in dir ino 33641428
missing free index for data block 10 in dir ino 33641428
missing free index for data block 11 in dir ino 33641428
missing free index for data block 12 in dir ino 33641428
missing free index for data block 13 in dir ino 33641428
missing free index for data block 14 in dir ino 33641428
missing free index for data block 15 in dir ino 33641428
missing free index for data block 16 in dir ino 33641428
missing free index for data block 17 in dir ino 33641428
missing free index for data block 18 in dir ino 33641428
missing free index for data block 19 in dir ino 33641428
missing free index for data block 20 in dir ino 33641428
bad free block nvalid/nused 26/-1 for dir ino 42681258 block 16777216
missing free index for data block 0 in dir ino 42681258
missing free index for data block 1 in dir ino 42681258
missing free index for data block 2 in dir ino 42681258
missing free index for data block 3 in dir ino 42681258
missing free index for data block 4 in dir ino 42681258
missing free index for data block 5 in dir ino 42681258
missing free index for data block 6 in dir ino 42681258
missing free index for data block 7 in dir ino 42681258
missing free index for data block 8 in dir ino 42681258
missing free index for data block 9 in dir ino 42681258
missing free index for data block 10 in dir ino 42681258
missing free index for data block 11 in dir ino 42681258
missing free index for data block 12 in dir ino 42681258
missing free index for data block 13 in dir ino 42681258
missing free index for data block 14 in dir ino 42681258
missing free index for data block 15 in dir ino 42681258
missing free index for data block 19 in dir ino 42681258
missing free index for data block 22 in dir ino 42681258
missing free index for data block 23 in dir ino 42681258
missing free index for data block 24 in dir ino 42681258
missing free index for data block 25 in dir ino 42681258
bad free block nvalid/nused 25/-1 for dir ino 46142796 block 16777216
missing free index for data block 0 in dir ino 46142796
missing free index for data block 1 in dir ino 46142796
missing free index for data block 2 in dir ino 46142796
missing free index for data block 3 in dir ino 46142796
missing free index for data block 4 in dir ino 46142796
missing free index for data block 5 in dir ino 46142796
missing free index for data block 6 in dir ino 46142796
missing free index for data block 7 in dir ino 46142796
missing free index for data block 8 in dir ino 46142796
missing free index for data block 9 in dir ino 46142796
missing free index for data block 10 in dir ino 46142796
missing free index for data block 11 in dir ino 46142796
missing free index for data block 12 in dir ino 46142796
missing free index for data block 13 in dir ino 46142796
missing free index for data block 14 in dir ino 46142796
missing free index for data block 15 in dir ino 46142796
missing free index for data block 16 in dir ino 46142796
missing free index for data block 17 in dir ino 46142796
missing free index for data block 18 in dir ino 46142796
missing free index for data block 19 in dir ino 46142796
missing free index for data block 20 in dir ino 46142796
missing free index for data block 21 in dir ino 46142796
missing free index for data block 22 in dir ino 46142796
missing free index for data block 23 in dir ino 46142796
missing free index for data block 24 in dir ino 46142796
bad free block nvalid/nused 65/-1 for dir ino 55176185 block 16777216
missing free index for data block 0 in dir ino 55176185
missing free index for data block 1 in dir ino 55176185
missing free index for data block 2 in dir ino 55176185
missing free index for data block 3 in dir ino 55176185
missing free index for data block 4 in dir ino 55176185
missing free index for data block 5 in dir ino 55176185
missing free index for data block 6 in dir ino 55176185
missing free index for data block 7 in dir ino 55176185
missing free index for data block 8 in dir ino 55176185
missing free index for data block 9 in dir ino 55176185
missing free index for data block 10 in dir ino 55176185
missing free index for data block 11 in dir ino 55176185
missing free index for data block 12 in dir ino 55176185
missing free index for data block 13 in dir ino 55176185
missing free index for data block 14 in dir ino 55176185
missing free index for data block 15 in dir ino 55176185
missing free index for data block 16 in dir ino 55176185
missing free index for data block 17 in dir ino 55176185
missing free index for data block 18 in dir ino 55176185
missing free index for data block 19 in dir ino 55176185
missing free index for data block 20 in dir ino 55176185
missing free index for data block 21 in dir ino 55176185
missing free index for data block 22 in dir ino 55176185
missing free index for data block 23 in dir ino 55176185
missing free index for data block 24 in dir ino 55176185
missing free index for data block 25 in dir ino 55176185
missing free index for data block 26 in dir ino 55176185
missing free index for data block 27 in dir ino 55176185
missing free index for data block 28 in dir ino 55176185
missing free index for data block 29 in dir ino 55176185
missing free index for data block 30 in dir ino 55176185
missing free index for data block 31 in dir ino 55176185
missing free index for data block 32 in dir ino 55176185
missing free index for data block 33 in dir ino 55176185
missing free index for data block 34 in dir ino 55176185
missing free index for data block 35 in dir ino 55176185
missing free index for data block 36 in dir ino 55176185
missing free index for data block 37 in dir ino 55176185
missing free index for data block 38 in dir ino 55176185
missing free index for data block 39 in dir ino 55176185
missing free index for data block 40 in dir ino 55176185
missing free index for data block 41 in dir ino 55176185
missing free index for data block 42 in dir ino 55176185
missing free index for data block 43 in dir ino 55176185
missing free index for data block 44 in dir ino 55176185
missing free index for data block 45 in dir ino 55176185
missing free index for data block 46 in dir ino 55176185
missing free index for data block 47 in dir ino 55176185
missing free index for data block 48 in dir ino 55176185
missing free index for data block 49 in dir ino 55176185
missing free index for data block 50 in dir ino 55176185
missing free index for data block 51 in dir ino 55176185
missing free index for data block 52 in dir ino 55176185
missing free index for data block 53 in dir ino 55176185
missing free index for data block 54 in dir ino 55176185
missing free index for data block 56 in dir ino 55176185
missing free index for data block 58 in dir ino 55176185
missing free index for data block 60 in dir ino 55176185
missing free index for data block 63 in dir ino 55176185
missing free index for data block 64 in dir ino 55176185
bad free block nvalid/nused 5/-1 for dir ino 59806790 block 16777216
missing free index for data block 0 in dir ino 59806790
missing free index for data block 1 in dir ino 59806790
missing free index for data block 2 in dir ino 59806790
missing free index for data block 3 in dir ino 59806790
missing free index for data block 4 in dir ino 59806790
bad free block nvalid/nused 21/-1 for dir ino 62915542 block 16777216
missing free index for data block 0 in dir ino 62915542
missing free index for data block 1 in dir ino 62915542
missing free index for data block 2 in dir ino 62915542
missing free index for data block 3 in dir ino 62915542
missing free index for data block 4 in dir ino 62915542
missing free index for data block 5 in dir ino 62915542
missing free index for data block 6 in dir ino 62915542
missing free index for data block 7 in dir ino 62915542
missing free index for data block 8 in dir ino 62915542
missing free index for data block 9 in dir ino 62915542
missing free index for data block 10 in dir ino 62915542
missing free index for data block 11 in dir ino 62915542
missing free index for data block 12 in dir ino 62915542
missing free index for data block 13 in dir ino 62915542
missing free index for data block 14 in dir ino 62915542
missing free index for data block 15 in dir ino 62915542
missing free index for data block 16 in dir ino 62915542
missing free index for data block 17 in dir ino 62915542
missing free index for data block 18 in dir ino 62915542
missing free index for data block 19 in dir ino 62915542
missing free index for data block 20 in dir ino 62915542
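The xfs_check output above is long but regular, so it can be condensed to one line per directory. The following is a minimal Python sketch, not part of xfsprogs: the function name summarize_xfs_check is made up for illustration, and the regular expressions only cover the two message forms that appear in the log above.

```python
import re
from collections import defaultdict

# Patterns for the two xfs_check message forms seen in the log above.
BAD_FREE = re.compile(
    r"bad free block nvalid/nused (\d+)/(-?\d+) for dir ino (\d+) block (\d+)"
)
MISSING = re.compile(r"missing free index for data block (\d+) in dir ino (\d+)")


def summarize_xfs_check(text):
    """Return {inode: {"nvalid", "nused", "missing"}} per flagged directory."""
    summary = defaultdict(lambda: {"nvalid": None, "nused": None, "missing": []})
    for line in text.splitlines():
        m = BAD_FREE.search(line)
        if m:
            nvalid, nused, ino, _block = m.groups()
            summary[int(ino)]["nvalid"] = int(nvalid)
            summary[int(ino)]["nused"] = int(nused)
            continue
        m = MISSING.search(line)
        if m:
            block, ino = m.groups()
            summary[int(ino)]["missing"].append(int(block))
    return dict(summary)


# First directory from the log above, used as a small sample.
SAMPLE = """\
bad free block nvalid/nused 4/-1 for dir ino 1747843 block 16777216
missing free index for data block 0 in dir ino 1747843
missing free index for data block 1 in dir ino 1747843
missing free index for data block 2 in dir ino 1747843
missing free index for data block 3 in dir ino 1747843
"""

for ino, info in sorted(summarize_xfs_check(SAMPLE).items()):
    print(f"ino {ino}: nvalid={info['nvalid']} nused={info['nused']} "
          f"missing={len(info['missing'])}")
```

Fed the complete log, such a summary makes it easier to spot directories where the number of missing free indexes differs from nvalid. In the output above, for example, ino 42681258 lists 21 missing indexes against nvalid 26, and ino 55176185 lists 60 against nvalid 65.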
It seems that xfs_repair was able to repair it without loss; lost+found
was empty after the repair:
deepdance:/mnt # xfs_repair /dev/hda5
Phase 1 - find and verify superblock...
Phase 2 - using internal log
- zero log...
- scan filesystem freespace and inode maps...
- found root inode chunk
Phase 3 - for each AG...
- scan and clear agi unlinked lists...
- process known inodes and perform inode discovery...
- agno = 0
- agno = 1
- agno = 2
- agno = 3
- agno = 4
- agno = 5
- agno = 6
- agno = 7
- agno = 8
- agno = 9
- agno = 10
- agno = 11
- agno = 12
- agno = 13
- agno = 14
- agno = 15
- process newly discovered inodes...
Phase 4 - check for duplicate blocks...
- setting up duplicate extent list...
- clear lost+found (if it exists) ...
- check for inodes claiming duplicate blocks...
- agno = 0
- agno = 1
- agno = 2
- agno = 3
- agno = 4
- agno = 5
- agno = 6
- agno = 7
- agno = 8
- agno = 9
- agno = 10
- agno = 11
- agno = 12
- agno = 13
- agno = 14
- agno = 15
Phase 5 - rebuild AG headers and trees...
- reset superblock...
Phase 6 - check inode connectivity...
- resetting contents of realtime bitmap and summary inodes
- ensuring existence of lost+found directory
- traversing filesystem starting at / ...
free block 16777216 for directory inode 1747843 bad nused
rebuilding directory inode 1747843
free block 16777216 for directory inode 30448144 bad nused
rebuilding directory inode 30448144
free block 16777216 for directory inode 59806790 bad nused
rebuilding directory inode 59806790
free block 16777216 for directory inode 55176185 bad nused
rebuilding directory inode 55176185
free block 16777216 for directory inode 5012689 bad nused
rebuilding directory inode 5012689
free block 16777216 for directory inode 42681258 bad nused
rebuilding directory inode 42681258
free block 16777216 for directory inode 46142796 bad nused
rebuilding directory inode 46142796
free block 16777216 for directory inode 33641428 bad nused
rebuilding directory inode 33641428
free block 16777216 for directory inode 62915542 bad nused
rebuilding directory inode 62915542
- traversal finished ...
- traversing all unattached subtrees ...
- traversals finished ...
- moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
done
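One quick consistency check is whether xfs_repair rebuilt exactly the directories that xfs_check complained about. The following Python sketch assumes both outputs are available as text; the function name compare_inodes is hypothetical, and the two patterns are taken from the message forms shown in the logs above.

```python
import re

# Message forms copied from the xfs_check and xfs_repair logs above.
FLAGGED = re.compile(r"bad free block nvalid/nused \S+ for dir ino (\d+)")
REBUILT = re.compile(r"rebuilding directory inode (\d+)")


def compare_inodes(check_text, repair_text):
    """Return (flagged_only, rebuilt_only) as sorted lists of inode numbers."""
    flagged = {int(n) for n in FLAGGED.findall(check_text)}
    rebuilt = {int(n) for n in REBUILT.findall(repair_text)}
    return sorted(flagged - rebuilt), sorted(rebuilt - flagged)


# Shortened samples: two directories from each log, in their original order.
CHECK_SAMPLE = """\
bad free block nvalid/nused 4/-1 for dir ino 1747843 block 16777216
bad free block nvalid/nused 8/-1 for dir ino 30448144 block 16777216
"""
REPAIR_SAMPLE = """\
free block 16777216 for directory inode 30448144 bad nused
rebuilding directory inode 30448144
free block 16777216 for directory inode 1747843 bad nused
rebuilding directory inode 1747843
"""

flagged_only, rebuilt_only = compare_inodes(CHECK_SAMPLE, REPAIR_SAMPLE)
print("flagged but not rebuilt:", flagged_only)
print("rebuilt but not flagged:", rebuilt_only)
```

For the two logs in this mail both lists come out empty: the nine directory inodes flagged by xfs_check are the same nine that xfs_repair rebuilt, only in a different order.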
I reconsidered my conclusion that a hardware failure is unlikely and
tested the partition:
-------------------------------------------------------
deepdance:~ #
badblocks -s -v -n -o /home/martin/XFS-Probleme/badblocks.txt /dev/hda5
Suche nach defekten Bloecken im zerstoerungsfreien Lesen+Schreiben-Modus
Von Block 0 bis 9767488
Suche nach defekten Bloecken (zerstoerungsfreier Lesen+Schreiben-Modus)
Teste mit zufaelligen Mustern: erledigt
Durchgang beendet, 0 defekte Bloecke gefunden.
-------------------------------------------------------
The output is in German because I forgot to change the locale. The last
line says that the pass completed with 0 bad blocks found. The badblocks
output file is zero bytes as well:
-------------------------------------------------------
martin@deepdance:~/XFS-Probleme> ls -l badblocks.txt
-rw-r--r-- 1 root root 0 2006-06-23 18:31 badblocks.txt
-------------------------------------------------------
I did a long SMART selftest using "smartctl -t long /dev/hda". It
completed without errors:
-------------------------------------------------------
deepdance:~ # smartctl -l selftest /dev/hda
smartctl version 5.33 [i686-pc-linux-gnu] Copyright (C) 2002-4 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%      5868         -
# 2  Short offline       Completed without error       00%      2950         -
# 3  Extended offline    Completed without error       00%      2944         -
# 4  Short offline       Completed without error       00%      2913         -
[... all further tests without error ...]
-------------------------------------------------------
There had been no tests for a long time due to a mistake
in /etc/smartd.conf, which I hopefully corrected today.
So the hard disk seems to be okay. The only oddity is five of these
entries in the error log (all at disk power-on lifetime 393 hours):
-------------------------------------------------------
deepdance:~ # smartctl -l error /dev/hda
[...]
Error 200 occurred at disk power-on lifetime: 393 hours (16 days + 9 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
10 59 01 73 65 6c ee Error: IDNF at LBA = 0x0e6c6573 = 241984883
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
20 03 01 73 65 6c ee 00 00:00:50.900 READ SECTOR(S)
c8 03 01 73 65 6c ee 00 00:00:50.900 READ DMA
20 03 01 73 65 6c ee 00 00:00:50.800 READ SECTOR(S)
c8 03 01 73 65 6c ee 00 00:00:50.800 READ DMA
20 03 01 73 65 6c ee 00 00:00:50.700 READ SECTOR(S)
-------------------------------------------------------
These are strange, because the device does not have that many sectors
(the reported LBA 241984883 is more than twice the disk's 117210240 sectors):
-------------------------------------------------------
deepdance:~ # LANG=C fdisk -lu /dev/hda
Disk /dev/hda: 60.0 GB, 60011642880 bytes
255 heads, 63 sectors/track, 7296 cylinders, total 117210240 sectors
Units = sectors of 1 * 512 = 512 bytes
   Device Boot      Start         End      Blocks   Id  System
/dev/hda1   *          63     9767519     4883728+   7  HPFS/NTFS
/dev/hda2        10233405    19535039     4650817+   c  W95 FAT32 (LBA)
/dev/hda3         9767520    10233404      232942+   6  FAT16
/dev/hda4        19535040   117210239    48837600    5  Extended
/dev/hda5        19535103    39070079     9767488+  83  Linux
/dev/hda6        39070143    58605119     9767488+  83  Linux
/dev/hda7        58605183    60565049      979933+  82  Linux swap / Solaris
/dev/hda8        60565113   117210239    28322563+  83  Linux
Partition table entries are not in disk order
[I know, I added the extended partition before I resized the FAT32
partition to add a FAT16 one for FreeDOS;)]
-------------------------------------------------------
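The LBA that smartctl prints can be recomputed from the register dump and compared with the sector count from fdisk. This is a minimal Python sketch, assuming 28-bit LBA addressing in which the low nibble of DH carries bits 24-27; the function name lba28_from_registers is made up for illustration.

```python
def lba28_from_registers(sn, cl, ch, dh):
    """Combine ATA task-file registers into a 28-bit LBA.

    SN holds bits 0-7, CL bits 8-15, CH bits 16-23, and the low nibble
    of DH holds bits 24-27 (assumption: LBA28 addressing).
    """
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn


# "total 117210240 sectors" from the fdisk output above.
TOTAL_SECTORS = 117210240

# Register values from the error log entry: SN=73 CL=65 CH=6c DH=ee.
lba = lba28_from_registers(0x73, 0x65, 0x6C, 0xEE)

print(hex(lba), lba)         # 0xe6c6573 241984883
print(lba >= TOTAL_SECTORS)  # True: the address lies beyond the end of the disk
```

The result matches the value smartctl reports (0x0e6c6573 = 241984883), so the out-of-range address is what the drive logged in its registers and not an artifact of how the log was decoded.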
I intend to ask on the smartmontools mailing list about those errors.
I will also run memtest86 overnight.
Any other tips for diagnosing a hardware problem? Or do the XFS errors
above hint at a software bug?
I have not filed this as a bug report yet, because I am not sure that it
is not a hardware failure. I will do so if you want me to.
I really want to track this down. I do not have the feeling
that my data is safe at the moment (well, I have a backup as of today on
an external USB drive).
If XFS gets corrupted again, I may switch that partition to ext3. If it
then crashes with ext3 too, I may be better off replacing the hard disk,
even if I cannot diagnose an error with it.
But first the memtest tonight... maybe it reveals something.
Regards,
Martin
--
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA B82F 991B EAAC A599 84C7