Hi all,
The problem I reported yesterday has occurred again, while writing to a
Samba-mounted XFS LVM volume (/fs3). This time the client was win2k under
VMware; yesterday it was a real win2k machine. I was writing files to
/fs3/scr/dc, which is now reported as empty:
root@judicator:/fs3/scr > ls -la /fs3/scr/dc
total 0
The client process on Windows was still happily writing to this folder while
I ran the ls.
Also, in the parent directory /fs3/scr/, one of my backup directories,
"chimaera_backups", is missing (I haven't even touched it for days):
root@judicator:/fs3/scr > ls -la /fs3/scr
ls: /fs3/scr/chimaera_backups: No such file or directory
total 16
drwxrwxrwt 7 root root 88 Feb 25 00:56 .
drwxr-xr-x 5 root root 46 Feb 8 18:19 ..
drwxrwx--- 11 thrawn thrawn 4096 Feb 27 13:54 dc
drwxr-xr-x 6 root root 75 Feb 17 20:06 judicator_backups
drwxrwx--- 7 thrawn thrawn 4096 Feb 26 15:50 old_dc
drwxrwsrwt 3 thrawn admin 4096 Feb 27 13:20 upload
I had this "No such file or directory" problem with reiserfs a year ago,
while writing to NFS shares.
Running xfs_repair -n after umount /fs3 gives me:
root@judicator:/ > xfs_repair -n /dev/vg02/lvol1
Phase 1 - find and verify superblock...
bad primary superblock - bad magic number !!!
attempting to find secondary superblock...
...........................................................................................................
I aborted this and rebooted.
Now xfs_repair -n shows no errors at all, and a full repair run gives:
root@judicator:/ > xfs_repair /dev/vg02/lvol1
xfs_repair: warning - cannot set blocksize on block device /dev/vg02/lvol1: Invalid argument
Phase 1 - find and verify superblock...
Phase 2 - using internal log
- zero log...
- scan filesystem freespace and inode maps...
- found root inode chunk
Phase 3 - for each AG...
- scan and clear agi unlinked lists...
- process known inodes and perform inode discovery...
- agno = 0
- agno = 1
- agno = 2
- agno = 3
- agno = 4
- agno = 5
- agno = 6
- agno = 7
- agno = 8
- agno = 9
- agno = 10
- agno = 11
- agno = 12
- agno = 13
- agno = 14
- agno = 15
- agno = 16
- agno = 17
- agno = 18
- agno = 19
- agno = 20
- agno = 21
- agno = 22
- agno = 23
- agno = 24
- agno = 25
- agno = 26
- agno = 27
- agno = 28
- agno = 29
- agno = 30
- agno = 31
- agno = 32
- agno = 33
- agno = 34
- agno = 35
- agno = 36
- agno = 37
- agno = 38
- process newly discovered inodes...
Phase 4 - check for duplicate blocks...
- setting up duplicate extent list...
- clear lost+found (if it exists) ...
- clearing existing "lost+found" inode
- deleting existing "lost+found" entry
- check for inodes claiming duplicate blocks...
- agno = 0
- agno = 1
- agno = 2
- agno = 3
- agno = 4
- agno = 5
- agno = 6
- agno = 7
- agno = 8
- agno = 9
- agno = 10
- agno = 11
- agno = 12
- agno = 13
- agno = 14
- agno = 15
- agno = 16
- agno = 17
- agno = 18
- agno = 19
- agno = 20
- agno = 21
- agno = 22
- agno = 23
- agno = 24
- agno = 25
- agno = 26
- agno = 27
- agno = 28
- agno = 29
- agno = 30
- agno = 31
- agno = 32
- agno = 33
- agno = 34
- agno = 35
- agno = 36
- agno = 37
- agno = 38
Phase 5 - rebuild AG headers and trees...
- reset superblock...
Phase 6 - check inode connectivity...
- resetting contents of realtime bitmap and summary inodes
- ensuring existence of lost+found directory
- traversing filesystem starting at / ...
- traversal finished ...
- traversing all unattached subtrees ...
- traversals finished ...
- moving disconnected inodes to lost+found ...
disconnected dir inode 18376471, moving to lost+found
disconnected dir inode 18800263, moving to lost+found
disconnected dir inode 117385730, moving to lost+found
disconnected dir inode 275899789, moving to lost+found
disconnected dir inode 310212310, moving to lost+found
disconnected dir inode 402676446, moving to lost+found
disconnected dir inode 506154388, moving to lost+found
Phase 7 - verify and correct link counts...
done
I get some empty folders in lost+found, plus some files that came from
completely different parts of the directory structure (not /fs3/scr/, where I
was writing).
Apart from those files, nothing in /fs3/scr seems to be missing, but the
files that were written last seem to be corrupted.
I'm afraid it is still the "ancient" RH-2.4.9-13 kernel; I haven't had time to
upgrade (BTW, oss.sgi.com seems to be down for me).
The setup is the same as yesterday:
Athlon 900, 512 MB
mount options: /dev/vg02/lvol1 /fs3 xfs defaults,noatime,nodiratime,logbufs=4 1 2
vg02/lvol1 is a 2-disk stripe set using two Maxtor 80 GB IDE drives
/var/log/messages did not show any XFS-related information.
Any clues would be highly appreciated.
Thanks
-simon pabst