[ATTN: Sending a copy to the mailing list because Seth Mos was asking for
the shutdown error messages]
Eric,
thanks for your patience with this issue -- I had to recreate an XFS filesystem
and copy the old RAID over, which took some time...
On Thu, 27 Dec 2001 12:22:43 -0600, Eric Sandeen wrote:
>> >Furthermore, two hardware RAIDs:
>> >
>> > /dev/sdc5 343G 111G 232G 32% /mnt/raid
>> > /dev/sdb5 152G 143G 323M 100% /mnt/raid2
>> >
>> > /dev/sdc5 on /mnt/raid type xfs (rw,usrquota)
>> > /dev/sdb5 on /mnt/raid2 type ext2 (rw)
>> >
>> >I'm copying from sdb5 to sdc5 like this:
>> >
>> > /mnt/raid# cp -ax /mnt/raid2/* .
>> >
>> >(No, I'm not missing hidden files/directories in /mnt/raid since there aren't
>> >any.)
>> >
>> >When I invoke "ls -lahR /mnt/raid" I get TONS of messages like the following:
>> >
>> > ls: /mnt/raid/daten/cd/Atlas/PROG.MOV/SHAHOT.CST: No such file or directory
>> > ls: /mnt/raid/daten/cd/Basf Beams Screws Snaps 6.0/Deutsch/Snaps/Install: No such file or directory
>> > ls: /mnt/raid/daten/cd/Good Fellow/Deutsch/GY/INDEX/WORK: No such file or directory
>> > ls: /mnt/raid/daten/cd/K2001/Install/K2001/Deutsch/Bilder: No such file or directory
>> >
>> >The strange thing is that the files/directories ARE PRESENT on the source
>> >filesystem, but I can't see them on the destination filesystem using "ls."
>>
>> To make sure it's not a hardware-related problem I've re-mkfs'ed the partition
>> with an ext2 filesystem and copied the whole raid again, this time from ext2
>> to ext2. Guess what happened? Yup, NO ERRORS AT ALL.
>>
>> Soooooo, the problem clearly seems XFS-related.
>
>I forgot... does "xfs_repair -n" (after your copy operation) show any
>problems?
This time the copy operation could NOT be completed without the XFS filesystem
being shut down:
Dec 29 02:45:40 Fileserver kernel: xfs_force_shutdown(sd(8,37),0x8) called from line 1020 of file xfs_trans.c. Return address = 0xe094c27a
Dec 29 02:45:40 Fileserver kernel: Corruption of in-memory data detected. Shutting down filesystem: sd(8,37)
Dec 29 02:45:40 Fileserver kernel: Please umount the filesystem, and rectify the problem(s)
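A side note on the device name in these messages: sd(8,37) is a major/minor
pair. Assuming the classic Linux SCSI disk numbering (major 8, 16 minors per
disk, minor 0 of each group being the whole disk), that should be /dev/sdc5,
i.e. the XFS target. A minimal Python sketch of the mapping; the function name
is made up for illustration:

```python
import string


def sd_device_name(major, minor):
    """Map a classic SCSI disk major/minor pair to its /dev/sdXN name.

    Assumption: major 8 with 16 minors per disk (sda..sdp), where
    partition number 0 means the whole disk.
    """
    if major != 8:
        raise ValueError("this sketch only handles SCSI disk major 8")
    if not 0 <= minor <= 255:
        raise ValueError("minor must be in the range 0..255")
    disk, part = divmod(minor, 16)
    name = "/dev/sd" + string.ascii_lowercase[disk]
    # Partition 0 is the whole disk, so no suffix is appended for it.
    return name + (str(part) if part else "")


print(sd_device_name(8, 37))
```

So the shutdown hit the copy target, not the ext2 source on /dev/sdb5.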
This is what xfs_repair gives me:
Fileserver:~# xfs_repair -n /dev/sdc5
Phase 1 - find and verify superblock...
bad primary superblock - bad magic number !!!
attempting to find secondary superblock...
..............................................................
[...]
found candidate secondary superblock...
verified secondary superblock...
would write modified primary superblock
primary superblock would have been modified.
cannot proceed further in no_modify mode.
exiting now.
Next try (it's a scratch FS anyway):
Fileserver:~# xfs_repair /dev/sdc5
Phase 1 - find and verify superblock...
bad primary superblock - bad magic number !!!
attempting to find secondary superblock...
...............................................................
[...]
found candidate secondary superblock...
verified secondary superblock...
writing modified primary superblock
sb root inode value 18446744073709551615 inconsistent with calculated value 13835052420285071488
resetting superblock root inode pointer to 18446744069414584448
sb realtime bitmap inode 18446744073709551615 inconsistent with calculated value 13835052420285071489
resetting superblock realtime bitmap ino pointer to 18446744069414584449
sb realtime summary inode 18446744073709551615 inconsistent with calculated value 13835052420285071490
resetting superblock realtime summary ino pointer to 18446744069414584450
Phase 2 - using internal log
- zero log...
ERROR: The filesystem has valuable metadata changes in a log which needs to
be replayed. Mount the filesystem to replay the log, and unmount it before
re-running xfs_repair. If you are unable to mount the filesystem, then use
the -L option to destroy the log and attempt a repair.
Note that destroying the log may cause corruption -- please attempt a mount
of the filesystem before doing this.
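The inode numbers in the xfs_repair output above are easier to read in hex.
Here is a rough Python sketch (the helper name is mine, nothing from xfsprogs)
that splits each 64-bit value into its upper and lower 32 bits. By plain
arithmetic, the value found in the superblock has all 64 bits set, while the
"resetting ... to" values have all ones in the upper 32 bits and 128, 129 and
130 in the lower 32 bits. I can't say whether that pattern means anything, but
it may be a hint for somebody who knows the code:

```python
def split64(value):
    """Return the (upper 32 bits, lower 32 bits) of an unsigned 64-bit value."""
    if not 0 <= value < 2**64:
        raise ValueError("value does not fit into 64 bits")
    return value >> 32, value & 0xFFFFFFFF


# Values copied from the xfs_repair output above.
values = {
    "sb root ino": 18446744073709551615,
    "calculated": 13835052420285071488,
    "reset root": 18446744069414584448,
    "reset rbmino": 18446744069414584449,
    "reset rsumino": 18446744069414584450,
}

for label, value in values.items():
    high, low = split64(value)
    print(f"{label:<14} 0x{value:016x}  high=0x{high:08x}  low={low}")
```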
Fileserver:~# mount /dev/sdc5
mount: wrong fs type, bad option, bad superblock on /dev/sdc5,
or too many mounted file systems
Dec 29 10:45:22 Fileserver kernel: XFS mounting filesystem sd(8,37)
Dec 29 10:45:22 Fileserver kernel: Starting XFS recovery on filesystem: sd(8,37) (dev: 8/37)
Dec 29 10:45:26 Fileserver kernel: XFS: failed to read root inode
Fileserver:~# xfs_repair -L /dev/sdc5
Phase 1 - find and verify superblock...
sb root inode value 18446744073709551615 inconsistent with calculated value 13835052282846118016
resetting superblock root inode pointer to 18446744069414584448
sb realtime bitmap inode 18446744073709551615 inconsistent with calculated value 13835052282846118017
resetting superblock realtime bitmap ino pointer to 18446744069414584449
sb realtime summary inode 18446744073709551615 inconsistent with calculated value 13835052282846118018
resetting superblock realtime summary ino pointer to 18446744069414584450
Phase 2 - using internal log
- zero log...
ALERT: The filesystem has valuable metadata changes in a log which is being
destroyed because the -L option was used.
- scan filesystem freespace and inode maps...
- found root inode chunk
Phase 3 - for each AG...
- scan and clear agi unlinked lists...
error following ag 0 unlinked list
- process known inodes and perform inode discovery...
- agno = 0
- agno = 1
bad size/format for directory 21381285
problem with directory contents in inode 21381285
cleared inode 21381285
- agno = 2
- agno = 3
- agno = 4
- agno = 5
- agno = 6
- agno = 7
- agno = 8
- agno = 9
- agno = 10
- agno = 11
- agno = 12
- agno = 13
- agno = 14
- agno = 15
bad size/format for directory 253966766
problem with directory contents in inode 253966766
cleared inode 253966766
- agno = 16
- agno = 17
- agno = 18
- agno = 19
- agno = 20
- agno = 21
- agno = 22
- agno = 23
- agno = 24
- agno = 25
- agno = 26
- agno = 27
- agno = 28
- agno = 29
- agno = 30
- agno = 31
- agno = 32
- agno = 33
mismatch between format (2) and size (51) in directory ino 553648257
cleared inode 553648257
- agno = 34
- agno = 35
- agno = 36
- agno = 37
- agno = 38
- agno = 39
- agno = 40
bad inode format in inode 676841560
bad inode format in inode 676841560
cleared inode 676841560
- agno = 41
- agno = 42
- agno = 43
- agno = 44
- agno = 45
- agno = 46
- agno = 47
- agno = 48
- agno = 49
- agno = 50
- agno = 51
- agno = 52
- agno = 53
- agno = 54
- agno = 55
- agno = 56
- agno = 57
- agno = 58
- agno = 59
- agno = 60
- agno = 61
- agno = 62
- agno = 63
- agno = 64
- agno = 65
- agno = 66
- agno = 67
- agno = 68
- agno = 69
- agno = 70
- agno = 71
- agno = 72
- agno = 73
- agno = 74
- agno = 75
- agno = 76
- agno = 77
- agno = 78
- agno = 79
- agno = 80
- agno = 81
- agno = 82
bad size/format for directory 1375731843
problem with directory contents in inode 1375731843
cleared inode 1375731843
- agno = 83
- agno = 84
- agno = 85
- process newly discovered inodes...
Phase 4 - check for duplicate blocks...
- setting up duplicate extent list...
- clear lost+found (if it exists) ...
- check for inodes claiming duplicate blocks...
- agno = 0
- agno = 1
entry "GPDRAW.LSP" in shortform directory 21381281 references free inode 21381285
junking entry "GPDRAW.LSP" in directory inode 21381281
- agno = 2
- agno = 3
- agno = 4
- agno = 5
- agno = 6
- agno = 7
- agno = 8
- agno = 9
- agno = 10
- agno = 11
- agno = 12
- agno = 13
- agno = 14
- agno = 15
entry "cmnu.icon" at block 0 offset 432 in directory inode 253931421 references free inode 253966766
clearing inode number in entry at offset 432...
- agno = 16
- agno = 17
- agno = 18
- agno = 19
- agno = 20
- agno = 21
- agno = 22
- agno = 23
- agno = 24
- agno = 25
- agno = 26
- agno = 27
- agno = 28
- agno = 29
- agno = 30
- agno = 31
- agno = 32
- agno = 33
entry "WEBPAGE.URL" in shortform directory 553648256 references free inode 553648257
junking entry "WEBPAGE.URL" in directory inode 553648256
- agno = 34
- agno = 35
- agno = 36
- agno = 37
- agno = 38
- agno = 39
- agno = 40
- agno = 41
- agno = 42
- agno = 43
- agno = 44
- agno = 45
- agno = 46
- agno = 47
- agno = 48
- agno = 49
- agno = 50
- agno = 51
- agno = 52
- agno = 53
- agno = 54
- agno = 55
- agno = 56
- agno = 57
- agno = 58
- agno = 59
- agno = 60
- agno = 61
- agno = 62
- agno = 63
- agno = 64
- agno = 65
- agno = 66
- agno = 67
- agno = 68
- agno = 69
- agno = 70
- agno = 71
- agno = 72
- agno = 73
- agno = 74
- agno = 75
- agno = 76
- agno = 77
- agno = 78
- agno = 79
- agno = 80
- agno = 81
- agno = 82
entry "NETFLX3.DLL" in shortform directory 1375731840 references free inode 1375731843
junking entry "NETFLX3.DLL" in directory inode 1375731840
- agno = 83
- agno = 84
- agno = 85
Phase 5 - rebuild AG headers and trees...
- reset superblock...
Phase 6 - check inode connectivity...
- resetting contents of realtime bitmap and summary inodes
- ensuring existence of lost+found directory
- traversing filesystem starting at / ...
rebuilding directory inode 1163381148
Fileserver:~# mount /dev/sdc5
Dec 29 10:56:18 Fileserver kernel: XFS mounting filesystem sd(8,37)
Dec 29 10:56:18 Fileserver kernel: XFS quotacheck sd(8,37): Please wait.
Dec 29 10:57:24 Fileserver kernel: XFS quotacheck sd(8,37): Done.
Fileserver:~# ls /mnt/raidNEU/
10MB.test Raid.txt admin daten home lost+found
Again, I seriously doubt it's a hardware issue because I can copy to an ext2
target filesystem without ANY problems whatsoever.
Any ideas?!
Thanks,
Ralf
--
Selling original BMW wheels: L I N U X .~.
http://adsl-bergs.rz.rwth-aachen.de/~rabe The Choice /V\
of a GNU /( )\
Generation ^^-^^