In this case it looks very much like the start of the filesystem got whacked
with a bunch of zeros somehow. The file size limit is somewhat odd:
there is nothing in xfs which will prevent a file from being extremely
large - 2^44 is about where issues would start for buffered I/O. Possibly
the size issue is an interaction between the vfs in the ac kernel
and xfs - we will run some tests on this.
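To put the numbers side by side, here is a rough sketch of the arithmetic. It assumes the 2^44 figure is in bytes, and it uses 2^31 - 1 purely as an illustration of the limit a program built without large-file support would hit; nothing in this thread confirms that this is what happened here.

```python
def human(nbytes):
    """Format a byte count using binary units (KiB, MiB, ...)."""
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB"]
    value = float(nbytes)
    for unit in units:
        if value < 1024 or unit == units[-1]:
            return f"{value:g} {unit}"
        value /= 1024


# Assumption: the 2^44 figure quoted above is a byte count.
buffered_io_ceiling = 2 ** 44

# Assumption: shown only for comparison, the largest offset that fits
# in a signed 32-bit integer (the classic non-large-file limit).
signed_32bit_offset_max = 2 ** 31 - 1

# "~6 GB" partition size from the original report.
backup_size = 6 * 10 ** 9

print("buffered I/O ceiling :", human(buffered_io_ceiling))
print("32-bit offset limit  :", human(signed_32bit_offset_max))
print("reported backup size :", human(backup_size))
print("backup exceeds 32-bit limit:", backup_size > signed_32bit_offset_max)
print("backup below 2^44          :", backup_size < buffered_io_ceiling)
```

A ~6 GB dump is far below 2^44 bytes but above a 32-bit offset limit, which is why a limit imposed somewhere outside xfs looks like the more plausible suspect.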
Your second email about xfs_repair finding everything does not tie in
with this output at all; you lost the root inode in this case.
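The inode numbers in the quoted xfs_repair output are easier to read in hex. The sketch below only splits the reported values into 32-bit halves. The helper name split64 is made up for illustration, and the reading of all-ones as a "null" inode follows the usual XFS convention rather than anything stated in the log.

```python
# Values copied from the quoted xfs_repair output.
SB_ROOT_INO = 18446744073709551615     # value found in the superblock
CALCULATED_INO = 13835049396628095104  # value xfs_repair calculated
RESET_INO = 18446744069414584448       # value xfs_repair wrote back

NULL_INO = 2 ** 64 - 1  # all bits set, conventionally "no inode"


def split64(value):
    """Return the (high, low) 32-bit halves of a 64-bit value."""
    return value >> 32, value & 0xFFFFFFFF


for name, value in [
    ("superblock root inode", SB_ROOT_INO),
    ("calculated root inode", CALCULATED_INO),
    ("reset root inode", RESET_INO),
]:
    high, low = split64(value)
    print(f"{name:22s} {value:#018x}  high={high:#010x}  low={low}")

print("superblock value is the null inode:", SB_ROOT_INO == NULL_INO)
```

The superblock value is all ones, and both the calculated and the reset values carry 128 in their low 32 bits with implausible high bits. That is only an observation about the numbers, but it fits with the geometry being derived from a damaged superblock.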
So far I have been unable to make things fall over here at all - which
is frustrating.
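For anyone else trying to reproduce this, one quick check after a failed run is whether the first bytes of the device still carry the XFS superblock magic ("XFSB") or have been zeroed. This is a minimal sketch: the function name inspect_start and the 4096-byte probe size are arbitrary choices for illustration, and it runs on in-memory buffers here rather than on a real device.

```python
import io

XFS_SB_MAGIC = b"XFSB"  # first four bytes of a valid XFS superblock


def inspect_start(stream, probe_bytes=4096):
    """Read the start of a binary stream and report on magic and zeros."""
    head = stream.read(probe_bytes)
    leading_zeros = len(head) - len(head.lstrip(b"\x00"))
    return {
        "magic_ok": head[:4] == XFS_SB_MAGIC,
        "leading_zero_bytes": leading_zeros,
        "all_zero": len(head) > 0 and leading_zeros == len(head),
    }


# Demonstration on fake images, not real devices.
healthy = io.BytesIO(XFS_SB_MAGIC + b"\x00" * 508)
zeroed = io.BytesIO(b"\x00" * 512)

print("healthy:", inspect_start(healthy))
print("zeroed :", inspect_start(zeroed))
```

On a real system the same function could be handed a device opened read-only in binary mode, for example open("/dev/hdd1", "rb"), given sufficient permissions.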
Steve
> Hi guys,
>
> I just managed to get this error and severe fs corruption without RAID,
> mongo, huge filesystem, or anything weird. (BTW I'm a bleeding edge
> kind of guy, so the fs wasn't critical and I've got backups :-). This
> was with the kernel-2.4.9-6SGI_XFS_PR1, and I'm not using any of the
> modules that had symbol problems.
>
> Initially, I was trying to xfsdump and gzip a whole filesystem to an xfs
> on another disk and I got "file size limit exceeded" and "core dumped."
> So I said, "OK so what's the max filesize?" I thought it was pretty
> high for XFS, but apparently not for Linux XFS--I was backing up a ~6 GB
> partition, so the file size had to be less than that. I didn't find any
> clues in a cursory glance at xfs.h, so I decided to test it in a
> not-so-nice way:
>
> dd if=/dev/zero of=test_size.img bs=10240k
>
> The process choked and this is what turned up in the system log:
>
> Oct 24 07:47:08 localhost kernel: xfs_force_shutdown(ide1(22,65),0x8)
> called from line 1120 of file xfs_trans.c. Return address = 0xc01ca409
> Oct 24 07:47:08 localhost kernel: Corruption of in-memory data
> detected. Shutting down filesystem: ide1(22,65)
> Oct 24 07:47:08 localhost kernel: Please umount the filesystem, and
> rectify the problem(s)
>
> When I tried to mount the disk again, I got this error:
>
> Oct 24 07:47:30 localhost kernel: XFS: bad magic number
> Oct 24 07:47:30 localhost kernel: XFS: SB validate failed
>
> xfs_repair had to search for quite a while to find a good alternate SB.
> I attached the log of xfs_repair so you can see I did a capital job of
> trashing the FS :-) Later, I'll try it again to see if I can reproduce
> the problem, then again with the newer 2.4.9 kernel.
>
> --
> "Jonathan F. Dill" (dill@xxxxxxxxxxxx)
> Attachment: xfs_repair.log
>
> [root@localhost ~]# mount /trans
> mount: wrong fs type, bad option, bad superblock on /dev/hdd1,
> or too many mounted file systems
> [root@localhost ~]# xfs_repair /dev/hdd1
> xfs_repair: warning - cannot set blocksize on block device /dev/hdd1: Input/output error
> Phase 1 - find and verify superblock...
> bad primary superblock - bad magic number !!!
>
> attempting to find secondary superblock...
> .......................................................................
> .......................................................................
> .......................................................................
> .......................................................................
> .......................................................................
> .......................................................................
> .......................................................................
> .......................................................................
> .......................................................................
> .......................................................................
> .......................................................................
> .......................................................................
> .......................................................................
> .......................................................................
> ...............found candidate secondary superblock...
> verified secondary superblock...
> writing modified primary superblock
> sb root inode value 18446744073709551615 inconsistent with calculated value 13835049396628095104
> resetting superblock root inode pointer to 18446744069414584448
> sb realtime bitmap inode 18446744073709551615 inconsistent with calculated value 13835049396628095105
> resetting superblock realtime bitmap ino pointer to 18446744069414584449
> sb realtime summary inode 18446744073709551615 inconsistent with calculated value 13835049396628095106
> resetting superblock realtime summary ino pointer to 18446744069414584450
>
> Phase 2 - using internal log
> - zero log...
> - scan filesystem freespace and inode maps...
> bad magic # 0x0 for agf 0
> bad version # 0 for agf 0
> bad length 0 for agf 0, should be 262144
> bad magic # 0x0 for agi 0
> bad version # 0 for agi 0
> bad length # 0 for agi 0, should be 262144
> reset bad agf for ag 0
> reset bad agi for ag 0
> bad agbno 0 for btbno root, agno 0
> bad agbno 0 for btbcnt root, agno 0
> bad agbno 0 for inobt root, agno 0
> root inode chunk not found
> Phase 3 - for each AG...
> - scan and clear agi unlinked lists...
> error following ag 0 unlinked list
> - process known inodes and perform inode discovery...
> - agno = 0
> imap claims in-use inode 131 is free, correcting imap
> imap claims in-use inode 132 is free, correcting imap
> imap claims in-use inode 133 is free, correcting imap
> imap claims in-use inode 134 is free, correcting imap
> imap claims in-use inode 135 is free, correcting imap
> imap claims in-use inode 136 is free, correcting imap
> imap claims in-use inode 137 is free, correcting imap
> imap claims in-use inode 141 is free, correcting imap
> - agno = 1
> - agno = 2
> - agno = 3
> - agno = 4
> - agno = 5
> - agno = 6
> - agno = 7
> - agno = 8
> - agno = 9
> - agno = 10
> - agno = 11
> - agno = 12
> - agno = 13
> - agno = 14
> - agno = 15
> - agno = 16
> - agno = 17
> - agno = 18
> - agno = 19
> - agno = 20
> - agno = 21
> - agno = 22
> - agno = 23
> - agno = 24
> - agno = 25
> - agno = 26
> - agno = 27
> - agno = 28
> - agno = 29
> - agno = 30
> - agno = 31
> - agno = 32
> - agno = 33
> - agno = 34
> - agno = 35
> - agno = 36
> - agno = 37
> - process newly discovered inodes...
> imap claims in-use inode 929 is free, correcting imap
> ...snip...
> imap claims in-use inode 991 is free, correcting imap
> imap claims in-use inode 4176897 is free, correcting imap
> ...snip...
> imap claims in-use inode 4176959 is free, correcting imap
> found inodes not in the inode allocation tree
> Phase 4 - check for duplicate blocks...
> - setting up duplicate extent list...
> - clear lost+found (if it exists) ...
> - check for inodes claiming duplicate blocks...
> - agno = 0
> entry "test_size.img" at block 0 offset 1024 in directory inode 136 references free inode 138
> clearing inode number in entry at offset 1024...
> - agno = 1
> - agno = 2
> - agno = 3
> - agno = 4
> - agno = 5
> - agno = 6
> - agno = 7
> - agno = 8
> - agno = 9
> - agno = 10
> - agno = 11
> - agno = 12
> - agno = 13
> - agno = 14
> - agno = 15
> - agno = 16
> - agno = 17
> - agno = 18
> - agno = 19
> - agno = 20
> - agno = 21
> - agno = 22
> - agno = 23
> - agno = 24
> - agno = 25
> - agno = 26
> - agno = 27
> - agno = 28
> - agno = 29
> - agno = 30
> - agno = 31
> - agno = 32
> - agno = 33
> - agno = 34
> - agno = 35
> - agno = 36
> - agno = 37
> Phase 5 - rebuild AG headers and trees...
> - reset superblock...
> Phase 6 - check inode connectivity...
> - resetting contents of realtime bitmap and summary inodes
> - ensuring existence of lost+found directory
> - traversing filesystem starting at / ...
> rebuilding directory inode 136
> - traversal finished ...
> - traversing all unattached subtrees ...
> - traversals finished ...
> - moving disconnected inodes to lost+found ...
> Phase 7 - verify and correct link counts...
> Note - stripe unit (0) and width (0) fields have been reset.
> Please set with mount -o sunit=<value>,swidth=<value>
> done
>