
Re: Corruption of in-memory data detected.

To: Jonathan Dill <dill@xxxxxxxxxxxx>, Marc Schmitt <schmitt@xxxxxxxxxxx>, Eric Sandeen <sandeen@xxxxxxx>, linux-xfs@xxxxxxxxxxx, florin@xxxxxxx
Subject: Re: Corruption of in-memory data detected.
From: Steve Lord <lord@xxxxxxx>
Date: Wed, 24 Oct 2001 10:10:27 -0500
In-reply-to: Message from Steve Lord <lord@xxxxxxx> of "Wed, 24 Oct 2001 09:58:58 CDT." <200110241458.f9OEwwo07245@xxxxxxxxxxxxxxxxxxxx>
Sender: owner-linux-xfs@xxxxxxxxxxx
> 
> In this case it looks very much like the start of the filesystem got whacked
> with a bunch of zeros somehow. The file size limit is somewhat odd;
> there is nothing in xfs which will prevent a file from being extremely
> large - 2^44 is about where issues would start for buffered I/O. Possibly
> the size issue is an interaction between the vfs in the ac kernel
> and xfs - we will run some tests on this.

And the answer is:

[root@Lite sda3]# dd if=/dev/zero of=test_size.img bs=10240k
dd: writing `test_size.img': No space left on device
747+0 records in
746+0 records out
[root@Lite sda3]# ls -lh
total 7.3G
drwxr-xr-x    7 root     root         4.0k Oct 23 16:57 dumpdir
drwxr-xr-x    3 root     root           20 Oct 23 16:57 restoredir
-rw-r--r--    1 root     root         7.3G Oct 24 10:01 test_size.img
[root@Lite sda3]# uname -a
Linux Lite 2.4.9-6SGI_XFS_PR2 #1 Tue Oct 23 15:34:18 CDT 2001 i686 unknown
[root@Lite sda3]# 
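
As a rough cross-check of the transcript above, the record count that dd reports can be converted to bytes. This is a sketch added for illustration, not part of the original test; the helper name bytes_written is made up here, and it assumes a shell with 64-bit arithmetic (bash, or dash on a 64-bit system).

```shell
# Hypothetical helper: bytes written by N full dd records of K KiB each.
bytes_written() {
    echo $(( $1 * $2 * 1024 ))
}

# 746 full records out, bs=10240k (10 MiB per record), as in the dd run above.
total=$(bytes_written 746 10240)
echo "$total bytes"
echo "$(( total / 1024 / 1024 )) MiB"

# The 2^44 figure mentioned for buffered I/O, in bytes and in TiB.
echo "$(( 1 << 44 )) bytes"
echo "$(( (1 << 44) >> 40 )) TiB"
```

The total comes to 7460 MiB, which ls -lh rounds to the 7.3G shown, and which is larger than the ~6 GB partition being backed up in the original report.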

So the size issue is something else - possibly a system problem on
your end. Do you have user limits set to something other than the
default?
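
One quick way to check is to ask the shell for its file size limit. The sketch below is an illustration only; the function name describe_fsize_limit is made up, and the block unit that ulimit -f reports depends on the shell (512-byte blocks under POSIX, 1024-byte blocks in bash by default).

```shell
# Hypothetical helper: report the file size limit of the current shell.
# `ulimit -f` prints either "unlimited" or a count of blocks.
describe_fsize_limit() {
    limit=$(ulimit -f)
    if [ "$limit" = "unlimited" ]; then
        echo "no file size limit set"
    else
        echo "file size limit: $limit blocks"
    fi
}

describe_fsize_limit
```

It is worth running this as the same user, and from the same kind of session, that ran xfsdump, since limits can differ between logins.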

Steve

> 
> Your second email about xfs_repair finding everything does not tie in
> with this output at all; you lost the root inode in this case.
> 
> So far I have been unable to make things fall over here at all - which
> is frustrating.
> 
> Steve
> 
> > This is a multi-part message in MIME format.
> > --------------8D5B87496141DA5BD63B7EAF
> > Content-Type: text/plain; charset=us-ascii
> > Content-Transfer-Encoding: 7bit
> > 
> > Hi guys,
> > 
> > I just managed to get this error and severe fs corruption without RAID,
> > mongo, huge filesystem, or anything weird.  (BTW I'm a bleeding edge
> > kind of guy, so the fs wasn't critical and I've got backups :-).  This
> > was with the kernel-2.4.9-6SGI_XFS_PR1, and I'm not using any of the
> > modules that had symbol problems.
> > 
> > Initially, I was trying to xfsdump and gzip a whole filesystem to an xfs
> > on another disk and I got "file size limit exceeded" and "core dumped." 
> > So I said, "OK so what's the max filesize?"  I thought it was pretty
> > high for XFS, but apparently not for Linux XFS--I was backing up a ~6 GB
> > partition, so the file size had to be less than that.  I didn't find any
> > clues in a cursory glance at xfs.h, so I decided to test it in a
> > not-so-nice way:
> > 
> > dd if=/dev/zero of=test_size.img bs=10240k
> > 
> > The process choked and this is what turned up in the system log:
> > 
> > Oct 24 07:47:08 localhost kernel: xfs_force_shutdown(ide1(22,65),0x8) called from line 1120 of file xfs_trans.c.  Return address = 0xc01ca409
> > Oct 24 07:47:08 localhost kernel: Corruption of in-memory data detected.  Shutting down filesystem: ide1(22,65)
> > Oct 24 07:47:08 localhost kernel: Please umount the filesystem, and rectify the problem(s)
> > 
> > When I tried to mount the disk again, I got this error:
> > 
> > Oct 24 07:47:30 localhost kernel: XFS: bad magic number
> > Oct 24 07:47:30 localhost kernel: XFS: SB validate failed
> > 
> > xfs_repair had to search for quite a while to find a good alternate SB.
> > I attached the log of xfs_repair so you can see I did a capital job of
> > trashing the FS :-)  Later, I'll try it again to see if I can reproduce
> > the problem, then again with the newer 2.4.9 kernel.
> > 
> > -- 
> > "Jonathan F. Dill" (dill@xxxxxxxxxxxx)
> > --------------8D5B87496141DA5BD63B7EAF
> > Content-Type: text/plain; charset=iso-8859-1;
> >  name="xfs_repair.log"
> > Content-Transfer-Encoding: quoted-printable
> > Content-Disposition: inline;
> >  filename="xfs_repair.log"
> > 
> > [root@localhost ~]# mount /trans
> > mount: wrong fs type, bad option, bad superblock on /dev/hdd1,
> >        or too many mounted file systems
> > [root@localhost ~]# xfs_repair /dev/hdd1
> > xfs_repair: warning - cannot set blocksize on block device /dev/hdd1: Input/output error
> > Phase 1 - find and verify superblock...
> > bad primary superblock - bad magic number !!!
> > 
> > attempting to find secondary superblock...
> > .......................................................................
> > .......................................................................
> > .......................................................................
> > .......................................................................
> > .......................................................................
> > .......................................................................
> > .......................................................................
> > .......................................................................
> > .......................................................................
> > .......................................................................
> > .......................................................................
> > .......................................................................
> > .......................................................................
> > .......................................................................
> > ...............found candidate secondary superblock...
> > verified secondary superblock...
> > writing modified primary superblock
> > sb root inode value 18446744073709551615 inconsistent with calculated value 13835049396628095104
> > resetting superblock root inode pointer to 18446744069414584448
> > sb realtime bitmap inode 18446744073709551615 inconsistent with calculated value 13835049396628095105
> > resetting superblock realtime bitmap ino pointer to 18446744069414584449
> > sb realtime summary inode 18446744073709551615 inconsistent with calculated value 13835049396628095106
> > resetting superblock realtime summary ino pointer to 18446744069414584450
> > Phase 2 - using internal log
> >         - zero log...
> >         - scan filesystem freespace and inode maps...
> > bad magic # 0x0 for agf 0
> > bad version # 0 for agf 0
> > bad length 0 for agf 0, should be 262144
> > bad magic # 0x0 for agi 0
> > bad version # 0 for agi 0
> > bad length # 0 for agi 0, should be 262144
> > reset bad agf for ag 0
> > reset bad agi for ag 0
> > bad agbno 0 for btbno root, agno 0
> > bad agbno 0 for btbcnt root, agno 0
> > bad agbno 0 for inobt root, agno 0
> > root inode chunk not found
> > Phase 3 - for each AG...
> >         - scan and clear agi unlinked lists...
> > error following ag 0 unlinked list
> >         - process known inodes and perform inode discovery...
> >         - agno = 0
> > imap claims in-use inode 131 is free, correcting imap
> > imap claims in-use inode 132 is free, correcting imap
> > imap claims in-use inode 133 is free, correcting imap
> > imap claims in-use inode 134 is free, correcting imap
> > imap claims in-use inode 135 is free, correcting imap
> > imap claims in-use inode 136 is free, correcting imap
> > imap claims in-use inode 137 is free, correcting imap
> > imap claims in-use inode 141 is free, correcting imap
> >         - agno = 1
> >         - agno = 2
> >         - agno = 3
> >         - agno = 4
> >         - agno = 5
> >         - agno = 6
> >         - agno = 7
> >         - agno = 8
> >         - agno = 9
> >         - agno = 10
> >         - agno = 11
> >         - agno = 12
> >         - agno = 13
> >         - agno = 14
> >         - agno = 15
> >         - agno = 16
> >         - agno = 17
> >         - agno = 18
> >         - agno = 19
> >         - agno = 20
> >         - agno = 21
> >         - agno = 22
> >         - agno = 23
> >         - agno = 24
> >         - agno = 25
> >         - agno = 26
> >         - agno = 27
> >         - agno = 28
> >         - agno = 29
> >         - agno = 30
> >         - agno = 31
> >         - agno = 32
> >         - agno = 33
> >         - agno = 34
> >         - agno = 35
> >         - agno = 36
> >         - agno = 37
> >         - process newly discovered inodes...
> > imap claims in-use inode 929 is free, correcting imap
> > ...snip...
> > imap claims in-use inode 991 is free, correcting imap
> > imap claims in-use inode 4176897 is free, correcting imap
> > ...snip...
> > imap claims in-use inode 4176959 is free, correcting imap
> > found inodes not in the inode allocation tree
> > Phase 4 - check for duplicate blocks...
> >         - setting up duplicate extent list...
> >         - clear lost+found (if it exists) ...
> >         - check for inodes claiming duplicate blocks...
> >         - agno = 0
> > entry "test_size.img" at block 0 offset 1024 in directory inode 136 references free inode 138
> >     clearing inode number in entry at offset 1024...
> >         - agno = 1
> >         - agno = 2
> >         - agno = 3
> >         - agno = 4
> >         - agno = 5
> >         - agno = 6
> >         - agno = 7
> >         - agno = 8
> >         - agno = 9
> >         - agno = 10
> >         - agno = 11
> >         - agno = 12
> >         - agno = 13
> >         - agno = 14
> >         - agno = 15
> >         - agno = 16
> >         - agno = 17
> >         - agno = 18
> >         - agno = 19
> >         - agno = 20
> >         - agno = 21
> >         - agno = 22
> >         - agno = 23
> >         - agno = 24
> >         - agno = 25
> >         - agno = 26
> >         - agno = 27
> >         - agno = 28
> >         - agno = 29
> >         - agno = 30
> >         - agno = 31
> >         - agno = 32
> >         - agno = 33
> >         - agno = 34
> >         - agno = 35
> >         - agno = 36
> >         - agno = 37
> > Phase 5 - rebuild AG headers and trees...
> >         - reset superblock...
> > Phase 6 - check inode connectivity...
> >         - resetting contents of realtime bitmap and summary inodes
> >         - ensuring existence of lost+found directory
> >         - traversing filesystem starting at / ...
> > rebuilding directory inode 136
> >         - traversal finished ...
> >         - traversing all unattached subtrees ...
> >         - traversals finished ...
> >         - moving disconnected inodes to lost+found ...
> > Phase 7 - verify and correct link counts...
> > Note - stripe unit (0) and width (0) fields have been reset.
> > Please set with mount -o sunit=<value>,swidth=<value>
> > done
> > 
> > --------------8D5B87496141DA5BD63B7EAF--
> 


