
Re: Log corruption?

To: Eric Sandeen <sandeen@xxxxxxx>
Subject: Re: Log corruption?
From: James Pearson <james-p@xxxxxxxxxxxxxxxxxx>
Date: Fri, 11 Oct 2002 18:44:14 +0100
Cc: Stephen Lord <lord@xxxxxxx>, linux-xfs@xxxxxxxxxxx
Organization: Moving Picture Company
References: <3DA181D2.B78A9C41@moving-picture.com> <1033996292.1053.32.camel@laptop.americas.sgi.com> <3DA19858.75C9E674@moving-picture.com> <3DA707BC.F6F56592@moving-picture.com> <1034357448.13979.9.camel@stout.americas.sgi.com>
Sender: linux-xfs-bounce@xxxxxxxxxxx

If a patch against XFS 1.1 is easy to do, then that'll be fine for the
moment...

Thanks

James Pearson

Eric Sandeen wrote:
> 
> James - I think Steve previously pointed out that there was a recent fix
> that may address this...  We'll get a new 1.2 prerelease spin out there
> soon which will contain it.  It would probably also be fairly easy to
> get you a patch for 1.1 if you'd prefer.
> 
> -Eric
> 
> On Fri, 2002-10-11 at 12:17, James Pearson wrote:
> > It's just happened on one of my workstations - at bootup I get
> > (2.4.18-xfs [XFS 1.1] kernel):
> >
> > XFS mounting filesystem sd(8,2)
> > XFS: WARNING: recovery required on readonly filesystem.
> > XFS: write access will be enabled during mount.
> > Starting XFS recovery on filesystem: sd(8,2) (dev: 8/2)
> > xfs_inotobp: xfs_imap()  returned error 22 on sd(8,2).  Returning error.
> > xfs_iunlink_remove: xfs_inotobp()  returned error 22 on sd(8,2).  Returning error
> > xfs_inactive:: xfs_ifree() returned error = 22 on sd(8,2)
> > xfs_force_shutdown(sd(8,2),0x1) called from line 1962 of file xfs_vnodeops.c   Return address = 0xc01cd7a2
> > I/O Error Detected.  Shutting down filesystem: sd(8,2)
> > Please umount the filesystem, and rectify the problem(s)
> > Ending XFS recovery on filesystem: sd(8,2) (dev: 8/2)
> > pivotroot: pivot_root(/sysroot,/sysroot/initrd) failed: 2
> > Freeing unused kernel memory: 252k freed
> > Kernel panic: No init found.  Try passing init= option to kernel
> >
> >
> > If I boot off floppy/CD in rescue mode and try to mount the root
> > partition by hand I get (2.4.7-10SGI_XFS_PR1BOOT kernel):
> >
> > XFS mounting filesystem sd(8,17)
> > Starting XFS recovery on filesystem: sd(8,17) (dev: 8/17)
> > Ending XFS recovery on filesystem: sd(8,17) (dev: 8/17)
> > XFS mounting filesystem sd(8,2)
> > Starting XFS recovery on filesystem: sd(8,2) (dev: 8/2)
> > Unable to handle kernel NULL pointer dereference at virtual address 00000152
> >  printing eip:
> > fc93faf2
> > *pde = 00000000
> > Oops: 0000
> > CPU:    0
> > EIP:    0010:[<fc93faf2>]
> > EFLAGS: 00010246
> > eax: 00000000   ebx: ffffffe8   ecx: c0226d84   edx: fc96e2c0
> > esi: f6aa17e4   edi: f6a6ec00   ebp: 00000000   esp: f7fd58b4
> > ds: 0018   es: 0018   ss: 0018
> > Process mount (pid: 102, stackpage=f7fd5000)
> > Stack: 41d20700 00000000 f6a6ec16 41d20700 fc94cbd0 f6a6ec00 00000000 41d20700
> >        00000000 00000000 f7fd5924 00000000 00000000 c21c2b60 00000000 00000000
> >        00000000 f6a6ed64 f6a6ed64 41d20700 00000000 c21c2b60 f7fd5924 0187d281
> > Call Trace: [<fc94cbd0>] [<fc94d627>] [<fc94734c>] [<fc94f061>] [<c0112f97>]
> >    [<fc92b270>] [<fc94dc43>] [<fc9572e6>] [<c0131522>] [<fc95745c>] [<fc96ebc0>]
> >    [<fc96ebc0>] [<fc95748b>] [<fc96ebc0>] [<fc969098>] [<fc96ebc0>] [<fc96e808>]
> >    [<c012bcfd>] [<c0122467>] [<c012bcb0>] [<c01256ee>] [<c01353c9>] [<c01355bb>]
> >    [<fc96e808>] [<c0135d70>] [<fc96e808>] [<fc96e808>] [<c0136074>] [<c0135f3c>]
> >    [<c0136108>] [<c0106ddb>]
> >
> > Code: 66 83 bb 6a 01 00 00 00 75 10 80 a3 50 01 00 00 f7 53 e8 6b
> >
> > Running xfs_repair -L 'fixes' the problem.
> >
> > James Pearson
> >
> > James Pearson wrote:
> > >
> > > The sequence of events is:
> > >
> > > Machine locks up - probably related to some Xwindows/application problem
> > > (we use the Nvidia drivers)
> > >
> > > Machine is reset
> > >
> > > Kernel boots
> > >
> > > Fails to mount the root (XFS) file system - either with an oops or some
> > > error telling us the file system is corrupt etc.
> > >
> > > Attempts to reset again produce the same results as above.
> > >
> > > Booting in rescue mode, running 'xfs_repair -L' and rebooting "fixes"
> > > the problem. xfs_repair finds some lost files and puts them in lost+found
> > > - these are usually files from /tmp or /var/tmp.
> > >
> > > This doesn't happen every time a machine locks up, but it occurs maybe
> > > once a week or so on one or another of our 60 or so workstations.
> > >
> > > James Pearson
> > >
> > > Stephen Lord wrote:
> > > >
> > > > On Mon, 2002-10-07 at 07:45, James Pearson wrote:
> > > > > We have a number of workstations running RedHat 7.2 with a 2.4.18 XFS
> > > > > 1.1 kernel - every now and then a (different) machine will crash/hang
> > > > > and fail to boot with a kernel oops and/or with XFS errors when it tries
> > > > > to mount the root file system.
> > > > >
> > > > > The fix is to boot from floppy/CD in rescue mode and run 'xfs_repair -L'
> > > > > on the root partition. The root file system is then mountable and the
> > > > > machine reboots OK.
> > > > >
> > > > > I don't have exact error messages (don't have time to write down the
> > > > > exact errors, as the priority is to get the machine up and running ...)
> > > > >
> > > > > Is this a known problem? If it isn't, I'll attempt to get more
> > > > > information when it happens again.
> > > > >
> > > > > James Pearson
> > > > >
> > > >
> > > > Actually, a change just went into the cvs tree this weekend which might
> > > > be related to this. There is some zeroing of part of the log which is
> > > > always supposed to happen during mount. For a readonly mount this was
> > > > not happening - and the root is mounted this way. Should the machine
> > > > be shut down and rebooted very shortly after this, there is a possibility
> > > > of the second mount getting confused by the log contents.
> > > >
> > > > Is there any way this could be what is happening? Is this happening
> > > > on the second of two boots which are close together?
> > > >
> > > > Currently there is no way to get this code except from a cvs kernel,
> > > > we just put out some images of the first alpha of xfs 1.2, the next
> > > > spin of these should include this fix (hint hint Eric).
> > > >
> > > > Steve
> >
> --
> Eric Sandeen      XFS for Linux     http://oss.sgi.com/projects/xfs
> sandeen@xxxxxxx   SGI, Inc.         651-683-3102

