

To: Linux-XFS Mailing List <linux-xfs@xxxxxxxxxxx>
Subject: Hangs during filesystem recovery on mount (was: kernel panic "killing interrupt handler" and kernel BUG at sched.c:468)
From: Federico Sevilla III <jijo@xxxxxxxxxxx>
Date: Fri, 4 Oct 2002 08:43:22 +0800
In-reply-to: <20020930121323.GA7250@xxxxxxxxxxxxxxxxxxxx>
Mail-followup-to: Linux-XFS Mailing List <linux-xfs@xxxxxxxxxxx>
References: <20020930121323.GA7250@xxxxxxxxxxxxxxxxxxxx>
Sender: linux-xfs-bounce@xxxxxxxxxxx
User-agent: Mutt/1.4i

Hi everyone,

I'm not cc'ing the lkml anymore because something I found today makes me
think this is fairly XFS-specific.

On Mon, Sep 30, 2002 at 08:13:24PM +0800, Federico Sevilla III wrote:
> After copying the oops message, I attempted to sync the disks using
> the (Alt + SysRq + S) key combination and after the sync messages I
> hit a kernel BUG at sched.c:568. In my sched.c (different from the XFS
> tree only because of RML's preempt patch) line 568 is in the
> "asmlinkage void schedule(void)" function. The oops (passed through
> ksymoops) is attached as kernel-bug.out.

We had an extended power outage today that the UPS couldn't handle, so
our server died. On bootup, however, the system got stuck trying to
recover /dev/sda10 while mounting it. I did a reboot and went into
"init=/bin/sh" to manually mount filesystems. All except /dev/sda10 --
which we mount on /opt/data and is a 54GB partition using XFS -- mounted
properly.

While the mount attempt was stuck at recovery I did an Alt+SysRq+S and
found that all devices synced properly except 00:0a (or XX:0a, I cannot
remember the first part, but I'm sure it referred to /dev/sda10). This
reminded me of the above-quoted incident: after that kernel panic I
attempted a sync, and when it reached the same XX:0a device the kernel
spewed out yet another panic, for which I also sent an oops.
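
As a rough aid for reading those sync messages, here is a minimal sketch
of how a SCSI disk partition maps to a "major:minor" label. It assumes
the label is printed as two-digit hexadecimal and that sd disks use
block major 8 with 16 minors per disk; the function name sd_dev_label is
made up for illustration. Under those assumptions /dev/sda10 would show
up as 08:0a, which fits the ":0a" part I remember.

```python
# Assumption: the SysRq sync message prints the device as "major:minor"
# in two-digit hexadecimal, and SCSI disks use block major 8 with
# 16 minors per disk (sda = 0..15, sdb = 16..31, and so on).
SD_MAJOR = 8
MINORS_PER_DISK = 16
DISKS_PER_MAJOR = 16  # 256 minors / 16 minors per disk


def sd_dev_label(disk_index: int, partition: int) -> str:
    """Return the hex "major:minor" label for a partition of an sd disk.

    disk_index is 0 for sda, 1 for sdb, ...; partition 0 is the whole disk.
    """
    if not 0 <= disk_index < DISKS_PER_MAJOR:
        raise ValueError("disk_index must be in 0..15 for major 8")
    if not 0 <= partition < MINORS_PER_DISK:
        raise ValueError("partition must be in 0..15")
    minor = disk_index * MINORS_PER_DISK + partition
    return f"{SD_MAJOR:02x}:{minor:02x}"


print(sd_dev_label(0, 10))  # /dev/sda10 -> 08:0a
```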

After another reboot I did an xfs_check on the yet-unrecovered
/dev/sda10 and I did not get any messages. An attempt to mount it
succeeded. I unmounted /dev/sda10 and did another xfs_check and again I
did not get any messages, which from the manual page I've interpreted as
"clean". Because /dev/sda10 was finally unmounted properly the next
reboot worked and our server is back online.

This is running linux-2.4-xfs CVS checked out on 20020930. The only
patch is Randy Dunlap's sys-magic 20020314 which adds a /proc interface
to the MagicKey. The kernel was built using gcc 2.95.4 and runs on a
Debian GNU/Linux system.

Would anyone have an idea about what's causing this? I do not know if
any of the TAKEs after my checkout affect my problem, and there is
pressure for me now to shift things from XFS to ext3 which I hope I
won't have to do. :(

Thank you for your time. :)

 --> Jijo

-- 
Federico Sevilla III   :  http://jijo.free.net.ph
Network Administrator  :  The Leather Collection, Inc.
GnuPG Key ID           :  0x93B746BE

