http://oss.sgi.com/bugzilla/show_bug.cgi?id=742
------- Additional Comments From dgc@xxxxxxx 2007-04-15 17:55 CST -------
Unstarted md gives:
# mount -t xfs /dev/md1 /data/test
mount: /dev/md1: can't read superblock
#
Stopped md gives:
Apr 5 09:50:23 TPC-DAL-SUSE2 kernel: XFS: osyncisdsync is now the default, option is deprecated.
Apr 5 09:50:23 TPC-DAL-SUSE2 kernel: XFS: SB read failed
Apr 5 09:50:23 TPC-DAL-SUSE2 kernel: Unable to handle kernel NULL pointer dereference at 0000000000000008 RIP:
Apr 5 09:50:23 TPC-DAL-SUSE2 kernel:<ffffffff8840b333>{:raid0:raid0_unplug+17}
These are different errors. The unstarted md error comes from the *mount*
process, not the kernel trying to mount the filesystem. i.e. mount aborts
before calling the mount syscall because it can't read the md device.
The stopped md passes this test in the mount process, and makes the mount
syscall. md is obviously leaving /dev/mdX lying around after it was stopped
in a state where certain things can be done on it but others will fail
badly.
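To illustrate the difference, here is a rough sketch of the kind of
userspace check that fails in the unstarted case: open the device and
read the first sector before the mount syscall is ever made. This is
not the actual mount(8) code; probe_xfs_superblock() is a hypothetical
helper, and the only fact it relies on is that the XFS superblock magic
"XFSB" sits at byte offset 0 of the device.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* The XFS superblock starts at byte 0 of the device with magic "XFSB". */
#define XFS_SB_MAGIC "XFSB"

/*
 * Hypothetical helper (not taken from mount(8)).
 * Returns  0 if the first sector is readable and carries the XFS magic,
 *          1 if it is readable but does not look like XFS,
 *         -1 if the device cannot be opened or the read comes up short.
 */
int probe_xfs_superblock(const char *dev)
{
    unsigned char buf[512];
    ssize_t n;
    int fd;

    fd = open(dev, O_RDONLY);
    if (fd < 0)
        return -1;

    n = pread(fd, buf, sizeof(buf), 0);
    close(fd);

    /* A failed or short read is the "can't read superblock" case. */
    if (n != (ssize_t)sizeof(buf))
        return -1;

    return memcmp(buf, XFS_SB_MAGIC, 4) == 0 ? 0 : 1;
}
```

If a check like this returns -1, the caller gives up without entering
the kernel mount path at all. A stopped md device that can still be
opened and read from userspace would get past it, and the failure is
then left for the kernel to hit.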
The first failure XFS sees is when it tries to read the superblock via
xfs_readsb(), and that's where the error in the log comes from. XFS then
enters the mount failure error handling path where it invalidates the
block devices and then returns the error.
The system then oopses when unplugging the underlying block device
whilst invalidating the (just allocated) data device before returning
the read error.
(xfs_flush_buftarg() calls blk_run_address_space() on the block device
mapping). The other filesystems don't do this unplug, which is why they
are not oopsing the machine.
So, yes, I'd agree that this is an MD bug as it is leaving enough stubs
around for the block device to be opened successfully but does not provide
enough stubs to error out all types of operations, hence some lead
to panics.
I'll attach a hack to XFS to only do the unplug if we flushed something
to disk. That should work around the problem you are seeing, but it doesn't
prevent the problem if we really had to flush a buffer out. IOWs, the
MD driver really needs to be fixed....
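For reference, the idea behind the hack can be modelled in plain
userspace C roughly as below. This is only a sketch of the logic, not
the patch itself: the model_* types and functions are hypothetical
stand-ins for the XFS buftarg and the block layer unplug, and the real
code in xfs_flush_buftarg() looks different.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a delayed-write buffer. */
struct model_buf {
    bool dirty;
};

/* Hypothetical stand-in for the buffer target of one block device. */
struct model_buftarg {
    struct model_buf *bufs;
    size_t nbufs;
    int unplug_calls;   /* counts how often the device was unplugged */
};

/* Stand-in for the unplug call; on a stopped md this is what oopses. */
static void model_unplug(struct model_buftarg *t)
{
    t->unplug_calls++;
}

/* Flush all dirty buffers; unplug the device only if any were flushed. */
int model_flush_buftarg(struct model_buftarg *t)
{
    int flushed = 0;
    size_t i;

    for (i = 0; i < t->nbufs; i++) {
        if (t->bufs[i].dirty) {
            t->bufs[i].dirty = false;   /* pretend the write was issued */
            flushed++;
        }
    }

    /* Nothing was queued, so there is nothing to unplug. */
    if (flushed)
        model_unplug(t);

    return flushed;
}
```

In the failed-mount case nothing has been written yet, so flushed is 0
and the unplug is skipped. As soon as one buffer really is dirty, the
unplug still happens, which is why this only hides the problem.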
Cheers,
Dave.