[xfs-masters] [Bug 742] New: Kernel Oops caused by attempting to mount

To: xfs-master@xxxxxxxxxxx
Subject: [xfs-masters] [Bug 742] New: Kernel Oops caused by attempting to mount XFS filesystem on stopped md RAID0 device.
From: bugzilla-daemon@xxxxxxxxxxx
Date: Thu, 5 Apr 2007 09:21:24 -0700
Reply-to: xfs-masters@xxxxxxxxxxx
Sender: xfs-masters-bounce@xxxxxxxxxxx
http://oss.sgi.com/bugzilla/show_bug.cgi?id=742

           Summary: Kernel Oops caused by attempting to mount XFS filesystem
                    on stopped md RAID0 device.
           Product: Linux XFS
           Version: unspecified
          Platform: PC
        OS/Version: Linux
            Status: NEW
          Severity: major
          Priority: P2
         Component: XFS kernel code
        AssignedTo: xfs-master@xxxxxxxxxxx
        ReportedBy: jf@xxxxxxxxxxxxxxx


Overview description: Attempting to mount an XFS filesystem on a stopped md
RAID0 device causes a kernel Oops. The problem does not occur under the same
conditions when mounting an ext3 or ReiserFS filesystem on a stopped md device.


Running environment:

Machine is running SLES10 (kernel 2.6.16.27-0.6-smp) on x86_64.

Machine has an 8-member RAID0 md device as follows:

# mdadm --detail /dev/md0
/dev/md0:
        Version : 00.90.03
  Creation Time : Wed Apr  4 17:42:57 2007
     Raid Level : raid0
     Array Size : 819101696 (781.16 GiB 838.76 GB)
   Raid Devices : 8
  Total Devices : 8
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Wed Apr  4 17:42:57 2007
          State : clean
 Active Devices : 8
Working Devices : 8
 Failed Devices : 0
  Spare Devices : 0

     Chunk Size : 4096K

           UUID : fb5c1ece:43617e75:ec2affd8:30c0a930
         Events : 0.1

    Number   Major   Minor   RaidDevice State
       0       8       65        0      active sync   /dev/sde1
       1       8       81        1      active sync   /dev/sdf1
       2       8       97        2      active sync   /dev/sdg1
       3       8      113        3      active sync   /dev/sdh1
       4       8      129        4      active sync   /dev/sdi1
       5       8      145        5      active sync   /dev/sdj1
       6       8      161        6      active sync   /dev/sdk1
       7       8      177        7      active sync   /dev/sdl1



Steps to reproduce the problem:

Make an XFS filesystem on /dev/md0 and mount it on /data/test

# mkfs.xfs -f /dev/md0

meta-data=/dev/md0               isize=256    agcount=32, agsize=6400000 blks
         =                       sectsz=512   attr=0
data     =                       bsize=4096   blocks=204775424, imaxpct=25
         =                       sunit=1024   swidth=8192 blks, unwritten=1
naming   =version 2              bsize=4096
log      =internal log           bsize=4096   blocks=32768, version=1
         =                       sectsz=512   sunit=0 blks
realtime =none                   extsz=33554432 blocks=0, rtextents=0

# mount -t xfs /dev/md0 /data/test -o attr2,osyncisdsync,noatime,nobarrier

Unmount it, and stop the md device:

# umount /data/test
# mdadm -S /dev/md0

# tail -n 20 /var/log/messages

Apr  5 09:47:46 TPC-DAL-SUSE2 kernel: XFS: osyncisdsync is now the default,
option is deprecated.
Apr  5 09:47:46 TPC-DAL-SUSE2 kernel: XFS mounting filesystem md0
Apr  5 09:47:46 TPC-DAL-SUSE2 kernel: Ending clean XFS mount for filesystem: md0
Apr  5 09:48:23 TPC-DAL-SUSE2 kernel: md: md0 stopped.
Apr  5 09:48:23 TPC-DAL-SUSE2 kernel: md: unbind<sde1>
Apr  5 09:48:23 TPC-DAL-SUSE2 kernel: md: export_rdev(sde1)
Apr  5 09:48:23 TPC-DAL-SUSE2 kernel: md: unbind<sdl1>
Apr  5 09:48:23 TPC-DAL-SUSE2 kernel: md: export_rdev(sdl1)
Apr  5 09:48:23 TPC-DAL-SUSE2 kernel: md: unbind<sdk1>
Apr  5 09:48:23 TPC-DAL-SUSE2 kernel: md: export_rdev(sdk1)
Apr  5 09:48:23 TPC-DAL-SUSE2 kernel: md: unbind<sdj1>
Apr  5 09:48:23 TPC-DAL-SUSE2 kernel: md: export_rdev(sdj1)
Apr  5 09:48:23 TPC-DAL-SUSE2 kernel: md: unbind<sdi1>
Apr  5 09:48:23 TPC-DAL-SUSE2 kernel: md: export_rdev(sdi1)
Apr  5 09:48:23 TPC-DAL-SUSE2 kernel: md: unbind<sdh1>
Apr  5 09:48:23 TPC-DAL-SUSE2 kernel: md: export_rdev(sdh1)
Apr  5 09:48:23 TPC-DAL-SUSE2 kernel: md: unbind<sdg1>
Apr  5 09:48:23 TPC-DAL-SUSE2 kernel: md: export_rdev(sdg1)
Apr  5 09:48:23 TPC-DAL-SUSE2 kernel: md: unbind<sdf1>
Apr  5 09:48:23 TPC-DAL-SUSE2 kernel: md: export_rdev(sdf1)

Attempting the same mount on the now-stopped md0 device causes a kernel Oops:

# mount -t xfs /dev/md0 /data/test -o attr2,osyncisdsync,noatime,nobarrier
Killed

# tail -n 40 /var/log/messages

Apr  5 09:50:23 TPC-DAL-SUSE2 kernel: XFS: osyncisdsync is now the default,
option is deprecated.
Apr  5 09:50:23 TPC-DAL-SUSE2 kernel: XFS: SB read failed
Apr  5 09:50:23 TPC-DAL-SUSE2 kernel: Unable to handle kernel NULL pointer
dereference at 0000000000000008 RIP:
Apr  5 09:50:23 TPC-DAL-SUSE2 kernel: <ffffffff8840b333>{:raid0:raid0_unplug+17}
Apr  5 09:50:23 TPC-DAL-SUSE2 kernel: PGD 127d11067 PUD 11810b067 PMD 0
Apr  5 09:50:23 TPC-DAL-SUSE2 kernel: Oops: 0000 [1] SMP
Apr  5 09:50:23 TPC-DAL-SUSE2 kernel: last sysfs file: /kernel/uevent_seqnum
Apr  5 09:50:23 TPC-DAL-SUSE2 kernel: CPU 3
Apr  5 09:50:23 TPC-DAL-SUSE2 kernel: Modules linked in: ext3 jbd raid0 joydev
st xfs_quota xfs exportfs ipv6 button battery ac sr_mod loop dm_mod usb_storage
usbhid hw_random shpchp pci_hotplug e1000 uhci_hcd ehci_hcd ide_cd usbcore cdrom
bnx2 reiserfs qla2400 qla2xxx firmware_class qla2xxx_conf intermodule edd fan
thermal processor sg megaraid_sas piix sd_mod scsi_mod ide_disk ide_core
Apr  5 09:50:23 TPC-DAL-SUSE2 kernel: Pid: 14419, comm: mount Tainted: G     U
2.6.16.27-0.6-smp #1
Apr  5 09:50:23 TPC-DAL-SUSE2 kernel: RIP: 0010:[<ffffffff8840b333>]
<ffffffff8840b333>{:raid0:raid0_unplug+17}
Apr  5 09:50:23 TPC-DAL-SUSE2 kernel: RSP: 0018:ffff81010a66ba68  EFLAGS: 
00010246
Apr  5 09:50:23 TPC-DAL-SUSE2 kernel: RAX: 0000000000000000 RBX:
ffff81010a66ba40 RCX: ffff81010a66ba88
Apr  5 09:50:23 TPC-DAL-SUSE2 kernel: RDX: 0000000000000040 RSI:
0000000000000000 RDI: ffff81012ad3c6c0
Apr  5 09:50:23 TPC-DAL-SUSE2 kernel: RBP: 0000000000000000 R08:
ffffffff8045b760 R09: ffff810128804c80
Apr  5 09:50:23 TPC-DAL-SUSE2 kernel: R10: ffff8101272612c0 R11:
ffffffff8840b322 R12: ffff81012a8d9800
Apr  5 09:50:23 TPC-DAL-SUSE2 kernel: R13: ffff810125602e40 R14:
0000000000000000 R15: 0000000000000001
Apr  5 09:50:23 TPC-DAL-SUSE2 kernel: FS:  00002b77342f16d0(0000)
GS:ffff81012b7d1a40(0000) knlGS:0000000000000000
Apr  5 09:50:23 TPC-DAL-SUSE2 kernel: CS:  0010 DS: 0000 ES: 0000 CR0:
000000008005003b
Apr  5 09:50:23 TPC-DAL-SUSE2 kernel: CR2: 0000000000000008 CR3:
0000000124254000 CR4: 00000000000006e0
Apr  5 09:50:23 TPC-DAL-SUSE2 kernel: Process mount (pid: 14419, threadinfo
ffff81010a66a000, task ffff81012af66790)
Apr  5 09:50:23 TPC-DAL-SUSE2 kernel: Stack: ffff81010a66ba40 ffff81010a66ba88
ffff81010a66ba40 ffffffff883b9f7c
Apr  5 09:50:23 TPC-DAL-SUSE2 kernel:        ffff81010a66ba88 ffff81010a66ba88
0000000000000200 0000000000000005
Apr  5 09:50:23 TPC-DAL-SUSE2 kernel:        ffff810127f48800 ffff810125029000
Apr  5 09:50:23 TPC-DAL-SUSE2 kernel: Call Trace:
<ffffffff883b9f7c>{:xfs:xfs_flush_buftarg+389}
Apr  5 09:50:23 TPC-DAL-SUSE2 kernel:       
<ffffffff883aa457>{:xfs:xfs_mount+1931}
<ffffffff883b830e>{:xfs:linvfs_fill_super+145}
Apr  5 09:50:23 TPC-DAL-SUSE2 kernel:       
<ffffffff801932e3>{get_filesystem+18} <ffffffff801808d1>{sget+1039}
Apr  5 09:50:23 TPC-DAL-SUSE2 kernel:       
<ffffffff80180118>{set_bdev_super+0} <ffffffff80180127>{test_bdev_super+0}
Apr  5 09:50:23 TPC-DAL-SUSE2 kernel:        <ffffffff80180f8d>{get_sb_bdev+258}
<ffffffff883b827d>{:xfs:linvfs_fill_super+0}
Apr  5 09:50:23 TPC-DAL-SUSE2 kernel:       
<ffffffff80180bd7>{do_kern_mount+165} <ffffffff801950a2>{do_mount+1727}
Apr  5 09:50:23 TPC-DAL-SUSE2 kernel:       
<ffffffff80164c3f>{__handle_mm_fault+781} <ffffffff802d2392>{do_page_fault+1017}
Apr  5 09:50:23 TPC-DAL-SUSE2 kernel:        <ffffffff801e8050>{__up_read+19}
<ffffffff802d2392>{do_page_fault+1017}
Apr  5 09:50:23 TPC-DAL-SUSE2 kernel:       
<ffffffff801899bf>{do_path_lookup+629} <ffffffff8015cc26>{__alloc_pages+101}
Apr  5 09:50:23 TPC-DAL-SUSE2 kernel:        <ffffffff8019519c>{sys_mount+138}
<ffffffff8010a7be>{system_call+126}
Apr  5 09:50:23 TPC-DAL-SUSE2 kernel:
Apr  5 09:50:23 TPC-DAL-SUSE2 kernel: Code: 48 8b 40 08 48 8b 58 20 eb 26 48 8b
03 48 8b 40 28 48 8b 80
Apr  5 09:50:23 TPC-DAL-SUSE2 kernel: RIP
<ffffffff8840b333>{:raid0:raid0_unplug+17} RSP <ffff81010a66ba68>
Apr  5 09:50:23 TPC-DAL-SUSE2 kernel: CR2: 0000000000000008
TPC-DAL-SUSE2:~ #
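
The trace above is easier to follow with the symbols separated from the syslog
prefix and the line wrapping. The following is a minimal sketch, assuming the
`<address>{symbol+offset}` format printed by this kernel; the function name
`oops_symbols` is hypothetical and not part of any existing tool.

```shell
# Print every {symbol+offset} entry found on stdin, one per line, in order.
# Assumes 16-digit hex addresses in angle brackets, as in the trace above.
oops_symbols() {
    grep -o '<[0-9a-f]\{16\}>{[^}]*}' | sed -e 's/^<[0-9a-f]*>{//' -e 's/}$//'
}
```

It could be used as `oops_symbols < /var/log/messages`. Note that it prints the
symbols from every trace in the file, not only the most recent one.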

The same steps will invariably reproduce the problem.
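
For convenience, the steps could be collected into a small script. This is only
a sketch: the names `run`, `reproduce_xfs_md_oops` and `DRY_RUN` are
illustrative, and the device and mount point default to the ones used in this
report. By default it only prints the commands. With `DRY_RUN=0` it runs them
(as root), destroys the contents of the device, and is expected to oops an
affected kernel.

```shell
#!/bin/bash
# Sketch of the reproduction steps from this report.
# run, reproduce_xfs_md_oops and DRY_RUN are hypothetical names.

# Print the command; execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

reproduce_xfs_md_oops() {
    local dev="${1:-/dev/md0}"
    local mnt="${2:-/data/test}"
    local opts="attr2,osyncisdsync,noatime,nobarrier"

    run mkfs.xfs -f "$dev" || return 1
    run mount -t xfs "$dev" "$mnt" -o "$opts" || return 1
    run umount "$mnt" || return 1
    run mdadm -S "$dev" || return 1
    # On the affected kernel this second mount triggers the oops.
    run mount -t xfs "$dev" "$mnt" -o "$opts"
}
```

Calling `reproduce_xfs_md_oops` with no arguments prints the five commands in
order; `DRY_RUN=0 reproduce_xfs_md_oops /dev/md0 /data/test` would execute them.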

Additional information:

The same test with an ext3 filesystem simply causes the mount to fail, with no
damage.

# mkfs.ext3 -O dir_index,filetype /dev/md0
mke2fs 1.38 (30-Jun-2005)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
102400000 inodes, 204775424 blocks
10238771 blocks (5.00%) reserved for the super user
First data block=0
6250 block groups
32768 blocks per group, 32768 fragments per group
16384 inodes per group
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
        4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968,
        102400000

Writing inode tables: done
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 23 mounts or
180 days, whichever comes first.  Use tune2fs -c or -i to override.

# mount -t ext3 /dev/md0 /data/test -o noacl,user_xattr,noatime

mount: wrong fs type, bad option, bad superblock on /dev/md0,
       missing codepage or other error
       (could this be the IDE device where you in fact use
       ide-scsi so that sr0 or sda or so is needed?)
       In some cases useful info is found in syslog - try
       dmesg | tail  or so

# tail -n 20 /var/log/messages

Apr  4 18:29:01 TPC-DAL-SUSE2 kernel: md: md0 stopped.
Apr  4 18:29:01 TPC-DAL-SUSE2 kernel: md: unbind<sdl1>
Apr  4 18:29:01 TPC-DAL-SUSE2 kernel: md: export_rdev(sdl1)
Apr  4 18:29:01 TPC-DAL-SUSE2 kernel: md: unbind<sdk1>
Apr  4 18:29:01 TPC-DAL-SUSE2 kernel: md: export_rdev(sdk1)
Apr  4 18:29:01 TPC-DAL-SUSE2 kernel: md: unbind<sdj1>
Apr  4 18:29:01 TPC-DAL-SUSE2 kernel: md: export_rdev(sdj1)
Apr  4 18:29:01 TPC-DAL-SUSE2 kernel: md: unbind<sdi1>
Apr  4 18:29:01 TPC-DAL-SUSE2 kernel: md: export_rdev(sdi1)
Apr  4 18:29:01 TPC-DAL-SUSE2 kernel: md: unbind<sdh1>
Apr  4 18:29:01 TPC-DAL-SUSE2 kernel: md: export_rdev(sdh1)
Apr  4 18:29:01 TPC-DAL-SUSE2 kernel: md: unbind<sdg1>
Apr  4 18:29:01 TPC-DAL-SUSE2 kernel: md: export_rdev(sdg1)
Apr  4 18:29:01 TPC-DAL-SUSE2 kernel: md: unbind<sdf1>
Apr  4 18:29:01 TPC-DAL-SUSE2 kernel: md: export_rdev(sdf1)
Apr  4 18:29:01 TPC-DAL-SUSE2 kernel: md: unbind<sde1>
Apr  4 18:29:01 TPC-DAL-SUSE2 kernel: md: export_rdev(sde1)
Apr  4 18:29:59 TPC-DAL-SUSE2 kernel: EXT3-fs: unable to read superblock

The same test with a ReiserFS filesystem also causes the mount to simply fail,
with no further damage.

# mdadm -A /dev/md0
mdadm: /dev/md0 has been started with 8 drives.
 
# mkfs.reiserfs /dev/md0
mkfs.reiserfs 3.6.19 (2003 www.namesys.com)

A pair of credits:
Alexander Zarochentcev  (zam)  wrote the high low priority locking code, online
resizer for V3 and V4, online repacker for V4, block allocation code, and major
parts of  the flush code,  and maintains the transaction manager code.  We give
him the stuff  that we know will be hard to debug,  or needs to be very cleanly
structured.

BigStorage  (www.bigstorage.com)  contributes to our general fund  every month,
and has done so for quite a long time.


Guessing about desired format.. Kernel 2.6.16.27-0.6-smp is running.
Format 3.6 with standard journal
Count of blocks on the device: 204775424
Number of blocks consumed by mkreiserfs formatting process: 14461
Blocksize: 4096
Hash function used to sort names: "r5"
Journal Size 8193 blocks (first block 18)
Journal Max transaction length 1024
inode generation number: 0
UUID: d1a9b9f2-8992-47cc-bfb3-ff8dee0cf725
ATTENTION: YOU SHOULD REBOOT AFTER FDISK!
        ALL DATA WILL BE LOST ON '/dev/md0'!
Continue (y/n):y
Initializing journal - 0%....20%....40%....60%....80%....100%
Syncing..ok
ReiserFS is successfully created on /dev/md0.

# mount -t reiserfs /dev/md0 /data/test -o noacl,user_xattr,noatime

# umount /data/test
# mdadm -S /dev/md0
# mount -t reiserfs /dev/md0 /data/test -o noacl,user_xattr,noatime
mount: wrong fs type, bad option, bad superblock on /dev/md0,
       missing codepage or other error
       (could this be the IDE device where you in fact use
       ide-scsi so that sr0 or sda or so is needed?)
       In some cases useful info is found in syslog - try
       dmesg | tail  or so

# tail -n 20 /var/log/messages

Apr  4 19:12:23 TPC-DAL-SUSE2 kernel: md: md0 stopped.
Apr  4 19:12:23 TPC-DAL-SUSE2 kernel: md: unbind<sde1>
Apr  4 19:12:23 TPC-DAL-SUSE2 kernel: md: export_rdev(sde1)
Apr  4 19:12:23 TPC-DAL-SUSE2 kernel: md: unbind<sdl1>
Apr  4 19:12:23 TPC-DAL-SUSE2 kernel: md: export_rdev(sdl1)
Apr  4 19:12:23 TPC-DAL-SUSE2 kernel: md: unbind<sdk1>
Apr  4 19:12:23 TPC-DAL-SUSE2 kernel: md: export_rdev(sdk1)
Apr  4 19:12:23 TPC-DAL-SUSE2 kernel: md: unbind<sdj1>
Apr  4 19:12:23 TPC-DAL-SUSE2 kernel: md: export_rdev(sdj1)
Apr  4 19:12:23 TPC-DAL-SUSE2 kernel: md: unbind<sdi1>
Apr  4 19:12:23 TPC-DAL-SUSE2 kernel: md: export_rdev(sdi1)
Apr  4 19:12:23 TPC-DAL-SUSE2 kernel: md: unbind<sdh1>
Apr  4 19:12:23 TPC-DAL-SUSE2 kernel: md: export_rdev(sdh1)
Apr  4 19:12:23 TPC-DAL-SUSE2 kernel: md: unbind<sdg1>
Apr  4 19:12:23 TPC-DAL-SUSE2 kernel: md: export_rdev(sdg1)
Apr  4 19:12:23 TPC-DAL-SUSE2 kernel: md: unbind<sdf1>
Apr  4 19:12:23 TPC-DAL-SUSE2 kernel: md: export_rdev(sdf1)
Apr  4 19:12:41 TPC-DAL-SUSE2 kernel: ReiserFS: md0: warning: sh-2006:
read_super_block:  bread failed (dev md0, block 2, size 4096)
Apr  4 19:12:41 TPC-DAL-SUSE2 kernel: ReiserFS: md0: warning: sh-2006:
read_super_block:  bread failed (dev md0, block 16, size 4096)
Apr  4 19:12:41 TPC-DAL-SUSE2 kernel: ReiserFS: md0: warning: sh-2021:
reiserfs_fill_super:  can not find reiserfs on md0

I understand this could be viewed as a problem with the md device driver, but
the oops occurs only with XFS, not with ext3 or ReiserFS.
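
In the meantime, a script that mounts the array could first confirm that the
array is running. The following is a minimal sketch, assuming the usual
/proc/mdstat layout in which a running array is listed as `md0 : active ...`;
the function name `md_is_active` is hypothetical.

```shell
# Return 0 if the named md array is listed as active in an mdstat-format file.
# The second argument defaults to /proc/mdstat; md_is_active is a made-up name.
md_is_active() {
    local name="$1"
    local mdstat="${2:-/proc/mdstat}"
    grep -q "^${name} : active" "$mdstat"
}
```

For example, `md_is_active md0 && mount -t xfs /dev/md0 /data/test` skips the
mount when md0 is not listed as active. This only avoids the trigger; it does
not address the oops itself.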

What else can I provide you with to help solve the issue?
