
[Bug 27492] BUG: unable to handle kernel NULL pointer dereference, on high filesystem io

To: xfs-masters@xxxxxxxxxxx
Subject: [Bug 27492] BUG: unable to handle kernel NULL pointer dereference, on high filesystem io
From: bugzilla-daemon@xxxxxxxxxxxxxxxxxxx
Date: Fri, 11 Mar 2011 04:06:06 GMT
Auto-submitted: auto-generated
In-reply-to: <bug-27492-470@xxxxxxxxxxxxxxxxxxxxxxxxx/>
References: <bug-27492-470@xxxxxxxxxxxxxxxxxxxxxxxxx/>
https://bugzilla.kernel.org/show_bug.cgi?id=27492


Katharine Manton <kat@xxxxxxxxxxxxxxxxxx> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kat@xxxxxxxxxxxxxxxxxx




--- Comment #16 from Katharine Manton <kat@xxxxxxxxxxxxxxxxxx>  2011-03-11 04:06:02 ---
I've been having this problem on several systems from 2.6.34 onwards.  Each has
a 2GB partition that is rsynced nightly from a local server (after the local
server rsyncs to a server on the 'net).  The filesystem contains many small
files.

All affected systems are 32-bit.  One 32-bit system wasn't affected; it turns
out I'd formatted its partition with the default 4k block size.  The affected
systems are formatted with a 1k block size.  Booting with 'vmalloc=768M' merely
reduced the frequency of the problem.

I now have a test system set up and don't mind spending some time on this (as
I'm forced to stick with 2.6.32 on two systems for now and have re-formatted
the partition with ext3 on another).  The test system has a spare drive
installed with 3 partitions, formatted with bsize=512, 1k and 4k.

The following occurred while deleting a large number of small files on the
512-byte block fs:

magnum ~ # xfs_info /mnt/512
meta-data=/dev/sdb1              isize=256    agcount=4, agsize=1048576 blks
         =                       sectsz=512   attr=2
data     =                       bsize=512    blocks=4194304, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal               bsize=512    blocks=20480, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=0
realtime =none                   extsz=4096   blocks=0, rtextents=0
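
For reference, the geometry reported by xfs_info above can be checked with a
little arithmetic.  This is only a sketch: the numbers are copied from the
output above, and the variable names are made up for illustration.

```python
# Values copied from the xfs_info output for /mnt/512.
data_bsize = 512        # data section: bytes per filesystem block
data_blocks = 4194304   # data section: total filesystem blocks
agcount = 4             # number of allocation groups
agsize = 1048576        # filesystem blocks per allocation group
dir_bsize = 4096        # naming section: bytes per directory block
log_blocks = 20480      # internal log size in filesystem blocks

fs_bytes = data_bsize * data_blocks
ag_total_blocks = agcount * agsize
fs_blocks_per_dir_block = dir_bsize // data_bsize
log_bytes = log_blocks * data_bsize

print("filesystem size:", fs_bytes // 2**30, "GiB")
print("AGs cover all data blocks:", ag_total_blocks == data_blocks)
print("fs blocks per directory block:", fs_blocks_per_dir_block)
print("log size:", log_bytes // 2**20, "MiB")
```

So on this filesystem each 4096-byte directory block spans eight 512-byte
filesystem blocks, whereas with a 4k block size it would span exactly one.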

Mar 11 03:02:21 magnum kernel: vmap allocation for size 4194304 failed: use
vmalloc=<size> to increase size.
Mar 11 03:02:21 magnum kernel: xfs_buf_get: failed to map pages
Mar 11 03:02:21 magnum kernel: BUG: unable to handle kernel NULL pointer
dereference at 00000008
Mar 11 03:02:21 magnum kernel: IP: [<c10fb360>] xfs_da_do_buf+0x4da/0x5e1
Mar 11 03:02:21 magnum kernel: *pde = 00000000 
Mar 11 03:02:21 magnum kernel: Oops: 0000 [#1] PREEMPT SMP 
Mar 11 03:02:21 magnum kernel: last sysfs file:
/sys/devices/system/cpu/cpu1/cache/index2/shared_cpu_map
Mar 11 03:02:21 magnum kernel: Modules linked in: nfs lockd nfs_acl sunrpc
w83627hf hwmon_vid lm90 hwmon autofs4 w83627hf_wdt raid456 async_pq async_xor
xor async_memcpy async_raid6_recov raid6_pq async_tx raid1 raid0 md_mod
pata_hpt37x snd_ice1712 nvidia(P) snd_ice17xx_ak4xxx snd_ak4xxx_adda snd_cs8427
snd_i2c snd_mpu401_uart snd_intel8x0 snd_rawmidi snd_ac97_codec ac97_bus
snd_pcm snd_seq_device snd_timer usbhid e100 agpgart snd analog hid mii
soundcore ns558 sg pcspkr i2c_amd756 gameport i2c_core parport_pc parport
snd_page_alloc thermal processor button
Mar 11 03:02:21 magnum kernel: 
Mar 11 03:02:21 magnum kernel: Pid: 2725, comm: rm Tainted: P           
2.6.36-gentoo-r5 #10 7DPXDW-P/ 
Mar 11 03:02:21 magnum kernel: EIP: 0060:[<c10fb360>] EFLAGS: 00010246 CPU: 0
Mar 11 03:02:21 magnum kernel: EIP is at xfs_da_do_buf+0x4da/0x5e1
Mar 11 03:02:21 magnum kernel: EAX: 00000001 EBX: f60da400 ECX: fbd5a730 EDX:
00000000
Mar 11 03:02:21 magnum kernel: ESI: 00000000 EDI: 00000000 EBP: c70b5d78 ESP:
c70b5d14
Mar 11 03:02:21 magnum kernel:  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Mar 11 03:02:21 magnum kernel: Process rm (pid: 2725, ti=c70b4000 task=f713c740
task.ti=c70b4000)
Mar 11 03:02:21 magnum kernel: Stack:
Mar 11 03:02:21 magnum kernel:  0012c096 00000000 f64cabc0 f64ca5dc d853fb40
00000001 00000000 00000000
Mar 11 03:02:21 magnum kernel: <0> ffffffff f64ca5c0 c70b5d48 c70b5d4c c70b5d50
c1020717 c70b5d58 c1020717
Mar 11 03:02:21 magnum kernel: <0> 00000001 c70b5d64 c102086e f65eeda0 00000001
00000000 c70b5da0 04000018
Mar 11 03:02:21 magnum kernel: Call Trace:
Mar 11 03:02:21 magnum kernel:  [<c1020717>] ? get_parent_ip+0xb/0x31
Mar 11 03:02:21 magnum kernel:  [<c1020717>] ? get_parent_ip+0xb/0x31
Mar 11 03:02:21 magnum kernel:  [<c102086e>] ? sub_preempt_count+0x7c/0x89
Mar 11 03:02:21 magnum kernel:  [<c10fb4c3>] ? xfs_da_read_buf+0x18/0x1d
Mar 11 03:02:21 magnum kernel:  [<c10fc610>] ?
xfs_da_node_lookup_int+0x4d/0x202
Mar 11 03:02:21 magnum kernel:  [<c10fc610>] ?
xfs_da_node_lookup_int+0x4d/0x202
Mar 11 03:02:21 magnum kernel:  [<c110194a>] ?
xfs_dir2_node_removename+0x3f/0x3e5
Mar 11 03:02:21 magnum kernel:  [<c10fd482>] ? xfs_dir2_isleaf+0x16/0x44
Mar 11 03:02:21 magnum kernel:  [<c10fda50>] ? xfs_dir_removename+0xde/0xe6
Mar 11 03:02:21 magnum kernel:  [<c111f85e>] ? xfs_remove+0x1b3/0x2e0
Mar 11 03:02:21 magnum kernel:  [<c1020717>] ? get_parent_ip+0xb/0x31
Mar 11 03:02:21 magnum kernel:  [<c11286ff>] ? xfs_vn_unlink+0x30/0x62
Mar 11 03:02:21 magnum kernel:  [<c108738b>] ? vfs_rmdir+0x52/0x9e
Mar 11 03:02:21 magnum kernel:  [<c1088bca>] ? do_rmdir+0x7f/0xb7
Mar 11 03:02:21 magnum kernel:  [<c10807c8>] ? fput+0x165/0x16d
Mar 11 03:02:21 magnum kernel:  [<c107dfa3>] ? filp_close+0x51/0x5b
Mar 11 03:02:21 magnum kernel:  [<c1088c26>] ? sys_unlinkat+0x24/0x32
Mar 11 03:02:21 magnum kernel:  [<c10025cc>] ? sysenter_do_call+0x12/0x22
Mar 11 03:02:21 magnum kernel: Code: f0 00 c7 45 b8 00 00 00 00 74 13 8b 4d 18
8d 55 f0 b8 01 00 00 00 e8 06 fa ff ff 89 45 b8 83 7d 14 01 0f 85 82 00 00 00
8b 55 b8 <8b> 4a 08 8b 51 08 8b 01 0f c8 86 f2 0f b7 d2 81 fa ee fb 00 00 
Mar 11 03:02:21 magnum kernel: EIP: [<c10fb360>] xfs_da_do_buf+0x4da/0x5e1
SS:ESP 0068:c70b5d14
Mar 11 03:02:21 magnum kernel: CR2: 0000000000000008
Mar 11 03:02:21 magnum kernel: ---[ end trace 118398ff1b25f91d ]---
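
The register dump and the Code: line above can be cross-checked against the
reported fault address.  The following is a minimal sketch that decodes only
the one instruction form seen at the <8b> marker (a 32-bit MOV load with an
8-bit displacement); the helper names are hypothetical and it is not a general
disassembler.

```python
# Code: bytes copied from the oops above; <..> marks the faulting instruction.
CODE = ("f0 00 c7 45 b8 00 00 00 00 74 13 8b 4d 18 8d 55 f0 b8 01 00 00 00 "
        "e8 06 fa ff ff 89 45 b8 83 7d 14 01 0f 85 82 00 00 00 8b 55 b8 "
        "<8b> 4a 08 8b 51 08 8b 01 0f c8 86 f2 0f b7 d2 81 fa ee fb 00 00")

# 32-bit register names in x86 encoding order.
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def faulting_bytes(code):
    """Return the bytes from the <xx> marker onwards as a list of ints."""
    tokens = code.split()
    start = next(i for i, tok in enumerate(tokens) if tok.startswith("<"))
    return [int(tok.strip("<>"), 16) for tok in tokens[start:]]


def decode_mov_load_disp8(raw):
    """Decode 'mov r32, [r32 + disp8]' (opcode 0x8b, ModRM mod=01, no SIB)."""
    opcode, modrm, disp = raw[0], raw[1], raw[2]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if opcode != 0x8B or mod != 1 or rm == 4:
        raise ValueError("instruction form not handled by this sketch")
    if disp >= 0x80:            # disp8 is a signed byte
        disp -= 0x100
    return REG32[reg], REG32[rm], disp


dest, base, disp = decode_mov_load_disp8(faulting_bytes(CODE))
registers = {"edx": 0x00000000}   # EDX value from the register dump above
fault_address = registers[base] + disp

print("mov %s, [%s%+#x]" % (dest, base, disp))
print("computed fault address:", hex(fault_address))
```

With EDX holding zero, the load through EDX plus 8 lands on address 8, which
agrees with the "dereference at 00000008" message and the CR2 value.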

magnum ~ # gdb /usr/src/linux/vmlinux
GNU gdb (Gentoo 7.2 p1) 7.2
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu".
For bug reporting instructions, please see:
<http://bugs.gentoo.org/>...
Reading symbols from /usr/src/linux-2.6.36-gentoo-r5/vmlinux...done.
(gdb) l *(xfs_da_do_buf+0x4da)
0xc10fb360 is in xfs_da_do_buf (fs/xfs/xfs_da_btree.c:2088).
2083                    xfs_dir2_data_t         *data;
2084                    xfs_dir2_free_t         *free;
2085                    xfs_da_blkinfo_t        *info;
2086                    uint                    magic, magic1;
2087
2088                    info = rbp->data;
2089                    data = rbp->data;
2090                    free = rbp->data;
2091                    magic = be16_to_cpu(info->magic);
2092                    magic1 = be32_to_cpu(data->hdr.magic);
(gdb)
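
Line 2088 reads rbp->data, so a fault at address 8 would be what one expects
if rbp were NULL and the data member sat 8 bytes into the structure.  The
sketch below illustrates that arithmetic with ctypes.  The structure layout is
an assumption for illustration only (DemoDabuf and its field names are
hypothetical, not copied from the kernel headers).

```python
import ctypes


class DemoDabuf(ctypes.Structure):
    """Hypothetical layout: 8 bytes of small fields, then a data pointer."""
    _fields_ = [
        ("count", ctypes.c_int),      # assumed 4-byte field
        ("flag_a", ctypes.c_short),   # assumed 2-byte field
        ("flag_b", ctypes.c_short),   # assumed 2-byte field
        ("data", ctypes.c_void_p),    # the member read at line 2088
    ]


# Offset of the 'data' member from the start of the structure.
data_offset = DemoDabuf.data.offset

# Reading base->data through a NULL base touches address 0 + offset.
null_base = 0
fault_address = null_base + data_offset

print("offset of data:", data_offset)
print("address touched via a NULL base:", hex(fault_address))
```

Whether the real structure is laid out this way should be confirmed against
fs/xfs/xfs_da_btree.h for the kernel in question.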

Next, I'll download and compile the latest vanilla kernel.  I was testing with
2.6.36-gentoo-r5 as I knew I could reliably trigger this bug with it.

-- 
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.
