To: xfs-master@xxxxxxxxxxx
Subject: [xfs-masters] [Bug 720] xfs_da_do_buf error under load on Core 5 current kernel - XFS over LVM on 3Ware 9500 HW RAID
From: bugzilla-daemon@xxxxxxxxxxx
Date: Thu, 21 Sep 2006 22:13:18 -0700
Reply-to: xfs-masters@xxxxxxxxxxx
Sender: xfs-masters-bounce@xxxxxxxxxxx
http://oss.sgi.com/bugzilla/show_bug.cgi?id=720





------- Additional Comments From blackavr@xxxxxxxxxxxxx  2006-09-21 22:13 CST -------
I've got another instance of what appears to be the same issue on another
machine - same kernel, same model motherboard, but 7x500GB drives on one 3Ware
9500. The access pattern was similar, as well.

Sep 21 11:15:17 fhldef5 kernel: xfs_da_do_buf: bno 2568
Sep 21 11:15:17 fhldef5 kernel: dir: inode 117440528
Sep 21 11:15:17 fhldef5 kernel: Filesystem "dm-5": XFS internal error
xfs_da_do_buf(1) at line 2119 of file fs/xfs/xfs_da_btree.c.  Caller 0xf8a6044e
Sep 21 11:15:17 fhldef5 kernel:  <f8a5fff6> xfs_da_do_buf+0x45b/0x829 [xfs] 
<f8a6044e> xfs_da_read_buf+0x30/0x35 [xfs]
Sep 21 11:15:17 fhldef5 kernel:  <f8a6044e> xfs_da_read_buf+0x30/0x35 [xfs] 
<f8a6a329> xfs_dir2_node_addname+0x76d/0xa41 [xfs]
Sep 21 11:15:17 fhldef5 kernel:  <f8a6a329> xfs_dir2_node_addname+0x76d/0xa41
[xfs]  <f8a4ab2a> xfs_attr_fetch+0xb4/0x244 [xfs]
Sep 21 11:15:17 fhldef5 kernel:  <f8a81458> xlog_grant_push_ail+0x34/0xf2 [xfs]
 <f8a64031> xfs_dir2_isleaf+0x1b/0x50 [xfs]
Sep 21 11:15:17 fhldef5 kernel:  <f8a6486b> xfs_dir2_createname+0x101/0x109
[xfs]  <f8a75d9a> xfs_ilock+0x8c/0xd4 [xfs]
Sep 21 11:15:17 fhldef5 kernel:  <f8a95b2e> xfs_link+0x342/0x496 [xfs] 
<f8a9e19b> xfs_vn_permission+0x0/0x13 [xfs]
Sep 21 11:15:17 fhldef5 kernel:  <c0480a4a> dput+0x35/0x230  <f8a9df13>
xfs_vn_link+0x41/0x8e [xfs]
Sep 21 11:15:17 fhldef5 kernel:  <c0439acb> debug_mutex_add_waiter+0x97/0xa9 
<c0477c13> vfs_link+0xdd/0x190
Sep 21 11:15:17 fhldef5 kernel:  <c06171af> __mutex_lock_slowpath+0x339/0x439 
<c0477c13> vfs_link+0xdd/0x190
Sep 21 11:15:17 fhldef5 kernel:  <c0477c13> vfs_link+0xdd/0x190  <c0477c52>
vfs_link+0x11c/0x190
Sep 21 11:15:17 fhldef5 kernel:  <c047a7ea> sys_linkat+0xb1/0xf0  <c0480b0d>
dput+0xf8/0x230
Sep 21 11:15:17 fhldef5 kernel:  <c046c4ce> __fput+0x146/0x170  <c047a858>
sys_link+0x2f/0x33
Sep 21 11:15:17 fhldef5 kernel:  <c0403dd5> sysenter_past_esp+0x56/0x79 
Sep 21 11:15:17 fhldef5 kernel: xfs_da_do_buf: bno 2568
Sep 21 11:15:17 fhldef5 kernel: dir: inode 117440528
Sep 21 11:15:17 fhldef5 kernel: Filesystem "dm-5": XFS internal error
xfs_da_do_buf(1) at line 2119 of file fs/xfs/xfs_da_btree.c.  Caller 0xf8a6044e
Sep 21 11:15:17 fhldef5 kernel:  <f8a5fff6> xfs_da_do_buf+0x45b/0x829 [xfs] 
<f8a6044e> xfs_da_read_buf+0x30/0x35 [xfs]
Sep 21 11:15:17 fhldef5 kernel:  <f8a6044e> xfs_da_read_buf+0x30/0x35 [xfs] 
<f8a6a329> xfs_dir2_node_addname+0x76d/0xa41 [xfs]
Sep 21 11:15:17 fhldef5 kernel:  <f8a6a329> xfs_dir2_node_addname+0x76d/0xa41
[xfs]  <f8a76a06> xfs_iget+0x57a/0x621 [xfs]
Sep 21 11:15:17 fhldef5 kernel:  <f8a97b29> kmem_zone_zalloc+0x1d/0x41 [xfs] 
<f8a8dc53> xfs_trans_iget+0x10a/0x143 [xfs]
Sep 21 11:15:17 fhldef5 kernel:  <f8aa0e83> vfs_init_vnode+0x21/0x25 [xfs] 
<f8a64031> xfs_dir2_isleaf+0x1b/0x50 [xfs]
Sep 21 11:15:17 fhldef5 kernel:  <f8a6486b> xfs_dir2_createname+0x101/0x109
[xfs]  <f8a8e584> xfs_dir_ialloc+0x7b/0x28f [xfs]
Sep 21 11:15:17 fhldef5 kernel:  <f8a8c2d7> xfs_trans_reserve+0xc7/0x18f [xfs] 
<f8a6476a> xfs_dir2_createname+0x0/0x109 [xfs]
Sep 21 11:15:17 fhldef5 kernel:  <f8a9513a> xfs_create+0x40b/0x665 [xfs] 
<f8a9dbde> xfs_vn_mknod+0x1a9/0x398 [xfs]
Sep 21 11:15:17 fhldef5 kernel:  <f8a6ac79> xfs_dir2_leafn_lookup_int+0x3e/0x455
[xfs]  <f8a9a2bf> xfs_buf_rele+0x25/0x7d [xfs]
Sep 21 11:15:17 fhldef5 kernel:  <f8a5f9d4> xfs_da_brelse+0x6b/0x8f [xfs] 
<f8a6919a> xfs_dir2_node_lookup+0x8c/0x95 [xfs]
Sep 21 11:15:17 fhldef5 kernel:  <f8a6494e> xfs_dir2_lookup+0xdb/0xfe [xfs] 
<c0479a73> __link_path_walk+0xc5c/0xd31
Sep 21 11:15:17 fhldef5 kernel:  <c0480a4a> dput+0x35/0x230  <c04850d5>
mntput_no_expire+0x11/0x6e
Sep 21 11:15:17 fhldef5 kernel:  <f8a8e437> xfs_dir_lookup_int+0x30/0xd8 [xfs] 
<c047811a> vfs_create+0xce/0x12e
Sep 21 11:15:17 fhldef5 kernel:  <c047ab59> open_namei+0x176/0x5db  <c046a00a>
do_filp_open+0x25/0x39
Sep 21 11:15:17 fhldef5 kernel:  <c0618c53> do_page_fault+0x2e7/0x6b2 
<c0469da7> get_unused_fd+0xb9/0xc3
Sep 21 11:15:17 fhldef5 kernel:  <c046a060> do_sys_open+0x42/0xb5  <c046a10c>
sys_open+0x1c/0x1e
Sep 21 11:15:17 fhldef5 kernel:  <c0403dd5> sysenter_past_esp+0x56/0x79 
Sep 21 11:15:17 fhldef5 kernel: Filesystem "dm-5": XFS internal error
xfs_trans_cancel at line 1150 of file fs/xfs/xfs_trans.c.  Caller 0xf8a9534c
Sep 21 11:15:17 fhldef5 kernel:  <f8a8c3f8> xfs_trans_cancel+0x59/0xe5 [xfs] 
<f8a9534c> xfs_create+0x61d/0x665 [xfs]
Sep 21 11:15:17 fhldef5 kernel:  <f8a9534c> xfs_create+0x61d/0x665 [xfs] 
<f8a9dbde> xfs_vn_mknod+0x1a9/0x398 [xfs]
Sep 21 11:15:17 fhldef5 kernel:  <f8a6ac79> xfs_dir2_leafn_lookup_int+0x3e/0x455
[xfs]  <f8a9a2bf> xfs_buf_rele+0x25/0x7d [xfs]
Sep 21 11:15:17 fhldef5 kernel:  <f8a5f9d4> xfs_da_brelse+0x6b/0x8f [xfs] 
<f8a6919a> xfs_dir2_node_lookup+0x8c/0x95 [xfs]
Sep 21 11:15:17 fhldef5 kernel:  <f8a6494e> xfs_dir2_lookup+0xdb/0xfe [xfs] 
<c0479a73> __link_path_walk+0xc5c/0xd31
Sep 21 11:15:17 fhldef5 kernel:  <c0480a4a> dput+0x35/0x230  <c04850d5>
mntput_no_expire+0x11/0x6e
Sep 21 11:15:17 fhldef5 kernel:  <f8a8e437> xfs_dir_lookup_int+0x30/0xd8 [xfs] 
<c047811a> vfs_create+0xce/0x12e
Sep 21 11:15:17 fhldef5 kernel:  <c047ab59> open_namei+0x176/0x5db  <c046a00a>
do_filp_open+0x25/0x39
Sep 21 11:15:17 fhldef5 kernel:  <c0618c53> do_page_fault+0x2e7/0x6b2 
<c0469da7> get_unused_fd+0xb9/0xc3
Sep 21 11:15:17 fhldef5 kernel:  <c046a060> do_sys_open+0x42/0xb5  <c046a10c>
sys_open+0x1c/0x1e
Sep 21 11:15:17 fhldef5 kernel:  <c0403dd5> sysenter_past_esp+0x56/0x79 
Sep 21 11:15:17 fhldef5 kernel: xfs_force_shutdown(dm-5,0x8) called from line
1151 of file fs/xfs/xfs_trans.c.  Return address = 0xf8aa0ea8
Sep 21 11:15:17 fhldef5 kernel: Filesystem "dm-5": Corruption of in-memory data
detected.  Shutting down filesystem: dm-5
Sep 21 11:15:17 fhldef5 kernel: Please umount the filesystem, and rectify the
problem(s)
Sep 21 11:22:02 fhldef5 service_check: service email_bodyd required restart!
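For anyone comparing this log against the earlier instance, here is a minimal sketch of pulling the (directory inode, bno) pairs out of a syslog capture. It assumes the two header lines ("xfs_da_do_buf: bno N" followed by "dir: inode N") always appear in that order, as they do above; the function name summarize_da_errors is just an illustrative choice, not part of any XFS tool.

```python
import re

# Patterns for the two header lines the kernel prints before each trace.
BNO_RE = re.compile(r"xfs_da_do_buf: bno (\d+)")
INODE_RE = re.compile(r"dir: inode (\d+)")


def summarize_da_errors(log_text):
    """Return a sorted list of unique (inode, bno) pairs found in the log.

    Assumption: each "bno" line is followed by its matching "dir: inode" line.
    """
    pairs = set()
    bno = None
    for line in log_text.splitlines():
        m = BNO_RE.search(line)
        if m:
            bno = int(m.group(1))
            continue
        m = INODE_RE.search(line)
        if m and bno is not None:
            pairs.add((int(m.group(1)), bno))
            bno = None
    return sorted(pairs)


sample = """\
Sep 21 11:15:17 fhldef5 kernel: xfs_da_do_buf: bno 2568
Sep 21 11:15:17 fhldef5 kernel: dir: inode 117440528
Sep 21 11:15:17 fhldef5 kernel: xfs_da_do_buf: bno 2568
Sep 21 11:15:17 fhldef5 kernel: dir: inode 117440528
"""
print(summarize_da_errors(sample))  # prints [(117440528, 2568)]
```

Both traces above collapse to the same pair, which is consistent with the link and create paths hitting the same directory block.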


This time, I also saved the output of xfs_check and xfs_repair. The pause at 

rebuilding directory inode 117440528

took a long time, I assume because it's a hashed directory structure holding a
large number of files. However, xfs_repair didn't seem to grow.

[root@fhldef5 ~]# xfs_check /dev/mapper/VolGroup00-LogVol07
bad free block nvalid/nused 1/-1 for dir ino 117440528 block 16777216
bad free block nvalid/nused 0/-1 for dir ino 117440528 block 16777217
missing free index for data block 0 in dir ino 117440528
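
A rough sketch for grouping xfs_check complaints by directory inode, in case it helps when comparing several machines. It only assumes that each relevant line contains the text "dir ino N" as in the output above; the helper name complaints_by_dir_inode is made up for illustration and is not part of xfsprogs.

```python
import re
from collections import defaultdict

# Matches the inode number in lines such as "... for dir ino 117440528 ...".
INO_RE = re.compile(r"dir ino (\d+)")


def complaints_by_dir_inode(check_output):
    """Map each directory inode number to the xfs_check lines mentioning it."""
    by_ino = defaultdict(list)
    for line in check_output.splitlines():
        m = INO_RE.search(line)
        if m:
            by_ino[int(m.group(1))].append(line.strip())
    return dict(by_ino)


check_sample = """\
bad free block nvalid/nused 1/-1 for dir ino 117440528 block 16777216
bad free block nvalid/nused 0/-1 for dir ino 117440528 block 16777217
missing free index for data block 0 in dir ino 117440528
"""
for ino, lines in complaints_by_dir_inode(check_sample).items():
    print(ino, len(lines))  # prints 117440528 3
```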

[root@fhldef5 ~]# xfs_repair /dev/mapper/VolGroup00-LogVol07 
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
LEAFN node level is 252 inode 117440528 bno = 8388608
        - agno = 8
        - agno = 9
        - agno = 10
        - agno = 11
        - agno = 12
        - agno = 13
        - agno = 14
        - agno = 15
        - agno = 16
        - agno = 17
        - agno = 18
        - agno = 19
        - agno = 20
        - agno = 21
        - agno = 22
        - agno = 23
        - agno = 24
        - agno = 25
        - agno = 26
        - agno = 27
        - agno = 28
        - agno = 29
        - agno = 30
        - agno = 31
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - clear lost+found (if it exists) ...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
LEAFN node level is 252 inode 117440528 bno = 8388608
        - agno = 8
        - agno = 9
        - agno = 10
        - agno = 11
        - agno = 12
        - agno = 13
        - agno = 14
        - agno = 15
        - agno = 16
        - agno = 17
        - agno = 18
        - agno = 19
        - agno = 20
        - agno = 21
        - agno = 22
        - agno = 23
        - agno = 24
        - agno = 25
        - agno = 26
        - agno = 27
        - agno = 28
        - agno = 29
        - agno = 30
        - agno = 31
Phase 5 - rebuild AG headers and trees...
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - ensuring existence of lost+found directory
        - traversing filesystem starting at / ... 
free block 16777216 for directory inode 117440528 bad nused
rebuilding directory inode 117440528
        - traversal finished ... 
        - traversing all unattached subtrees ... 
        - traversals finished ... 
        - moving disconnected inodes to lost+found ... 
Phase 7 - verify and correct link counts...
done


