
Re: Issue on kernel 2.6.35.9 vanilla

To: Dave Chinner <david@xxxxxxxxxxxxx>, xfs@xxxxxxxxxxx
Subject: Re: Issue on kernel 2.6.35.9 vanilla
From: naveen yadav <yad.naveen@xxxxxxxxx>
Date: Wed, 22 Dec 2010 12:30:11 +0530
In-reply-to: <20101222053730.GG4907@dastard>
References: <AANLkTim8VD2NQ1a2t47jOA+MyyVGWejymoJh41jfrRkA@xxxxxxxxxxxxxx> <AANLkTimqrY4z_Xk-8evw88NYshMPnUxcGE2T-WxQyL1o@xxxxxxxxxxxxxx> <20101221213101.GA4907@dastard> <AANLkTik3batsHu57QDUWmspf7C8As75t3mUO6QG_i040@xxxxxxxxxxxxxx> <20101222053730.GG4907@dastard>
Thanks a lot Dave,

Please find attached the log from the xfs_repair -n command.

Thanks a lot

On Wed, Dec 22, 2010 at 11:07 AM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> On Wed, Dec 22, 2010 at 10:27:16AM +0530, naveen yadav wrote:
>> Hi Dave,
>>
>> Please find attached log as suggested by you.
>>
>> Kind regards
>> Naveen
>>
>> On Wed, Dec 22, 2010 at 3:01 AM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
>> > On Tue, Dec 21, 2010 at 07:41:51PM +0530, naveen yadav wrote:
>> >> Hi all,
>> >>
>> >> We have one disk that got corrupted. When I connect it to my PC,
>> >> which runs kernel version 2.6.35.9,
>> >> the disk mounts fine, but when I run the 'ls' command it hangs.
>> >
>> > ls shouldn't hang. This should return:
>> >
>> >> Please find the dmesg.
>> >> /0x22 [xfs]
>> >>  [<e0dfd2c2>] xfs_da_do_buf+0x582/0x628 [xfs]
>> >>  [<e0dfd3ce>] ? xfs_da_read_buf+0x1d/0x22 [xfs]
>> >>  [<e0dfe0d3>] ? xfs_da_node_lookup_int+0x52/0x207 [xfs]
>> >>  [<e0dfd3ce>] ? xfs_da_read_buf+0x1d/0x22 [xfs]
>> >>  [<e0dfd3ce>] xfs_da_read_buf+0x1d/0x22 [xfs]
>> >>  [<e0dfe0d3>] ? xfs_da_node_lookup_int+0x52/0x207 [xfs]
>> >>  [<e0dfe0d3>] xfs_da_node_lookup_int+0x52/0x207 [xfs]
> .....
>> >> c2d9d000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
>> >> Filesystem "sdb2": XFS internal error xfs_da_do_buf(2) at line 2113 of
>> >> file fs/xfs/xfs_da_btree.c.  Caller 0xe0dfd3ce
>
> This is not in your dmesg log. When did it actually happen? Before
> the hung task timer started to trip? From your log:
>
> scsi 5:0:0:0: Direct-Access     SanDisk  Cruzer Blade     1.00 PQ: 0 ANSI: 2
> sd 5:0:0:0: Attached scsi generic sg1 type 0
> sd 5:0:0:0: [sdb] 15625216 512-byte logical blocks: (8.00 GB/7.45 GiB)
> sd 5:0:0:0: [sdb] Write Protect is off
> sd 5:0:0:0: [sdb] Mode Sense: 03 00 00 00
> sd 5:0:0:0: [sdb] Assuming drive cache: write through
> sd 5:0:0:0: [sdb] Assuming drive cache: write through
>  sdb: sdb1 sdb2
> sd 5:0:0:0: [sdb] Assuming drive cache: write through
> sd 5:0:0:0: [sdb] Attached SCSI removable disk
> SELinux: initialized (dev sdb1, type vfat), uses genfs_contexts
> XFS mounting filesystem sdb2
> Starting XFS recovery on filesystem: sdb2 (logdev: internal)
> Ending XFS recovery on filesystem: sdb2 (logdev: internal)
> SELinux: initialized (dev sdb2, type xfs), uses xattr
> INFO: task gvfs-gdu-volume:2311 blocked for more than 120 seconds.
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> gvfs-gdu-volu D 00000026     0  2311      1 0x00000080
>  c6cf9b2c 00000086 a41cc623 00000026 c0a25e00 c0a25e00 c0a25e00 c0a25e00
>  d1290f54 c0a25e00 c0a25e00 000336ad 00000000 cd871c00 00000026 d1290cd0
>  00000000 cd8d2a08 cd8d2a00 7fffffff 7fffffff c6cf9b70 c0781c43 00000000
> Call Trace:
>  [<c0781c43>] schedule_timeout+0x1b/0x95
>  [<c07824d1>] __down_common+0x82/0xb9
>  [<e0e28ae8>] ? _xfs_buf_find+0x122/0x1b8 [xfs]
>  [<c0782567>] __down+0x17/0x19
>  [<c045827c>] down+0x27/0x37
>  [<e0e278da>] xfs_buf_lock+0x67/0x93 [xfs]
>  [<e0e28ae8>] _xfs_buf_find+0x122/0x1b8 [xfs]
>  [<e0e28bde>] xfs_buf_get+0x60/0x149 [xfs]
>  [<e0e28ce9>] xfs_buf_read+0x22/0xb0 [xfs]
>  [<e0e1ffa9>] xfs_trans_read_buf+0x53/0x2e9 [xfs]
>  [<e0dfd151>] xfs_da_do_buf+0x411/0x628 [xfs]
>  [<e0dfe0d3>] ? xfs_da_node_lookup_int+0x52/0x207 [xfs]
>  [<e0dfd3ce>] xfs_da_read_buf+0x1d/0x22 [xfs]
>  [<e0dfe0d3>] ? xfs_da_node_lookup_int+0x52/0x207 [xfs]
>  [<e0dfe0d3>] xfs_da_node_lookup_int+0x52/0x207 [xfs]
>  [<e0e03888>] xfs_dir2_node_lookup+0x5f/0xee [xfs]
>  [<e0dff26a>] xfs_dir_lookup+0xde/0x110 [xfs]
>  [<e0e22c0a>] xfs_lookup+0x50/0x9f [xfs]
>  [<e0e2c5a6>] xfs_vn_lookup+0x3e/0x76 [xfs]
>  [<c04da3b2>] do_lookup+0xc9/0x139
>  [<c04dbd59>] do_last+0x186/0x49f
>  [<c04dc415>] do_filp_open+0x1bd/0x459
>  [<c045b191>] ? timekeeping_get_ns+0x16/0x54
>  [<c05a9170>] ? might_fault+0x1e/0x20
>  [<c04e479a>] ? alloc_fd+0x58/0xbe
>  [<c04d1941>] do_sys_open+0x4d/0xe4
>  [<c047d0f4>] ? audit_syscall_entry+0x12a/0x14c
>  [<c04d1a24>] sys_open+0x23/0x2b
>  [<c0407fd8>] sysenter_do_call+0x12/0x2d
>
> You've got a gvfs (gnome-vfs?) process stuck waiting on a buffer
> lock. The only way I can see it getting stuck here is if the
> buffer has not been unlocked somewhere. It's possible that it is
> stuck on the same buffer that the corruption error came from,
> but the corrupted buffer is unlocked in the error handling path.
> What does `xfs_repair -n` tell you about the filesystem?
>
> FWIW, later on:
>
> ......
> INFO: task gvfsd-trash:1891 blocked for more than 120 seconds.
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> gvfsd-trash   D 00000026     0  1891      1 0x00000080
>
> gvfsd-trash gets stuck on a mutex during a path walk which is
> probably held by the above directory read.
>
> ....
> INFO: task gvfs-gdu-volume:2321 blocked for more than 120 seconds.
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> gvfs-gdu-volu D 00000026     0  2321      1 0x00000088
>
> As does this one.
>
> ....
>
> gvfs-gdu-volu D 00000026     0  1889      1 0x00000080
>  cda3df10 00000086 a422923c 00000026 c0a25e00 c0a25e00 c0a25e00 c0a25e00
>  cd936904 c0a25e00 c0a25e00 000b9ea1 00000000 cda24400 00000026 cd936680
>  00ae3000 c34c01a4 c34c019c cd936680 c34c01a0 cda3df44 c0782093 c34c01ac
> Call Trace:
>  [<c0782093>] __mutex_lock_common+0xe8/0x137
>  [<c0782113>] __mutex_lock_killable_slowpath+0x17/0x19
>  [<c0782160>] ? mutex_lock_killable+0x32/0x45
>  [<c0782160>] mutex_lock_killable+0x32/0x45
>  [<c04deb0a>] vfs_readdir+0x46/0x94
>  [<c04de814>] ? filldir64+0x0/0xf5
>  [<c04debca>] sys_getdents64+0x72/0xb2
>  [<c0407fd8>] sysenter_do_call+0x12/0x2d
>
> And this one, too.
>
> .....
> ls            D 00000000     0  2325   2044 0x00000080
>  c6d03f10 00200086 00000000 00000000 c0a25e00 c0a25e00 c0a25e00 c0a25e00
>  cdacb5c4 c0a25e00 c0a25e00 d6779fa1 00000029 00000000 00000029 cdacb340
>  00000001 c34c01a4 c34c019c cdacb340 c34c01a0 c6d03f44 c0782093 c34c01ac
> Call Trace:
>  [<c0782093>] __mutex_lock_common+0xe8/0x137
>  [<c0782113>] __mutex_lock_killable_slowpath+0x17/0x19
>  [<c0782160>] ? mutex_lock_killable+0x32/0x45
>  [<c0782160>] mutex_lock_killable+0x32/0x45
>  [<c04deb0a>] vfs_readdir+0x46/0x94
>  [<c04de814>] ? filldir64+0x0/0xf5
>  [<c04debca>] sys_getdents64+0x72/0xb2
>  [<c0407fd8>] sysenter_do_call+0x12/0x2d
>
> And finally, there is an ls process that is hung, stuck on a
> directory mutex. Is this the one you were seeing hang rather than
> whatever generated the corruption report?
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@xxxxxxxxxxxxx
>
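
The triage in the quoted reply (one task waiting on an XFS buffer lock, the others queued behind a directory mutex) can be roughly reproduced from a saved dmesg excerpt. The following is a minimal sketch, not a kernel or xfsprogs tool: the names `parse_blocked_tasks`, `wait_reason` and `SAMPLE` are hypothetical, the regular expressions assume the 2.6.35-era trace layout shown in this thread, and the classification is a simple heuristic. Frames marked with `?` are skipped because the kernel reports them as unreliable.

```python
import re

# Task header in a hung-task dump: "<comm> D <pc> <stack> <pid> <ppid> <flags>"
# (layout assumed from the excerpts in this thread).
TASK_RE = re.compile(
    r"^(?P<comm>\S+)\s+D\s+[0-9a-f]+\s+\d+\s+(?P<pid>\d+)\s+\d+\s+0x[0-9a-f]+$"
)
# Stack frame: "[<addr>] [? ]symbol+0xoff/0xlen [module]"
FRAME_RE = re.compile(
    r"^\[<[0-9a-f]+>\]\s+(?P<unreliable>\?\s+)?(?P<sym>[\w.]+)\+0x"
)

# Trimmed excerpt of two traces quoted in this thread.
SAMPLE = """\
> gvfs-gdu-volu D 00000026     0  2311      1 0x00000080
> Call Trace:
>  [<c0781c43>] schedule_timeout+0x1b/0x95
>  [<c07824d1>] __down_common+0x82/0xb9
>  [<e0e28ae8>] ? _xfs_buf_find+0x122/0x1b8 [xfs]
>  [<c045827c>] down+0x27/0x37
>  [<e0e278da>] xfs_buf_lock+0x67/0x93 [xfs]
> ls            D 00000000     0  2325   2044 0x00000080
> Call Trace:
>  [<c0782093>] __mutex_lock_common+0xe8/0x137
>  [<c0782160>] mutex_lock_killable+0x32/0x45
>  [<c04deb0a>] vfs_readdir+0x46/0x94
"""


def strip_quote(line):
    """Drop mail quote markers such as '> ' or '>> > ' and outer whitespace."""
    return re.sub(r"^(?:>\s?)+", "", line).strip()


def parse_blocked_tasks(text):
    """Return one dict per blocked task with its reliable stack frames."""
    tasks = []
    current = None
    for raw in text.splitlines():
        line = strip_quote(raw)
        m = TASK_RE.match(line)
        if m:
            current = {"comm": m["comm"], "pid": int(m["pid"]), "frames": []}
            tasks.append(current)
            continue
        m = FRAME_RE.match(line)
        # Keep only frames without the '?' marker.
        if m and current is not None and not m["unreliable"]:
            current["frames"].append(m["sym"])
    return tasks


def wait_reason(frames):
    """Heuristic guess at what a task sleeps on, based on its frames."""
    if "xfs_buf_lock" in frames:
        return "xfs buffer lock"
    if any(f.startswith("mutex_lock") for f in frames):
        return "mutex"
    return "unknown"


if __name__ == "__main__":
    for task in parse_blocked_tasks(SAMPLE):
        print(task["comm"], task["pid"], wait_reason(task["frames"]))
```

Run against the full dmesg output, a grouping like this makes it easier to see which single task holds up the rest; it does not identify which buffer is locked or who holds it.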

Attachment: xfs_repair_log
Description: Binary data
