XFS hangs

Amit Sahrawat amit.sahrawat83 at gmail.com
Wed Dec 22 00:57:51 CST 2010


I tried changing the locking in:

*File:* xfs_sync.c
*Function:* int xfs_quiesce_data(struct xfs_mount *mp)

 	/* write superblock and hoover up shutdown errors */
-	error = xfs_sync_fsdata(mp, SYNC_WAIT);
+	error = xfs_sync_fsdata(mp, SYNC_TRYLOCK);

This change was made purely out of curiosity. I am trying to reproduce the
hang with it, but have not observed one over many iterations.
I am also looking into possible side effects of this change. Please let me
know if you are aware of any.
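
To make the question about side effects concrete, here is a minimal userspace
sketch of the difference between a blocking lock and a trylock. It is only an
analogy built on POSIX semaphores, not the XFS code: struct demo_buf and all
demo_* functions are hypothetical names invented for this illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <semaphore.h>

/* Hypothetical stand-in for a buffer guarded by a semaphore-style lock. */
struct demo_buf {
    sem_t lock;
};

static void demo_buf_init(struct demo_buf *bp)
{
    int rc = sem_init(&bp->lock, 0, 1);   /* count 1: unlocked */
    assert(rc == 0);
    (void)rc;
}

/* Non-blocking attempt: 0 on success, negative errno (EAGAIN when held). */
static int demo_buf_trylock(struct demo_buf *bp)
{
    return sem_trywait(&bp->lock) == 0 ? 0 : -errno;
}

/*
 * Trylock-style sync: returns 1 if the "write" ran, 0 if it was skipped
 * because the lock was busy. A blocking variant would call sem_wait()
 * here instead and sleep until the holder releases the lock.
 */
static int demo_sync_trylock(struct demo_buf *bp)
{
    if (demo_buf_trylock(bp) != 0)
        return 0;               /* skipped: nothing was written */
    /* ... the write would be issued here ... */
    sem_post(&bp->lock);
    return 1;
}
```

In this sketch, when another path holds the lock, demo_sync_trylock() returns
0 immediately instead of sleeping. If the real SYNC_TRYLOCK path behaves in a
similar way, that could explain why the hang no longer reproduces, but it
would also mean the superblock write is skipped whenever the lock is busy.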

In addition, the code area I suspect is in fs/xfs/xfs_buf_item.c, in the
function void xfs_buf_iodone_callbacks(xfs_buf_t *bp). There,
XFS_BUF_SET_BRELSE_FUNC(bp, xfs_buf_error_relse) registers
xfs_buf_error_relse as the release callback, which will unlock the held lock,
but I doubt that the callback is actually being called. I am still analyzing
this code area.
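
The suspected failure mode can be sketched in userspace as follows. This is a
hedged analogy only, assuming the lock is released solely by a registered
callback: struct demo_buf, demo_error_relse() and the other demo_* names are
hypothetical and do not correspond to real XFS functions. A timed wait stands
in for the blocked unmount so that the example terminates.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <semaphore.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical buffer whose lock is released by a registered callback. */
struct demo_buf {
    sem_t lock;
    void (*relse)(struct demo_buf *bp);
};

/* Release callback: the only place in this sketch that unlocks the buffer. */
static void demo_error_relse(struct demo_buf *bp)
{
    sem_post(&bp->lock);
}

/* Simulated I/O completion; run_callback == 0 models a lost callback. */
static void demo_iodone(struct demo_buf *bp, int run_callback)
{
    if (run_callback && bp->relse != NULL)
        bp->relse(bp);
}

/* Wait up to timeout_ms for the lock: 0 if acquired, negative errno if not. */
static int demo_lock_timeout(struct demo_buf *bp, long timeout_ms)
{
    struct timespec ts;

    clock_gettime(CLOCK_REALTIME, &ts);
    ts.tv_sec += timeout_ms / 1000;
    ts.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (ts.tv_nsec >= 1000000000L) {
        ts.tv_sec += 1;
        ts.tv_nsec -= 1000000000L;
    }
    while (sem_timedwait(&bp->lock, &ts) != 0) {
        if (errno != EINTR)
            return -errno;      /* -ETIMEDOUT: the lock was never released */
    }
    return 0;
}

/* I/O path takes the lock, completion runs, then a later locker tries. */
static int demo_unmount_after_io(int run_callback)
{
    struct demo_buf bp;
    int rc = sem_init(&bp.lock, 0, 1);
    int ret;

    assert(rc == 0);
    (void)rc;
    bp.relse = demo_error_relse;
    sem_wait(&bp.lock);                 /* lock taken at I/O submission */
    demo_iodone(&bp, run_callback);     /* callback may or may not run */
    ret = demo_lock_timeout(&bp, 50);   /* stands in for the blocked unmount */
    sem_destroy(&bp.lock);
    return ret;
}
```

demo_unmount_after_io(1) acquires the lock, while demo_unmount_after_io(0)
times out with -ETIMEDOUT. With an untimed wait, the second case would block
indefinitely, which is the kind of behaviour seen in the hung-task report if
the release callback is indeed never invoked.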

Please update me if this is the right direction.

Thanks & Regards,
Amit Sahrawat




On Wed, Dec 22, 2010 at 12:11 PM, Amit Sahrawat <amit.sahrawat83 at gmail.com> wrote:

> Extremely sorry for the inconvenience; I will take care to post complete
> details in future.
>
> *Test Case:*
> Copy a complex directory structure (a large number of files and
> directories) to my XFS-formatted partition:
> cp -ar /LibExe /usb/sda2
> Unplug the USB device while the copy is in progress.
>
> *Storage: *USB Flash, USB HDD (Both)
>
> *Kernel: *2.6.34
> *Target: *MIPS
> *LOGS:*
> usb 2-1: USB disconnect, address 7
> Device sda2, XFS metadata write error block 0x0 in sda2
> xfs_force_shutdown(sda2,0x1) called from line 1004 of file
> fs/xfs/linux-2.6/xfs_buf.c.  Return address = 0x801cc294
> Filesystem "sda2": I/O Error Detected.  Shutting down filesystem: sda2
> Please umount the filesystem, and rectify the problem(s)
>
> Plug in USB Port1
> sd 7:0:0:0: [sdb] Attached SCSI disk
> Filesystem "sda2": xfs_log_force: error 5 returned.
> Filesystem "sda2": xfs_log_force: error 5 returned.
> Filesystem "sda2": xfs_log_force: error 5 returned.
> Filesystem "sda2": xfs_log_force: error 5 returned.
> INFO: task usb_mount:1858 blocked for more than 120 seconds.
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> usb_mount        D [84a42440] 8032d62c     0  1858   1816 (user thread)
> Stack : 00000107 00000000 85e7be80 00030002 84a425c8 8032d62c 7fffffff 84a42440
>         00000002 8496e200 00000001 00000000 85e7bf00 85e7bef8 7fa2f2e0 8032d62c
>         00000001 801d69a8 85e7bd40 801d6b34 85e7bd4c 8032dc6c 00000000 801dbc80
>         85e7be80 864315a8 8662c980 00000001 00000742 00000000 00000000 84b85800
>         85e7bd90 801d6cc0 7fffffff 84a42440 00000002 8032ee74 00000081 804158a0
>         ...
> Call Trace:
> [<8032d574>] __schedule+0x618/0x6b8 from[<8032d62c>] schedule+0x18/0x3c
> [<8032d62c>] schedule+0x18/0x3c from[<8032dc6c>] schedule_timeout+0x2c/0x1c0
> [<8032dc6c>] schedule_timeout+0x2c/0x1c0 from[<8032ee74>] __down+0x8c/0xdc
> [<8032ee74>] __down+0x8c/0xdc from[<8004500c>] down+0x40/0x88
> [<8004500c>] down+0x40/0x88 from[<801ca838>] xfs_buf_lock+0xcc/0x15c
> [<801ca838>] xfs_buf_lock+0xcc/0x15c from[<801b71a0>] xfs_getsb+0x38/0x54
> [<801b71a0>] xfs_getsb+0x38/0x54 from[<801d64a8>] xfs_sync_fsdata+0x7c/0x154
> [<801d64a8>] xfs_sync_fsdata+0x7c/0x154 from[<801d7284>] xfs_quiesce_data+0x34/0x60
> [<801d7284>] xfs_quiesce_data+0x34/0x60 from[<801d3514>] xfs_fs_sync_fs+0x30/0xec
> [<801d3514>] xfs_fs_sync_fs+0x30/0xec from[<800ba09c>] __fsync_super+0xa4/0xc8
> [<800ba09c>] __fsync_super+0xa4/0xc8 from[<800ba0d4>] fsync_super+0x14/0x28
> [<800ba0d4>] fsync_super+0x14/0x28 from[<800ba4a0>] generic_shutdown_super+0x34/0x190
> [<800ba4a0>] generic_shutdown_super+0x34/0x190 from[<800ba654>] kill_block_super+0x58/0x80
> [<800ba654>] kill_block_super+0x58/0x80 from[<800bac6c>] deactivate_super+0x7c/0x110
> [<800bac6c>] deactivate_super+0x7c/0x110 from[<800d2bbc>] sys_umount+0x310/0x358
> [<800d2bbc>] sys_umount+0x310/0x358 from[<8000ff44>] stack_done+0x20/0x3c
>
> -------------------------------------------------------------------------------------
> Filesystem "sda2": xfs_log_force: error 5 returned.
>
> Please let me know in case more information is needed.
>
> Thanks & Regards,
> Amit Sahrawat
> On Wed, Dec 22, 2010 at 11:32 AM, Dave Chinner <david at fromorbit.com> wrote:
>
>> On Wed, Dec 22, 2010 at 11:05:26AM +0530, Amit Sahrawat wrote:
>> > Hi,
>> > I am encountering hang of XFS filesystem, please find the logs as given
>> > below:
>> > INFO: task usb_mount:1858 blocked for more than 120 seconds.
>> > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> > usb_mount        D [84a42440] 8032d62c     0  1858   1816 (user thread)
>> > Stack : 00000107 00000000 85e7be80 00030002 84a425c8 8032d62c 7fffffff 84a42440
>> >         00000002 8496e200 00000001 00000000 85e7bf00 85e7bef8 7fa2f2e0 8032d62c
>> >         00000001 801d69a8 85e7bd40 801d6b34 85e7bd4c 8032dc6c 00000000 801dbc80
>> >         85e7be80 864315a8 8662c980 00000001 00000742 00000000 00000000 84b85800
>> >         85e7bd90 801d6cc0 7fffffff 84a42440 00000002 8032ee74 00000081 804158a0
>> >         ...
>> > Call Trace:
>> > [<8032d574>] __schedule+0x618/0x6b8 from[<8032d62c>] schedule+0x18/0x3c
>> > [<8032d62c>] schedule+0x18/0x3c from[<8032dc6c>] schedule_timeout+0x2c/0x1c0
>> > [<8032dc6c>] schedule_timeout+0x2c/0x1c0 from[<8032ee74>] __down+0x8c/0xdc
>> > [<8032ee74>] __down+0x8c/0xdc from[<8004500c>] down+0x40/0x88
>> > [<8004500c>] down+0x40/0x88 from[<801ca838>] xfs_buf_lock+0xcc/0x15c
>> > [<801ca838>] xfs_buf_lock+0xcc/0x15c from[<801b71a0>] xfs_getsb+0x38/0x54
>> > [<801b71a0>] xfs_getsb+0x38/0x54 from[<801d64a8>] xfs_sync_fsdata+0x7c/0x154
>> > [<801d64a8>] xfs_sync_fsdata+0x7c/0x154 from[<801d7284>] xfs_quiesce_data+0x34/0x60
>> > [<801d7284>] xfs_quiesce_data+0x34/0x60 from[<801d3514>] xfs_fs_sync_fs+0x30/0xec
>> > [<801d3514>] xfs_fs_sync_fs+0x30/0xec from[<800ba09c>] __fsync_super+0xa4/0xc8
>> > [<800ba09c>] __fsync_super+0xa4/0xc8 from[<800ba0d4>] fsync_super+0x14/0x28
>> > [<800ba0d4>] fsync_super+0x14/0x28 from[<800ba4a0>] generic_shutdown_super+0x34/0x190
>> > [<800ba4a0>] generic_shutdown_super+0x34/0x190 from[<800ba654>] kill_block_super+0x58/0x80
>> > [<800ba654>] kill_block_super+0x58/0x80 from[<800bac6c>] deactivate_super+0x7c/0x110
>> > [<800bac6c>] deactivate_super+0x7c/0x110 from[<800d2bbc>] sys_umount+0x310/0x358
>> > [<800d2bbc>] sys_umount+0x310/0x358 from[<8000ff44>] stack_done+0x20/0x3c
>>
>> Please make sure you paste stack traces cleanly in your emails so we
>> can read them easily.
>> --
>> > After a reboot it works fine, but while in this state no XFS operation
>> > works.
>>
>> What kernel? What did you do to produce the error? What is the output
>> of "echo w > /proc/sysrq-trigger"? Do you have a repeatable test
>> case? What sort of storage are you using? Were there any IO errors
>> before the hang? etc, etc, etc....
>>
>> --
>>
>> For future reference, when you are reporting a problem you need to
>> be specific about what you were doing to cause the problem you are
>> reporting.  Describe your kernel, your storage, your test case, any
>> errors that occurred before the problem you are reporting, etc.
>>
>> We need this information to make any sense of your bug report, but
>> I'm getting tired of having to ask for it every time you report a
>> problem. The more information you put in your bug report, the more
>> likely we are to be able to help you. We don't have unlimited
>> amounts of time (or patience) to drag all the basic details of your
>> problem out of you over 3 or 4 emails, so including it up front will
>> help a lot....
>>
>> Cheers,
>>
>> Dave.
>> --
>> Dave Chinner
>> david at fromorbit.com
>>
>
>