<div>I tried changing the locking in </div>
<div> </div>
<div><strong>File :</strong> xfs_sync.c</div>
<div><strong>Function :</strong> int xfs_quiesce_data(struct xfs_mount *mp)</div>
<div> /* write superblock and hoover up shutdown errors */<br>- error = xfs_sync_fsdata(mp, SYNC_WAIT);<br>+ error = xfs_sync_fsdata(mp,SYNC_TRYLOCK);</div>
<div> </div>
<div>This change was made just out of curiosity. I am trying to reproduce the hang with it applied, but have not observed one over many iterations so far.</div>
<div>I am also looking into possible side effects of this change. Please let me know if you are aware of any.</div>
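<div>To make the intent of the change concrete, here is a minimal userspace sketch (not the XFS code) of a sync routine that takes its lock with a try-lock instead of a blocking lock. It assumes that SYNC_TRYLOCK makes the superblock buffer lock attempt non-blocking, as the flag name suggests. The names tl_buf, tl_buf_trylock and tl_sync_trylock are hypothetical, and a POSIX semaphore stands in for the buffer lock.</div>

```c
#include <assert.h>
#include <errno.h>
#include <semaphore.h>

/* Hypothetical stand-in for a buffer protected by a semaphore-style lock. */
struct tl_buf {
    sem_t lock;   /* binary semaphore: 1 = free, 0 = held */
    int   writes; /* number of simulated writes performed */
};

static void tl_buf_init(struct tl_buf *bp)
{
    int rc = sem_init(&bp->lock, 0, 1);

    assert(rc == 0);
    (void)rc;
    bp->writes = 0;
}

/* Non-blocking lock attempt: 0 on success, negative errno if unavailable. */
static int tl_buf_trylock(struct tl_buf *bp)
{
    if (sem_trywait(&bp->lock) == 0)
        return 0;
    return (errno == EAGAIN) ? -EBUSY : -errno;
}

static void tl_buf_unlock(struct tl_buf *bp)
{
    sem_post(&bp->lock);
}

/*
 * Try-lock flavour of a sync: returns 1 if the buffer was "written",
 * 0 if the lock was busy and the write was skipped. A blocking flavour
 * would call sem_wait() here and sleep until the holder releases the lock.
 */
static int tl_sync_trylock(struct tl_buf *bp)
{
    if (tl_buf_trylock(bp) != 0)
        return 0;         /* lock held elsewhere: give up, do not sleep */
    bp->writes++;         /* simulated write */
    tl_buf_unlock(bp);
    return 1;
}
```

<div>In this sketch, tl_sync_trylock() returns 1 when the lock is free and returns 0 straight away when the lock is held elsewhere. That would be consistent with the hang no longer showing up, but it also means the write is skipped whenever the lock is busy, which looks like the main side effect to check.</div>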
<div> </div>
<div>In addition, this is the code area that I suspect:</div>
<div>fs/xfs/xfs_buf_item.c</div>
<div>Function: void xfs_buf_iodone_callbacks(xfs_buf_t *bp). In this function,</div>
<div> XFS_BUF_SET_BRELSE_FUNC(bp, xfs_buf_error_relse); registers xfs_buf_error_relse as the release callback, which will unlock the lock that is held, but I really doubt that the callback is actually getting called. I am still analyzing this code area.</div>
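<div>The pattern I have in mind can be sketched in userspace as follows. This is only a model of "the lock is taken at submit time and released by a registered callback", not the actual XFS implementation; cb_buf, cb_submit, cb_iodone and cb_error_relse are hypothetical names, and a POSIX semaphore stands in for the buffer lock.</div>

```c
#include <assert.h>
#include <semaphore.h>
#include <stddef.h>

struct cb_buf;
typedef void (*cb_relse_fn)(struct cb_buf *bp);

/* Hypothetical buffer: a lock plus an optional release callback. */
struct cb_buf {
    sem_t       lock;  /* binary semaphore: 1 = free, 0 = held */
    cb_relse_fn relse; /* release callback, NULL if none registered */
};

static void cb_buf_init(struct cb_buf *bp)
{
    int rc = sem_init(&bp->lock, 0, 1);

    assert(rc == 0);
    (void)rc;
    bp->relse = NULL;
}

/* Release callback: clears itself and drops the buffer lock. */
static void cb_error_relse(struct cb_buf *bp)
{
    bp->relse = NULL;
    sem_post(&bp->lock);
}

/* Submit path: take the lock and register the callback that will drop it. */
static void cb_submit(struct cb_buf *bp)
{
    sem_wait(&bp->lock);
    bp->relse = cb_error_relse;
}

/* Completion path: returns 1 if a callback ran, 0 if none was registered. */
static int cb_iodone(struct cb_buf *bp)
{
    if (bp->relse == NULL)
        return 0;
    bp->relse(bp);
    return 1;
}

/* Probe the lock without blocking: 1 if currently held, 0 if free. */
static int cb_is_locked(struct cb_buf *bp)
{
    if (sem_trywait(&bp->lock) == 0) {
        sem_post(&bp->lock); /* we only probed, so give it back */
        return 0;
    }
    return 1;
}
```

<div>In this model, after cb_submit() the lock stays held until cb_iodone() runs the callback. If the completion path never reaches the callback, any later blocking lock attempt on the same buffer would sleep forever, which is the kind of hang I am trying to confirm or rule out.</div>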
<div> </div>
<div>Please let me know whether this is the right direction.</div>
<div> </div>
<div>Thanks & Regards,</div>
<div>Amit Sahrawat</div>
<div> </div>
<div><br><br> </div>
<div class="gmail_quote">On Wed, Dec 22, 2010 at 12:11 PM, Amit Sahrawat <span dir="ltr"><<a href="mailto:amit.sahrawat83@gmail.com">amit.sahrawat83@gmail.com</a>></span> wrote:<br>
<blockquote style="BORDER-LEFT: #ccc 1px solid; MARGIN: 0px 0px 0px 0.8ex; PADDING-LEFT: 1ex" class="gmail_quote">
<div>Extremely sorry for the inconvenience. I will take care to post complete details in future.</div>
<div> </div>
<div><strong>Test Case : </strong></div>
<div>Copy a complex directory structure (a large number of files and directories) to my XFS-formatted partition:</div>
<div>cp -ar /LibExe /usb/sda2</div>
<div>Unplug the USB device while the copy is in progress.</div>
<div> </div>
<div><strong>Storage: </strong>USB Flash, USB HDD (Both)</div>
<div> </div>
<div><strong>Kernel: </strong>2.6.34</div>
<div><strong>Target: </strong>MIPS</div>
<div><strong>LOGS:</strong></div>
<div>usb 2-1: USB disconnect, address 7<br>Device sda2, XFS metadata write error block 0x0 in sda2<br>xfs_force_shutdown(sda2,0x1) called from line 1004 of file fs/xfs/linux-2.6/xfs_buf.c. Return address = 0x801cc294<br>
Filesystem "sda2": I/O Error Detected. Shutting down filesystem: sda2<br>Please umount the filesystem, and rectify the problem(s)</div>
<div> </div>
<div>Plug in USB Port1<br>sd 7:0:0:0: [sdb] Attached SCSI disk</div>
<div>Filesystem "sda2": xfs_log_force: error 5 returned.</div>
<div>Filesystem "sda2": xfs_log_force: error 5 returned.<br>Filesystem "sda2": xfs_log_force: error 5 returned.</div>
<div>Filesystem "sda2": xfs_log_force: error 5 returned.</div>
<div>
<div></div>
<div class="h5">
<div>INFO: task usb_mount:1858 blocked for more than 120 seconds.<br>"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.<br>usb_mount D [84a42440] 8032d62c 0 1858 1816 (user thread)<br>
Stack : 00000107 00000000 85e7be80 00030002 84a425c8 8032d62c 7fffffff 84a42440<br> 00000002 8496e200 00000001 00000000 85e7bf00 85e7bef8 7fa2f2e0 8032d62c<br> 00000001 801d69a8 85e7bd40 801d6b34 85e7bd4c 8032dc6c 00000000 801dbc80<br>
85e7be80 864315a8 8662c980 00000001 00000742 00000000 00000000 84b85800<br> 85e7bd90 801d6cc0 7fffffff 84a42440 00000002 8032ee74 00000081 804158a0<br> ...<br>Call Trace:<br>[<8032d574>] __schedule+0x618/0x6b8 from[<8032d62c>] schedule+0x18/0x3c<br>
[<8032d62c>] schedule+0x18/0x3c from[<8032dc6c>] schedule_timeout+0x2c/0x1c0<br>[<8032dc6c>] schedule_timeout+0x2c/0x1c0 from[<8032ee74>] __down+0x8c/0xdc<br>[<8032ee74>] __down+0x8c/0xdc from[<8004500c>] down+0x40/0x88<br>
[<8004500c>] down+0x40/0x88 from[<801ca838>] xfs_buf_lock+0xcc/0x15c<br>[<801ca838>] xfs_buf_lock+0xcc/0x15c from[<801b71a0>] xfs_getsb+0x38/0x54<br>[<801b71a0>] xfs_getsb+0x38/0x54 from[<801d64a8>] xfs_sync_fsdata+0x7c/0x154<br>
[<801d64a8>] xfs_sync_fsdata+0x7c/0x154 from[<801d7284>] xfs_quiesce_data+0x34/0x60<br>[<801d7284>] xfs_quiesce_data+0x34/0x60 from[<801d3514>] xfs_fs_sync_fs+0x30/0xec<br>[<801d3514>] xfs_fs_sync_fs+0x30/0xec from[<800ba09c>] __fsync_super+0xa4/0xc8<br>
[<800ba09c>] __fsync_super+0xa4/0xc8 from[<800ba0d4>] fsync_super+0x14/0x28<br>[<800ba0d4>] fsync_super+0x14/0x28 from[<800ba4a0>] generic_shutdown_super+0x34/0x190<br>[<800ba4a0>] generic_shutdown_super+0x34/0x190 from[<800ba654>] kill_block_super+0x58/0x80<br>
[<800ba654>] kill_block_super+0x58/0x80 from[<800bac6c>] deactivate_super+0x7c/0x110<br>[<800bac6c>] deactivate_super+0x7c/0x110 from[<800d2bbc>] sys_umount+0x310/0x358<br>[<800d2bbc>] sys_umount+0x310/0x358 from[<8000ff44>] stack_done+0x20/0x3c</div>
</div></div>
<div>-------------------------------------------------------------------------------------<br>Filesystem "sda2": xfs_log_force: error 5 returned.</div>
<div><br>Please let me know in case more information is needed.</div>
<div> </div>
<div>Thanks & Regards,</div>
<div>Amit Sahrawat<font color="#888888"><br></font></div>
<div>
<div></div>
<div class="h5">
<div class="gmail_quote">On Wed, Dec 22, 2010 at 11:32 AM, Dave Chinner <span dir="ltr"><<a href="mailto:david@fromorbit.com" target="_blank">david@fromorbit.com</a>></span> wrote:<br>
<blockquote style="BORDER-LEFT: #ccc 1px solid; MARGIN: 0px 0px 0px 0.8ex; PADDING-LEFT: 1ex" class="gmail_quote">
<div>
<div></div>
<div>On Wed, Dec 22, 2010 at 11:05:26AM +0530, Amit Sahrawat wrote:<br>> Hi,<br>> I am encountering hang of XFS filesystem, please find the logs as given<br>> below:<br>> INFO: task usb_mount:1858 blocked for more than 120 seconds.<br>
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.<br>> usb_mount D [84a42440] 8032d62c 0 1858<br>> 1816 (user thread)<br>> Stack : 00000107 00000000 85e7be80 00030002 84a425c8 8032d62c 7fffffff<br>
> 84a42440<br>> 00000002 8496e200 00000001 00000000 85e7bf00 85e7bef8 7fa2f2e0<br>> 8032d62c<br>> 00000001 801d69a8 85e7bd40 801d6b34 85e7bd4c 8032dc6c 00000000<br>> 801dbc80<br>> 85e7be80 864315a8 8662c980 00000001 00000742 00000000 00000000<br>
> 84b85800<br>> 85e7bd90 801d6cc0 7fffffff 84a42440 00000002 8032ee74 00000081<br>> 804158a0<br>> ...<br>> Call Trace:<br>> [<8032d574>] __schedule+0x618/0x6b8 from[<8032d62c>] schedule+0x18/0x3c<br>
> [<8032d62c>] schedule+0x18/0x3c from[<8032dc6c>] schedule_timeout+0x2c/0x1c0<br>> [<8032dc6c>] schedule_timeout+0x2c/0x1c0 from[<8032ee74>] __down+0x8c/0xdc<br>> [<8032ee74>] __down+0x8c/0xdc from[<8004500c>] down+0x40/0x88<br>
> [<8004500c>] down+0x40/0x88 from[<801ca838>] xfs_buf_lock+0xcc/0x15c<br>> [<801ca838>] xfs_buf_lock+0xcc/0x15c from[<801b71a0>] xfs_getsb+0x38/0x54<br>> [<801b71a0>] xfs_getsb+0x38/0x54 from[<801d64a8>] xfs_sync_fsdata+0x7c/0x154<br>
> [<801d64a8>] xfs_sync_fsdata+0x7c/0x154 from[<801d7284>]<br>> xfs_quiesce_data+0x34/0x60<br>> [<801d7284>] xfs_quiesce_data+0x34/0x60 from[<801d3514>]<br>> xfs_fs_sync_fs+0x30/0xec<br>
> [<801d3514>] xfs_fs_sync_fs+0x30/0xec from[<800ba09c>]<br>> __fsync_super+0xa4/0xc8<br>> [<800ba09c>] __fsync_super+0xa4/0xc8 from[<800ba0d4>] fsync_super+0x14/0x28<br>> [<800ba0d4>] fsync_super+0x14/0x28 from[<800ba4a0>]<br>
> generic_shutdown_super+0x34/0x190<br>> [<800ba4a0>] generic_shutdown_super+0x34/0x190 from[<800ba654>]<br>> kill_block_super+0x58/0x80<br>> [<800ba654>] kill_block_super+0x58/0x80 from[<800bac6c>]<br>
> deactivate_super+0x7c/0x110<br>> [<800bac6c>] deactivate_super+0x7c/0x110 from[<800d2bbc>]<br>> sys_umount+0x310/0x358<br>> [<800d2bbc>] sys_umount+0x310/0x358 from[<8000ff44>] stack_done+0x20/0x3c<br>
<br></div></div>Please make sure you paste stack traces cleanly in your emails so we<br>can read them easily.<br>
<div>--<br>> After reboot it works fine, but during this state XFS does not works no<br>> operation.<br><br></div>What kernel? What did you do to produce the error? What is the output<br>of "echo w > /proc/sysrq-trigger"? Do you have a repeatable test<br>
case? What sort of storage are you using? Were there any IO errors<br>before the hang? etc, etc, etc....<br><br>--<br><br>For future reference, when you are reporting a problem you need to<br>be specific about what you were doing to cause the problem you are<br>
reporting. Describe your kernel, your storage, your test case, any<br>errors that occurred before the problem you are reporting, etc.<br><br>We need this information to make any sense of your bug report, but<br>I'm getting tired of having to ask for it every time you report a<br>
problem. The more information you put in your bug report, the more<br>likely we are to be able to help you. We don't have unlimited<br>amounts of time (or patience) to drag all the basic details of your<br>problem out of you over 3 or 4 emails, so including it up front will<br>
help a lot....<br><br>Cheers,<br><br>Dave.<br><font color="#888888">--<br>Dave Chinner<br><a href="mailto:david@fromorbit.com" target="_blank">david@fromorbit.com</a><br></font></blockquote></div><br></div></div></blockquote>
</div><br>