On 2011.06.21 at 14:25 +1000, Dave Chinner wrote:
> On Mon, Jun 20, 2011 at 01:13:59PM +0200, Markus Trippelsdorf wrote:
> > On 2011.06.20 at 08:03 +0200, Markus Trippelsdorf wrote:
> > > OK. I was able to reproduce the same hang on a conventional 512 sector drive.
> > > The partition that I've used was the predecessor to the one on the 4k drive. So
> > > it saw roughly the same usage pattern.
> > > 1 29 70 0 0 0: 1 1 98 0 0 0: 1 9 90 0 0 0: 1 53 46 0 0 0| 0 6400k|2047 1414 |0.46 0.60 0.39| 0 25
> > > 0 4 96 0 0 0: 0 0 100 0 0 0: 1 19 80 0 0 0: 0 80 20 0 0 0| 0 2048k|1425 685 |0.46 0.60 0.39| 0 8
> > > 2 1 97 0 0 0: 1 6 93 0 0 0: 0 5 95 0 0 0: 0 83 17 0 0 0| 0 4608k|1624 849 |0.46 0.60 0.39| 0 18
> > > 2 45 53 0 0 0: 1 16 83 0 0 0: 3 20 77 0 0 0: 1 15 84 0 0 0| 0 6400k|2420 1984 |0.46 0.60 0.39| 0 26
> > > 1 19 80 0 0 0: 2 8 90 0 0 0: 0 33 67 0 0 0: 0 33 67 0 0 0| 0 6400k|2694 2134 |0.59 0.63 0.40| 0 25
> > > 2 7 91 0 0 0: 2 1 97 0 0 0: 1 0 99 0 0 0: 0 49 10 41 0 0| 0 8269k|1865 1571 |0.59 0.63 0.40| 0 363
> > > 1 1 98 0 0 0: 1 1 98 0 0 0: 1 1 98 0 0 0: 0 1 0 99 0 0| 0 4778k|1509 1639 |0.59 0.63 0.40| 0 410
> > > 2 0 98 0 0 0: 2 1 97 0 0 0: 1 1 98 0 0 0: 2 0 0 98 0 0| 0 5318k|1663 1809 |0.59 0.63 0.40| 0 426
> > > 1 1 98 0 0 0: 2 7 91 0 0 0: 1 0 99 0 0 0: 1 0 0 99 0 0| 0 5446k|1659 1806 |0.59 0.63 0.40| 0 432
> > > 0 1 99 0 0 0: 1 0 99 0 0 0: 2 0 98 0 0 0: 0 1 17 82 0 0| 0 5472k|1572 1837 |0.62 0.63 0.40| 0 439
> > > 2 0 98 0 0 0: 2 2 96 0 0 0: 0 1 99 0 0 0: 0 1 99 0 0 0| 0 397k|1058 1049 |0.62 0.63 0.40| 0 36
> > > 1 1 98 0 0 0: 1 1 98 0 0 0: 1 1 98 0 0 0: 0 0 100 0 0 0| 0 0 | 617 689 |0.62 0.63 0.40| 0 0
> > > 9 4 87 0 0 0: 4 0 96 0 0 0: 1 1 98 0 0 0: 8 6 87 0 0 0| 0 0 |1234 1961 |0.62 0.63 0.40| 0 0
> > > 0 1 99 0 0 0: 1 1 98 0 0 0: 0 1 99 0 0 0: 0 1 99 0 0 0| 0 0 | 391 403 |0.62 0.63 0.40| 0 0
> > > 1 0 99 0 0 0: 1 1 98 0 0 0: 0 0 100 0 0 0: 0 0 100 0 0 0| 0 0 | 366 375 |0.57 0.62 0.40| 0 0
>
> What is the resolution of the samples here? Where did the hang occur
> during this output?
Resolution is 1 sec. I think the hang occurred when cpu3 wait switched
from 41% to 99% (but I'm not sure).
> FWIW, Can you capture the hang while running 'iostat -x -d -m 1' so
> we can see what is happening with queue depths, average Io sizes,
> etc?
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sdc1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdc1 0.00 0.00 1.00 202.00 0.00 6.31 63.72 3.90 19.21 19.00 19.21 2.57 52.10
sdc1 0.00 0.00 0.00 8.00 0.00 0.25 64.00 0.15 18.50 0.00 18.50 2.50 2.00
sdc1 *hang* 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdc1 0.00 0.00 0.00 116.00 0.00 3.78 66.76 2.25 18.66 0.00 18.66 2.45 28.40
sdc1 0.00 9.00 0.00 210.00 0.00 2.25 21.92 3.96 19.28 0.00 19.28 4.43 93.00
sdc1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
> > Here are two more examples. The time when the hang occurs is marked with
> > "=>":
> > ------cpu0-usage--------------cpu1-usage--------------cpu2-usage--------------cpu3-usage------ --dsk/sdb-- ---system-- ---load-avg--- --dsk/sdb--
> > usr sys idl wai hiq siq:usr sys idl wai hiq siq:usr sys idl wai hiq siq:usr sys idl wai hiq siq| read writ| int csw | 1m 5m 15m |reads writs
> .....
> > 1 10 89 0 0 0: 2 38 61 0 0 0: 1 8 81 10 0 0: 0 64 36 0 0 0| 0 4476k|2378 2010 |1.12 0.59 0.28| 0 146
> > =>0 0 100 0 0 0: 0 1 99 0 0 0: 0 1 0 99 0 0: 0 100 0 0 0 0| 0 0 |1237 283 |1.12 0.59 0.28| 0 0
> > 0 27 73 0 0 0: 1 0 99 0 0 0: 0 0 47 53 0 0: 0 56 44 0 0 0| 0 5812k|1596 857 |1.12 0.59 0.28| 0 182
> .....
> > 7 26 67 0 0 0: 4 1 95 0 0 0: 1 37 62 0 0 0: 3 38 59 0 0 0| 0 864k|2904 2806 |0.70 0.65 0.40| 0 33
> > =>2 1 97 0 0 0: 1 0 99 0 0 0: 0 1 99 0 0 0: 0 100 0 0 0 0| 0 0 |1725 639 |0.89 0.69 0.41| 0 0
> > 1 1 98 0 0 0: 2 1 97 0 0 0: 2 1 97 0 0 0: 0 71 29 0 0 0| 0 3968k|1793 852 |0.89 0.69 0.41| 0 123
> .....
>
> So in both cases here the hang occurs when there is -zero- IO
> occurring, and a CPU has pegged at 100% in system time. That's CPU
> bound doing <something>. It is possible that the CPU is getting
> caught in a loop somewhere or has a -lot- of processing to do before
> progress is made.
>
> I know a CIL commit can take some time to process all the
> objects in a checkpoint, but I haven't seen anything like this.
> You've only got a relatively small log (100MB) and you're only
> removing a kernel source tree, so there really shouldn't be an
> excessive number of objects built up to process per checkpoint.
>
> FWIW, I'm pretty sure a CPU getting stuck like this in the
> filesystem code should not be causing problems with X or other
> non-filesystem workloads. There's 3 idle CPU cores and you are
> running a preemptible kernel, so I really can't see why system time
> spent in XFS would cause mouse or keyboard updates to not be
> processed in a timely manner. We really need to know what is
> consuming all that CPU time.
>
> That is, you really need to get a profile of the rm -rf process - or
> whatever is consuming the CPU time - (e.g. via perf or ftrace)
> across the hang to so we can narrow down the potential cause of the
> latency. Speaking of which, latencytop might be helpful in
> identifying where input is getting held up....
I've recorded a profile with "perf record -g /home/markus/rm_sync"
~ % cat rm_sync
rm -fr /mnt/tmp/tmp/linux && sync
This is the output of "/perf report -g --stdio" (Just the first entries
the full report is attached).
# Events: 1K cycles
#
# Overhead Command Shared Object Symbol
# ........ ....... ................. ..................................
#
5.32% rm [kernel.kallsyms] [k] __memcpy
|
--- __memcpy
|
|--90.77%-- _xfs_trans_commit
| |
| |--62.41%-- xfs_itruncate_finish
| | xfs_inactive
| | xfs_fs_evict_inode
| | evict
| | iput
| | do_unlinkat
| | sys_unlinkat
| | system_call_fastpath
| | unlinkat
| |
| |--19.50%-- xfs_remove
| | xfs_vn_unlink
| | vfs_unlink
| | do_unlinkat
| | sys_unlinkat
| | system_call_fastpath
| | unlinkat
| |
| |--16.70%-- xfs_inactive
| | xfs_fs_evict_inode
| | evict
| | iput
| | do_unlinkat
| | sys_unlinkat
| | system_call_fastpath
| | unlinkat
| |
| --1.39%-- xfs_bmap_finish
| xfs_itruncate_finish
| xfs_inactive
| xfs_fs_evict_inode
| evict
| iput
| do_unlinkat
| sys_unlinkat
| system_call_fastpath
| unlinkat
|
|--5.44%-- xlog_cil_push
| xfs_log_commit_cil
| _xfs_trans_commit
| xfs_remove
| xfs_vn_unlink
| vfs_unlink
| do_unlinkat
| sys_unlinkat
| system_call_fastpath
| unlinkat
|
|--2.53%-- xfs_btree_insrec
| xfs_btree_insert
| xfs_free_ag_extent
| xfs_free_extent
| xfs_bmap_finish
| xfs_itruncate_finish
| xfs_inactive
| xfs_fs_evict_inode
| evict
| iput
| do_unlinkat
| sys_unlinkat
| system_call_fastpath
| unlinkat
|
--1.26%-- xfs_btree_update
xfs_inobt_update
xfs_difree
xfs_ifree
xfs_inactive
xfs_fs_evict_inode
evict
iput
do_unlinkat
sys_unlinkat
system_call_fastpath
unlinkat
3.85% sync [kernel.kallsyms] [k] xfs_inode_ag_walk.isra.7
|
--- xfs_inode_ag_walk.isra.7
xfs_inode_ag_iterator
xfs_sync_data
xfs_quiesce_data
xfs_fs_sync_fs
__sync_filesystem
sync_one_sb
iterate_supers
sync_filesystems
sys_sync
system_call_fastpath
__GI_sync
2.68% sync [kernel.kallsyms] [k] __rcu_read_unlock
|
--- __rcu_read_unlock
mapping_tagged
xfs_sync_inode_data
xfs_inode_ag_walk.isra.7
xfs_inode_ag_iterator
xfs_sync_data
xfs_quiesce_data
xfs_fs_sync_fs
__sync_filesystem
sync_one_sb
iterate_supers
sync_filesystems
sys_sync
system_call_fastpath
__GI_sync
2.56% rm [kernel.kallsyms] [k] xfs_next_bit
|
--- xfs_next_bit
|
|--31.60%-- xfs_log_commit_cil
| _xfs_trans_commit
| |
| |--91.65%-- xfs_itruncate_finish
| | xfs_inactive
| | xfs_fs_evict_inode
| | evict
| | iput
| | do_unlinkat
| | sys_unlinkat
| | system_call_fastpath
| | unlinkat
| |
| --8.35%-- xfs_remove
| xfs_vn_unlink
| vfs_unlink
| do_unlinkat
| sys_unlinkat
| system_call_fastpath
| unlinkat
|
|--28.93%-- xfs_buf_item_size
| xfs_trans_alloc_log_vecs
| _xfs_trans_commit
| |
| |--72.71%-- xfs_itruncate_finish
| | xfs_inactive
| | xfs_fs_evict_inode
| | evict
| | iput
| | do_unlinkat
| | sys_unlinkat
| | system_call_fastpath
| | unlinkat
| |
| |--18.18%-- xfs_inactive
| | xfs_fs_evict_inode
| | evict
| | iput
| | do_unlinkat
| | sys_unlinkat
| | system_call_fastpath
| | unlinkat
| |
| --9.11%-- xfs_remove
| xfs_vn_unlink
| vfs_unlink
| do_unlinkat
| sys_unlinkat
| system_call_fastpath
| unlinkat
|
|--26.33%-- xfs_buf_item_format
| xfs_log_commit_cil
| _xfs_trans_commit
| |
| |--70.10%-- xfs_itruncate_finish
| | xfs_inactive
| | xfs_fs_evict_inode
| | evict
| | iput
| | do_unlinkat
| | sys_unlinkat
| | system_call_fastpath
| | unlinkat
| |
| |--19.91%-- xfs_inactive
| | xfs_fs_evict_inode
| | evict
| | iput
| | do_unlinkat
| | sys_unlinkat
| | system_call_fastpath
| | unlinkat
| |
| --9.99%-- xfs_remove
| xfs_vn_unlink
| vfs_unlink
| do_unlinkat
| sys_unlinkat
| system_call_fastpath
| unlinkat
|
--13.14%-- xfs_trans_alloc_log_vecs
_xfs_trans_commit
|
|--60.17%-- xfs_remove
| xfs_vn_unlink
| vfs_unlink
| do_unlinkat
| sys_unlinkat
| system_call_fastpath
| unlinkat
|
--39.83%-- xfs_inactive
xfs_fs_evict_inode
evict
iput
do_unlinkat
sys_unlinkat
system_call_fastpath
unlinkat
2.56% rm [kernel.kallsyms] [k] kmem_cache_free
|
--- kmem_cache_free
|
|--39.47%-- xfs_trans_free_item_desc
| |
| |--79.98%-- xfs_trans_free_items
| | xfs_log_commit_cil
| | _xfs_trans_commit
| | |
| | |--50.02%-- xfs_inactive
| | | xfs_fs_evict_inode
| | | evict
| | | iput
| | | do_unlinkat
| | | sys_unlinkat
| | | system_call_fastpath
| | | unlinkat
| | |
| | |--33.35%-- xfs_itruncate_finish
| | | xfs_inactive
| | | xfs_fs_evict_inode
| | | evict
| | | iput
| | | do_unlinkat
| | | sys_unlinkat
| | | system_call_fastpath
| | | unlinkat
| | |
| | --16.63%-- xfs_bmap_finish
| | xfs_itruncate_finish
| | xfs_inactive
| | xfs_fs_evict_inode
| | evict
| | iput
| | do_unlinkat
| | sys_unlinkat
| | system_call_fastpath
| | unlinkat
| |
| --20.02%-- xfs_trans_del_item
| xfs_trans_brelse
| xfs_btree_del_cursor
| |
| |--66.61%-- xfs_free_ag_extent
| | xfs_free_extent
| | xfs_bmap_finish
| | |
| | |--50.15%-- xfs_remove
| | | xfs_vn_unlink
| | | vfs_unlink
| | | do_unlinkat
| | | sys_unlinkat
| | | system_call_fastpath
| | | unlinkat
| | |
| | --49.85%-- xfs_itruncate_finish
| | xfs_inactive
| | xfs_fs_evict_inode
| | evict
| | iput
| | do_unlinkat
| | sys_unlinkat
| | system_call_fastpath
| | unlinkat
| |
| --33.39%-- xfs_difree
| xfs_ifree
| xfs_inactive
| xfs_fs_evict_inode
| evict
| iput
| do_unlinkat
| sys_unlinkat
| system_call_fastpath
| unlinkat
|
|--31.56%-- xfs_buf_get
| xfs_buf_read
| xfs_trans_read_buf
| |
| |--49.95%-- xfs_btree_read_buf_block.constprop.22
| | xfs_btree_lookup_get_block
| | xfs_btree_lookup
| | |
| | |--49.98%-- xfs_alloc_lookup_eq
| | | xfs_free_ag_extent
| | | xfs_free_extent
| | | xfs_bmap_finish
| | | xfs_itruncate_finish
| | | xfs_inactive
| | | xfs_fs_evict_inode
| | | evict
| | | iput
| | | do_unlinkat
| | | sys_unlinkat
| | | system_call_fastpath
| | | unlinkat
| | |
| | |--33.29%-- xfs_free_ag_extent
| | | xfs_free_extent
| | | xfs_bmap_finish
| | | xfs_itruncate_finish
| | | xfs_inactive
| | | xfs_fs_evict_inode
| | | evict
| | | iput
| | | do_unlinkat
| | | sys_unlinkat
| | | system_call_fastpath
| | | unlinkat
| | |
| | --16.74%-- xfs_inobt_lookup
| | xfs_difree
| | xfs_ifree
| | xfs_inactive
| | xfs_fs_evict_inode
| | evict
| | iput
| | do_unlinkat
| | sys_unlinkat
| | system_call_fastpath
| | unlinkat
| |
| |--24.99%-- xfs_imap_to_bp.isra.9
| | xfs_itobp
| | xfs_iunlink_remove
| | xfs_ifree
| | xfs_inactive
| | xfs_fs_evict_inode
| | evict
| | iput
| | |
| | |--66.63%-- do_unlinkat
| | | sys_unlinkat
| | | system_call_fastpath
| | | unlinkat
| | |
| | --33.37%-- d_delete
| | vfs_rmdir
| | do_rmdir
| | sys_unlinkat
| | system_call_fastpath
| | unlinkat
| |
| |--16.70%-- xfs_read_agi
| | |
| | |--50.05%-- xfs_iunlink_remove
| | | xfs_ifree
| | | xfs_inactive
| | | xfs_fs_evict_inode
| | | evict
| | | iput
| | | do_unlinkat
| | | sys_unlinkat
| | | system_call_fastpath
| | | unlinkat
| | |
| | --49.95%-- xfs_iunlink
| | xfs_droplink
| | xfs_remove
| | xfs_vn_unlink
| | vfs_unlink
| | do_unlinkat
| | sys_unlinkat
| | system_call_fastpath
| | unlinkat
| |
| --8.36%-- xfs_read_agf
| xfs_alloc_read_agf
| xfs_alloc_fix_freelist
| xfs_free_extent
| xfs_bmap_finish
| xfs_itruncate_finish
| xfs_inactive
| xfs_fs_evict_inode
| evict
| iput
| do_unlinkat
| sys_unlinkat
| system_call_fastpath
| unlinkat
|
|--5.27%-- xfs_trans_free
| _xfs_trans_commit
| |
| |--50.08%-- xfs_inactive
| | xfs_fs_evict_inode
| | evict
| | iput
| | do_unlinkat
| | sys_unlinkat
| | system_call_fastpath
| | unlinkat
| |
| --49.92%-- xfs_remove
| xfs_vn_unlink
| vfs_unlink
| do_unlinkat
| sys_unlinkat
| system_call_fastpath
| unlinkat
|
|--5.27%-- xfs_trans_free_items
| xfs_log_commit_cil
| _xfs_trans_commit
| |
| |--50.13%-- xfs_remove
| | xfs_vn_unlink
| | vfs_unlink
| | do_unlinkat
| | sys_unlinkat
| | system_call_fastpath
| | unlinkat
| |
| --49.87%-- xfs_itruncate_finish
| xfs_inactive
| xfs_fs_evict_inode
| evict
| iput
| do_unlinkat
| sys_unlinkat
| system_call_fastpath
| unlinkat
|
|--2.64%-- xfs_trans_del_item
| xfs_trans_brelse
| xfs_iunlink_remove
| xfs_ifree
| xfs_inactive
| xfs_fs_evict_inode
| evict
| iput
| do_unlinkat
| sys_unlinkat
| system_call_fastpath
| unlinkat
|
|--2.64%-- __rcu_process_callbacks
| rcu_process_callbacks
| __do_softirq
| call_softirq
| do_softirq
| irq_exit
| smp_apic_timer_interrupt
| apic_timer_interrupt
| xfs_btree_lookup
| xfs_inobt_lookup
| xfs_difree
| xfs_ifree
| xfs_inactive
| xfs_fs_evict_inode
| evict
| iput
| do_unlinkat
| sys_unlinkat
| system_call_fastpath
| unlinkat
|
|--2.64%-- free_buffer_head
| try_to_free_buffers
| xfs_vm_releasepage
| try_to_release_page
| block_invalidatepage
| xfs_vm_invalidatepage
| truncate_inode_page
| truncate_inode_pages_range
| truncate_inode_pages
| xfs_fs_evict_inode
| evict
| iput
| do_unlinkat
| sys_unlinkat
| system_call_fastpath
| unlinkat
|
|--2.64%-- xfs_bmap_del_free.constprop.18
| xfs_bmap_finish
| xfs_itruncate_finish
| xfs_inactive
| xfs_fs_evict_inode
| evict
| iput
| do_unlinkat
| sys_unlinkat
| system_call_fastpath
| unlinkat
|
|--2.63%-- xfs_log_done
| xfs_log_commit_cil
| _xfs_trans_commit
| xfs_remove
| xfs_vn_unlink
| vfs_unlink
| do_unlinkat
| sys_unlinkat
| system_call_fastpath
| unlinkat
|
|--2.62%-- do_unlinkat
| sys_unlinkat
| system_call_fastpath
| unlinkat
|
--2.62%-- xfs_difree
xfs_ifree
xfs_inactive
xfs_fs_evict_inode
evict
iput
do_unlinkat
sys_unlinkat
system_call_fastpath
unlinkat
--
Markus
-------------- next part --------------
A non-text attachment was scrubbed...
Name: report_hang.gz
Type: application/octet-stream
Size: 23016 bytes
Desc: not available
URL: <http://oss.sgi.com/pipermail/xfs/attachments/20110621/b1818830/attachment.dll>