| To: | xfs@xxxxxxxxxxx |
|---|---|
| Subject: | XFS file system hang |
| From: | Jinesh Choksi <jinesh.choksi@xxxxxxxxxx> |
| Date: | Mon, 18 Jul 2016 11:16:28 +0100 |
|
Hi,
We recently experienced a kernel hang on a CentOS Linux 7.2.1511 server operating as a Nexus artefacts repository. I believe the hang is file-system related, and I'm looking for assistance in diagnosing the root cause and finding out whether there is a resolution for it. I am unable to reproduce the hang; it has only happened once so far, and the server is operating properly at the moment.

Below are the details requested in the XFS FAQ for reporting problems. If any further information is required, please ask.

TIA and regards,
Jinesh

---------------------------------------
console output when the server was hung
---------------------------------------

[4120921.739069] INFO: task kthreadd:2 blocked for more than 120 seconds.
[4120921.744848] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[4120921.751254] kthreadd        D ffffffffffffffff     0     2      0 0x00000000
[4120921.757324]  ffff8802041734e0 0000000000000046 ffff8802040f0b80 ffff880204173fd8
[4120921.763417]  ffff880204173fd8 ffff880204173fd8 ffff8802040f0b80 ffff880204173628
[4120921.769465]  ffff880204173630 7fffffffffffffff ffff8802040f0b80 ffffffffffffffff
[4120921.775993] Call Trace:
[4120921.780164]  [<ffffffff8163b139>] schedule+0x29/0x70
[4120921.784900]  [<ffffffff81638e29>] schedule_timeout+0x209/0x2d0
[4120921.790127]  [<ffffffff810b5df9>] ? ttwu_do_wakeup+0x19/0xd0
[4120921.796310]  [<ffffffff810b5f8d>] ? ttwu_do_activate.constprop.84+0x5d/0x70
[4120921.801870]  [<ffffffff810b8726>] ? try_to_wake_up+0x1b6/0x300
[4120921.806965]  [<ffffffff8163b506>] wait_for_completion+0x116/0x170
[4120921.811843]  [<ffffffff810b88d0>] ? wake_up_state+0x20/0x20
[4120921.816596]  [<ffffffff8109e7ac>] flush_work+0xfc/0x1c0
[4120921.821211]  [<ffffffff8109a7e0>] ? move_linked_works+0x90/0x90
[4120921.826269]  [<ffffffffa015b43a>] xlog_cil_force_lsn+0x8a/0x210 [xfs]
[4120921.831431]  [<ffffffffa0159a7e>] _xfs_log_force_lsn+0x6e/0x2f0 [xfs]
[4120921.836699]  [<ffffffff816322f5>] ? __slab_free+0x10e/0x277
[4120921.841314]  [<ffffffffa0159d2e>] xfs_log_force_lsn+0x2e/0x90 [xfs]
[4120921.846290]  [<ffffffffa014bfc9>] ? xfs_iunpin_wait+0x19/0x20 [xfs]
[4120921.851084]  [<ffffffffa01484b7>] __xfs_iunpin_wait+0xa7/0x150 [xfs]
[4120921.855634]  [<ffffffff810a6b60>] ? wake_atomic_t_function+0x40/0x40
[4120921.860379]  [<ffffffffa014bfc9>] xfs_iunpin_wait+0x19/0x20 [xfs]
[4120921.864993]  [<ffffffffa014084c>] xfs_reclaim_inode+0x8c/0x350 [xfs]
[4120921.869706]  [<ffffffffa0140d77>] xfs_reclaim_inodes_ag+0x267/0x390 [xfs]
[4120921.874552]  [<ffffffff810b5f8d>] ? ttwu_do_activate.constprop.84+0x5d/0x70
[4120921.879503]  [<ffffffff810b8726>] ? try_to_wake_up+0x1b6/0x300
[4120921.883890]  [<ffffffff810b8893>] ? wake_up_process+0x23/0x40
[4120921.888214]  [<ffffffffa0141923>] xfs_reclaim_inodes_nr+0x33/0x40 [xfs]
[4120921.892889]  [<ffffffffa0150895>] xfs_fs_free_cached_objects+0x15/0x20 [xfs]
[4120921.897710]  [<ffffffff811e0ea8>] prune_super+0xe8/0x170
[4120921.901538]  [<ffffffff8117c795>] shrink_slab+0x165/0x300
[4120921.905558]  [<ffffffff811d60d1>] ? vmpressure+0x21/0x90
[4120921.909477]  [<ffffffff8117f912>] do_try_to_free_pages+0x3c2/0x4e0
[4120921.913752]  [<ffffffff8117fb2c>] try_to_free_pages+0xfc/0x180
[4120921.917891]  [<ffffffff8117382d>] __alloc_pages_nodemask+0x7fd/0xb90
[4120921.922224]  [<ffffffff81078d73>] copy_process.part.25+0x163/0x1610
[4120921.926508]  [<ffffffff810c226e>] ? dequeue_task_fair+0x42e/0x640
[4120921.930581]  [<ffffffff810a5a20>] ? kthread_create_on_node+0x140/0x140
[4120921.934837]  [<ffffffff8107a401>] do_fork+0xe1/0x320
[4120921.938364]  [<ffffffff8107a666>] kernel_thread+0x26/0x30
[4120921.942056]  [<ffffffff810a65f2>] kthreadd+0x2b2/0x2f0
[4120921.945527]  [<ffffffff810a6340>] ? kthread_create_on_cpu+0x60/0x60
[4120921.949393]  [<ffffffff81646118>] ret_from_fork+0x58/0x90
[4120921.953247]  [<ffffffff810a6340>] ? kthread_create_on_cpu+0x60/0x60
[4120921.957337] INFO: task kswapd0:63 blocked for more than 120 seconds.
[4120921.961229] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[4120921.965814] kswapd0         D ffffffffffffffff     0    63      2 0x00000000
[4120921.970177]  ffff8801edebf6f0 0000000000000046 ffff88020433c500 ffff8801edebffd8
[4120921.974816]  ffff8801edebffd8 ffff8801edebffd8 ffff88020433c500 ffff8801edebf838
[4120921.979486]  ffff8801edebf840 7fffffffffffffff ffff88020433c500 ffffffffffffffff
[4120921.984171] Call Trace:
[4120921.986327]  [<ffffffff8163b139>] schedule+0x29/0x70
[4120921.989613]  [<ffffffff81638e29>] schedule_timeout+0x209/0x2d0
[4120921.993136]  [<ffffffff810b5df9>] ? ttwu_do_wakeup+0x19/0xd0
[4120921.996564]  [<ffffffff810b5f8d>] ? ttwu_do_activate.constprop.84+0x5d/0x70
[4120922.000534]  [<ffffffff810b8726>] ? try_to_wake_up+0x1b6/0x300
[4120922.004365]  [<ffffffff8163b506>] wait_for_completion+0x116/0x170
[4120922.008129]  [<ffffffff810b88d0>] ? wake_up_state+0x20/0x20
[4120922.011631]  [<ffffffff8109e7ac>] flush_work+0xfc/0x1c0
[4120922.015008]  [<ffffffff8109a7e0>] ? move_linked_works+0x90/0x90
[4120922.018747]  [<ffffffffa015b43a>] xlog_cil_force_lsn+0x8a/0x210 [xfs]
[4120922.022623]  [<ffffffff81171abb>] ? free_pcppages_bulk+0x34b/0x3a0
[4120922.026463]  [<ffffffffa0159a7e>] _xfs_log_force_lsn+0x6e/0x2f0 [xfs]
[4120922.030559]  [<ffffffff816322f5>] ? __slab_free+0x10e/0x277
[4120922.034123]  [<ffffffffa0159d2e>] xfs_log_force_lsn+0x2e/0x90 [xfs]
[4120922.037986]  [<ffffffffa014bfc9>] ? xfs_iunpin_wait+0x19/0x20 [xfs]
[4120922.041840]  [<ffffffffa01484b7>] __xfs_iunpin_wait+0xa7/0x150 [xfs]
[4120922.045748]  [<ffffffff810a6b60>] ? wake_atomic_t_function+0x40/0x40
[4120922.049556]  [<ffffffffa014bfc9>] xfs_iunpin_wait+0x19/0x20 [xfs]
[4120922.053515]  [<ffffffffa014084c>] xfs_reclaim_inode+0x8c/0x350 [xfs]
[4120922.057429]  [<ffffffffa0140d77>] xfs_reclaim_inodes_ag+0x267/0x390 [xfs]
[4120922.061637]  [<ffffffffa0141923>] xfs_reclaim_inodes_nr+0x33/0x40 [xfs]
[4120922.065723]  [<ffffffffa0150895>] xfs_fs_free_cached_objects+0x15/0x20 [xfs]
[4120922.069902]  [<ffffffff811e0ea8>] prune_super+0xe8/0x170
[4120922.073288]  [<ffffffff8117c795>] shrink_slab+0x165/0x300
[4120922.076715]  [<ffffffff811d6111>] ? vmpressure+0x61/0x90
[4120922.080154]  [<ffffffff81180401>] balance_pgdat+0x4b1/0x5e0
[4120922.083522]  [<ffffffff811806a3>] kswapd+0x173/0x450
[4120922.086630]  [<ffffffff810a6ae0>] ? wake_up_atomic_t+0x30/0x30
[4120922.090186]  [<ffffffff81180530>] ? balance_pgdat+0x5e0/0x5e0
[4120922.093399]  [<ffffffff810a5aef>] kthread+0xcf/0xe0
[4120922.096164]  [<ffffffff810a5a20>] ? kthread_create_on_node+0x140/0x140
[4120922.102513]  [<ffffffff81646118>] ret_from_fork+0x58/0x90
[4120922.105472]  [<ffffffff810a5a20>] ? kthread_create_on_node+0x140/0x140
[4120922.108855] INFO: task xfsaild/xvdb1:499 blocked for more than 120 seconds.
[4120922.112411] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[4120922.116496] xfsaild/xvdb1   D ffffffffffffffff     0   499      2 0x00000000
[4120922.120931]  ffff8801ec897a80 0000000000000046 ffff8801ed827300 ffff8801ec897fd8
[4120922.126370]  ffff8801ec897fd8 ffff8801ec897fd8 ffff8801ed827300 ffff8801ec897bd0
[4120922.131959]  ffff8801ec897bd8 7fffffffffffffff ffff8801ed827300 ffffffffffffffff
[4120922.137824] Call Trace:
[4120922.140429]  [<ffffffff8163b139>] schedule+0x29/0x70
[4120922.144426]  [<ffffffff81638e29>] schedule_timeout+0x209/0x2d0
[4120922.148294]  [<ffffffff81009ee0>] ? xen_clocksource_read+0x20/0x30
[4120922.152181]  [<ffffffff8101cd69>] ? sched_clock+0x9/0x10
[4120922.155625]  [<ffffffff812d0876>] ? __blk_segment_map_sg+0x56/0x1b0
[4120922.159579]  [<ffffffff810b8726>] ? try_to_wake_up+0x1b6/0x300
[4120922.163346]  [<ffffffff8163b506>] wait_for_completion+0x116/0x170
[4120922.166567]  [<ffffffff810b88d0>] ? wake_up_state+0x20/0x20
[4120922.169550]  [<ffffffff8109e7ac>] flush_work+0xfc/0x1c0
[4120922.172533]  [<ffffffff8109a7e0>] ? move_linked_works+0x90/0x90
[4120922.175723]  [<ffffffffa015b43a>] xlog_cil_force_lsn+0x8a/0x210 [xfs]
[4120922.179179]  [<ffffffffa0138d1c>] ? xfs_buf_read_map+0x2c/0x140 [xfs]
[4120922.183095]  [<ffffffffa0159720>] _xfs_log_force+0x70/0x290 [xfs]
[4120922.186887]  [<ffffffffa0159966>] xfs_log_force+0x26/0x80 [xfs]
[4120922.190574]  [<ffffffffa014ce8a>] xfs_iflush+0x22a/0x240 [xfs]
[4120922.194204]  [<ffffffffa015d76e>] xfs_inode_item_push+0xee/0x150 [xfs]
[4120922.198158]  [<ffffffffa0164470>] ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
[4120922.202174]  [<ffffffffa0164771>] xfsaild+0x301/0x5e0 [xfs]
[4120922.205486]  [<ffffffffa0164470>] ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
[4120922.209462]  [<ffffffff810a5aef>] kthread+0xcf/0xe0
[4120922.213098]  [<ffffffff810a5a20>] ? kthread_create_on_node+0x140/0x140
[4120922.217013]  [<ffffffff81646118>] ret_from_fork+0x58/0x90
[4120922.220426]  [<ffffffff810a5a20>] ? kthread_create_on_node+0x140/0x140
[4120922.224358] INFO: task kworker/1:0H:3437 blocked for more than 120 seconds.
[4120922.228432] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[4120922.232937] kworker/1:0H    D ffffffffffffffff     0  3437      2 0x00000080
[4120922.237282] Workqueue: xfs-log/xvdb1 xfs_log_worker [xfs]
[4120922.240825]  ffff8800d287bb30 0000000000000046 ffff880119b18b80 ffff8800d287bfd8
[4120922.245511]  ffff8800d287bfd8 ffff8800d287bfd8 ffff880119b18b80 ffff8800d287bc78
[4120922.250044]  ffff8800d287bc80 7fffffffffffffff ffff880119b18b80 ffffffffffffffff
[4120922.255485] Call Trace:
[4120922.260678]  [<ffffffff8163b139>] schedule+0x29/0x70
[4120922.266190]  [<ffffffff81638e29>] schedule_timeout+0x209/0x2d0
[4120922.269977]  [<ffffffff810b5dc5>] ? check_preempt_curr+0x85/0xa0
[4120922.274092]  [<ffffffff810b5df9>] ? ttwu_do_wakeup+0x19/0xd0
[4120922.279267]  [<ffffffff810b8726>] ? try_to_wake_up+0x1b6/0x300
[4120922.284441]  [<ffffffff8163b506>] wait_for_completion+0x116/0x170
[4120922.298485]  [<ffffffff810b88d0>] ? wake_up_state+0x20/0x20
[4120922.302237]  [<ffffffff8109e7ac>] flush_work+0xfc/0x1c0
[4120922.305656]  [<ffffffff8109a7e0>] ? move_linked_works+0x90/0x90
[4120922.309291]  [<ffffffffa015b43a>] xlog_cil_force_lsn+0x8a/0x210 [xfs]
[4120922.313239]  [<ffffffff810c1a26>] ? dequeue_entity+0x106/0x520
[4120922.317172]  [<ffffffffa0159720>] _xfs_log_force+0x70/0x290 [xfs]
[4120922.321545]  [<ffffffff810c226e>] ? dequeue_task_fair+0x42e/0x640
[4120922.326162]  [<ffffffff810bb768>] ? sched_clock_cpu+0x98/0xc0
[4120922.330583]  [<ffffffff81013588>] ? __switch_to+0xf8/0x4b0
[4120922.334868]  [<ffffffffa0159966>] xfs_log_force+0x26/0x80 [xfs]
[4120922.339370]  [<ffffffffa01599e4>] xfs_log_worker+0x24/0x50 [xfs]
[4120922.343136]  [<ffffffff8109d5fb>] process_one_work+0x17b/0x470
[4120922.347247]  [<ffffffff8109e3cb>] worker_thread+0x11b/0x400
[4120922.350965]  [<ffffffff8109e2b0>] ? rescuer_thread+0x400/0x400
[4120922.354513]  [<ffffffff810a5aef>] kthread+0xcf/0xe0
[4120922.357578]  [<ffffffff810a5a20>] ? kthread_create_on_node+0x140/0x140
[4120922.369011]  [<ffffffff81646118>] ret_from_fork+0x58/0x90
[4120922.373843]  [<ffffffff810a5a20>] ? kthread_create_on_node+0x140/0x140
[4120922.379463] INFO: task kworker/u30:2:14068 blocked for more than 120 seconds.
[4120922.384006] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[4120922.387872] kworker/u30:2   D ffffffffffffffff     0 14068      2 0x00000080
[4120922.391575] Workqueue: writeback bdi_writeback_workfn (flush-202:16)
[4120922.395090]  ffff8800630e7210 0000000000000046 ffff8800c11f6780 ffff8800630e7fd8
[4120922.399023]  ffff8800630e7fd8 ffff8800630e7fd8 ffff8800c11f6780 ffff8800630e7360
[4120922.402933]  ffff8800630e7368 7fffffffffffffff ffff8800c11f6780 ffffffffffffffff
[4120922.406926] Call Trace:
[4120922.408912]  [<ffffffff8163b139>] schedule+0x29/0x70
[4120922.411795]  [<ffffffff81638e29>] schedule_timeout+0x209/0x2d0
[4120922.415046]  [<ffffffff810b5df9>] ? ttwu_do_wakeup+0x19/0xd0
[4120922.418146]  [<ffffffff810b5f8d>] ? ttwu_do_activate.constprop.84+0x5d/0x70
[4120922.421720]  [<ffffffff810b8726>] ? try_to_wake_up+0x1b6/0x300
[4120922.425018]  [<ffffffff8163b506>] wait_for_completion+0x116/0x170
[4120922.428241]  [<ffffffff810b88d0>] ? wake_up_state+0x20/0x20
[4120922.431397]  [<ffffffff8109e7ac>] flush_work+0xfc/0x1c0
[4120922.434393]  [<ffffffff8109a7e0>] ? move_linked_works+0x90/0x90
[4120922.437597]  [<ffffffffa015b43a>] xlog_cil_force_lsn+0x8a/0x210 [xfs]
[4120922.441499]  [<ffffffffa0156127>] ? kmem_zone_alloc+0x77/0x100 [xfs]
[4120922.445773]  [<ffffffffa0159720>] _xfs_log_force+0x70/0x290 [xfs]
[4120922.449958]  [<ffffffffa0137b60>] ? xfs_buf_rele+0x40/0x100 [xfs]
[4120922.453640]  [<ffffffffa0159966>] xfs_log_force+0x26/0x80 [xfs]
[4120922.457413]  [<ffffffffa00fa29f>] xfs_alloc_ag_vextent_near+0x36f/0xa60 [xfs]
[4120922.461804]  [<ffffffffa00fb20d>] xfs_alloc_ag_vextent+0xcd/0x110 [xfs]
[4120922.465858]  [<ffffffffa00fbf89>] xfs_alloc_vextent+0x429/0x5e0 [xfs]
[4120922.469691]  [<ffffffffa010c37f>] xfs_bmap_btalloc+0x3af/0x710 [xfs]
[4120922.473465]  [<ffffffffa010c6ee>] xfs_bmap_alloc+0xe/0x10 [xfs]
[4120922.477311]  [<ffffffffa010d148>] xfs_bmapi_write+0x4d8/0xa90 [xfs]
[4120922.480941]  [<ffffffffa0145c9b>] xfs_iomap_write_allocate+0x14b/0x350 [xfs]
[4120922.485629]  [<ffffffffa0130dc6>] xfs_map_blocks+0x1c6/0x230 [xfs]
[4120922.489841]  [<ffffffffa0131fe3>] xfs_vm_writepage+0x193/0x5d0 [xfs]
[4120922.500558]  [<ffffffff81173d33>] __writepage+0x13/0x50
[4120922.506407]  [<ffffffff81174851>] write_cache_pages+0x251/0x4d0
[4120922.512752]  [<ffffffff81173d20>] ? global_dirtyable_memory+0x70/0x70
[4120922.517334]  [<ffffffff81174b1d>] generic_writepages+0x4d/0x80
[4120922.522017]  [<ffffffffa0131993>] xfs_vm_writepages+0x43/0x50 [xfs]
[4120922.526214]  [<ffffffff81175bce>] do_writepages+0x1e/0x40
[4120922.530955]  [<ffffffff812086b0>] __writeback_single_inode+0x40/0x220
[4120922.536643]  [<ffffffff8120911e>] writeback_sb_inodes+0x25e/0x420
[4120922.542013]  [<ffffffff8120937f>] __writeback_inodes_wb+0x9f/0xd0
[4120922.547375]  [<ffffffff81209bc3>] wb_writeback+0x263/0x2f0
[4120922.552294]  [<ffffffff811f89dc>] ? get_nr_inodes+0x4c/0x70
[4120922.557162]  [<ffffffff8120be4b>] bdi_writeback_workfn+0x2cb/0x460
[4120922.562484]  [<ffffffff8109d5fb>] process_one_work+0x17b/0x470
[4120922.567628]  [<ffffffff8109e3cb>] worker_thread+0x11b/0x400
[4120922.572471]  [<ffffffff8109e2b0>] ? rescuer_thread+0x400/0x400
[4120922.577574]  [<ffffffff810a5aef>] kthread+0xcf/0xe0
[4120922.582063]  [<ffffffff810a5a20>] ? kthread_create_on_node+0x140/0x140
[4120922.587646]  [<ffffffff81646118>] ret_from_fork+0x58/0x90
[4120922.592440]  [<ffffffff810a5a20>] ? kthread_create_on_node+0x140/0x140
[4120922.597920] INFO: task kworker/1:2H:14981 blocked for more than 120 seconds.
[4120922.603822] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[4120922.610202] kworker/1:2H    D ffff88001d3e3dd8     0 14981      2 0x00000080
[4120922.616222]  ffff88001d3e3bf0 0000000000000046 ffff88005dbf9700 ffff88001d3e3fd8
[4120922.622752]  ffff88001d3e3fd8 ffff88001d3e3fd8 ffff88005dbf9700 ffff88001d3e3d58
[4120922.629212]  ffff88001d3e3d60 7fffffffffffffff ffff88005dbf9700 ffff88001d3e3dd8
[4120922.635607] Call Trace:
[4120922.638373]  [<ffffffff8163b139>] schedule+0x29/0x70
[4120922.642868]  [<ffffffff81638e29>] schedule_timeout+0x209/0x2d0
[4120922.647988]  [<ffffffff813a3c3b>] ? xen_send_IPI_one+0x2b/0x30
[4120922.652966]  [<ffffffff81011ba0>] ? xen_smp_send_reschedule+0x10/0x20
[4120922.658465]  [<ffffffff810b5128>] ? resched_task+0xb8/0xd0
[4120922.663425]  [<ffffffff810b5db5>] ? check_preempt_curr+0x75/0xa0
[4120922.668687]  [<ffffffff810b5df9>] ? ttwu_do_wakeup+0x19/0xd0
[4120922.673715]  [<ffffffff8163b506>] wait_for_completion+0x116/0x170
[4120922.678970]  [<ffffffff810b88d0>] ? wake_up_state+0x20/0x20
[4120922.683886]  [<ffffffff810a5988>] kthread_create_on_node+0xa8/0x140
[4120922.689305]  [<ffffffff8109e2b0>] ? rescuer_thread+0x400/0x400
[4120922.694460]  [<ffffffff8109d9da>] create_worker+0xea/0x250
[4120922.699194]  [<ffffffff8109dcd6>] manage_workers.isra.24+0xf6/0x2d0
[4120922.704599]  [<ffffffff8109e5e9>] worker_thread+0x339/0x400
[4120922.709613]  [<ffffffff8109e2b0>] ? rescuer_thread+0x400/0x400
[4120922.714725]  [<ffffffff810a5aef>] kthread+0xcf/0xe0
[4120922.719205]  [<ffffffff810a5a20>] ? kthread_create_on_node+0x140/0x140
[4120922.724918]  [<ffffffff81646118>] ret_from_fork+0x58/0x90
[4120922.729775]  [<ffffffff810a5a20>] ? kthread_create_on_node+0x140/0x140
[4120922.735513] INFO: task kworker/u30:0:15076 blocked for more than 120 seconds.
[4120922.741555] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[4120922.747874] kworker/u30:0   D ffff8800dc79fdd8     0 15076      2 0x00000080
[4120922.753364]  ffff8800dc79fbf0 0000000000000046 ffff88005dbfdc00 ffff8800dc79ffd8
[4120922.758892]  ffff8800dc79ffd8 ffff8800dc79ffd8 ffff88005dbfdc00 ffff8800dc79fd58
[4120922.763731]  ffff8800dc79fd60 7fffffffffffffff ffff88005dbfdc00 ffff8800dc79fdd8
[4120922.768429] Call Trace:
[4120922.770718]  [<ffffffff8163b139>] schedule+0x29/0x70
[4120922.774272]  [<ffffffff81638e29>] schedule_timeout+0x209/0x2d0
[4120922.777813]  [<ffffffff810b5dc5>] ? check_preempt_curr+0x85/0xa0
[4120922.781779]  [<ffffffff810b5df9>] ? ttwu_do_wakeup+0x19/0xd0
[4120922.784985]  [<ffffffff8163b506>] wait_for_completion+0x116/0x170
[4120922.788227]  [<ffffffff810b88d0>] ? wake_up_state+0x20/0x20
[4120922.791261]  [<ffffffff810a5988>] kthread_create_on_node+0xa8/0x140
[4120922.795287]  [<ffffffff8109e2b0>] ? rescuer_thread+0x400/0x400
[4120922.798776]  [<ffffffff8109d9da>] create_worker+0xea/0x250
[4120922.801800]  [<ffffffff8109dcd6>] manage_workers.isra.24+0xf6/0x2d0
[4120922.805166]  [<ffffffff8109e5e9>] worker_thread+0x339/0x400
[4120922.810183]  [<ffffffff8109e2b0>] ? rescuer_thread+0x400/0x400
[4120922.814169]  [<ffffffff810a5aef>] kthread+0xcf/0xe0
[4120922.817285]  [<ffffffff810a5a20>] ? kthread_create_on_node+0x140/0x140
[4120922.821596]  [<ffffffff81646118>] ret_from_fork+0x58/0x90
[4120922.825714]  [<ffffffff810a5a20>] ? kthread_create_on_node+0x140/0x140
[4120922.829190] INFO: task kworker/1:2:15134 blocked for more than 120 seconds.
[4120922.832931] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[4120922.836876] kworker/1:2     D ffff880034f0d500     0 15134      2 0x00000080
[4120922.842589] Workqueue: xfs-cil/xvdb1 xlog_cil_push_work [xfs]
[4120922.847416]  ffff880075483be8 0000000000000046 ffff880034365080 ffff880075483fd8
[4120922.852258]  ffff880075483fd8 ffff880075483fd8 ffff880034365080 ffff880034f0d400
[4120922.856752]  ffff880034f0d528 ffff88013ca56170 ffff880034365080 ffff880034f0d500
[4120922.860940] Call Trace:
[4120922.863004]  [<ffffffff8163b139>] schedule+0x29/0x70
[4120922.867417]  [<ffffffffa0158b9c>] xlog_state_get_iclog_space+0xec/0x2f0 [xfs]
[4120922.873230]  [<ffffffffa0158815>] ? xlog_sync+0x285/0x430 [xfs]
[4120922.877108]  [<ffffffff810b88d0>] ? wake_up_state+0x20/0x20
[4120922.880725]  [<ffffffffa0159077>] xlog_write+0x1b7/0x700 [xfs]
[4120922.884418]  [<ffffffffa015abd9>] xlog_cil_push+0x239/0x420 [xfs]
[4120922.887771]  [<ffffffffa015add5>] xlog_cil_push_work+0x15/0x20 [xfs]
[4120922.891643]  [<ffffffff8109d5fb>] process_one_work+0x17b/0x470
[4120922.896259]  [<ffffffff8109e543>] worker_thread+0x293/0x400
[4120922.900676]  [<ffffffff8109e2b0>] ? rescuer_thread+0x400/0x400
[4120922.904386]  [<ffffffff810a5aef>] kthread+0xcf/0xe0
[4120922.907220]  [<ffffffff810a5a20>] ? kthread_create_on_node+0x140/0x140
[4120922.910680]  [<ffffffff81646118>] ret_from_fork+0x58/0x90
[4120922.913985]  [<ffffffff810a5a20>] ? kthread_create_on_node+0x140/0x140
[4120922.917758] INFO: task xfs_db:15150 blocked for more than 120 seconds.
[4120922.922778] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[4120922.928450] xfs_db          D ffff8801ed87cb00     0 15150  15149 0x00000080
[4120922.932936]  ffff88003fc3fd48 0000000000000086 ffff8801f0cf1700 ffff88003fc3ffd8
[4120922.937513]  ffff88003fc3ffd8 ffff88003fc3ffd8 ffff8801f0cf1700 ffff880035f1b128
[4120922.942401]  ffff8801f0cf1700 0000000000000000 ffff880035f1b000 ffff8801ed87cb00
[4120922.946982] Call Trace:
[4120922.949242]  [<ffffffff8163b139>] schedule+0x29/0x70
[4120922.952840]  [<ffffffffa0159842>] _xfs_log_force+0x192/0x290 [xfs]
[4120922.957966]  [<ffffffff810b88d0>] ? wake_up_state+0x20/0x20
[4120922.962437]  [<ffffffffa0159966>] xfs_log_force+0x26/0x80 [xfs]
[4120922.965795]  [<ffffffffa01510ad>] xfs_fs_sync_fs+0x2d/0x50 [xfs]
[4120922.969715]  [<ffffffff8120fcf2>] sync_filesystem+0x72/0xa0
[4120922.973197]  [<ffffffff8121a344>] fsync_bdev+0x24/0x60
[4120922.976962]  [<ffffffff812d937b>] blkdev_ioctl+0x3bb/0x760
[4120922.980871]  [<ffffffff812199a1>] block_ioctl+0x41/0x50
[4120922.984190]  [<ffffffff811f2175>] do_vfs_ioctl+0x2e5/0x4c0
[4120922.987944]  [<ffffffff8128c01e>] ? file_has_perm+0xae/0xc0
[4120922.991108]  [<ffffffff811f23f1>] SyS_ioctl+0xa1/0xc0
[4120922.994021]  [<ffffffff816461c9>] system_call_fastpath+0x16/0x1b

--------------
Kernel version
--------------
Linux version 3.10.0-327.18.2.el7.x86_64 (builder@xxxxxxxxxxxxxxxxxxxxxxx) (gcc version 4.8.3 20140911 (Red Hat 4.8.3-9) (GCC) ) #1 SMP Thu May 12 11:03:55 UTC 2016

----------------
xfsprogs version
----------------
xfs_repair version 3.2.2

--------------
number of CPUs
--------------
 # cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 63
model name      : Intel(R) Xeon(R) CPU E5-2676 v3 @ 2.40GHz
stepping        : 2
microcode       : 0x25
cpu MHz         : 2400.088
cache size      : 30720 KB
physical id     : 0
siblings        : 2
core id         : 0
cpu cores       : 1
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx rdtscp lm constant_tsc rep_good nopl xtopology eagerfpu pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm fsgsbase bmi1 avx2 smep bmi2 erms invpcid xsaveopt
bogomips        : 4800.17
clflush size    : 64
cache_alignment : 64
address sizes   : 46 bits physical, 48 bits virtual
power management:

processor       : 1
vendor_id       : GenuineIntel
cpu family      : 6
model           : 63
model name      : Intel(R) Xeon(R) CPU E5-2676 v3 @ 2.40GHz
stepping        : 2
microcode       : 0x25
cpu MHz         : 2400.088
cache size      : 30720 KB
physical id     : 0
siblings        : 2
core id         : 0
cpu cores       : 1
apicid          : 1
initial apicid  : 1
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx rdtscp lm constant_tsc rep_good nopl xtopology eagerfpu pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm fsgsbase bmi1 avx2 smep bmi2 erms invpcid xsaveopt
bogomips        : 4800.17
clflush size    : 64
cache_alignment : 64
address sizes   : 46 bits physical, 48 bits virtual
power management:

-------------------------
contents of /proc/meminfo
-------------------------
 # cat /proc/meminfo
MemTotal:        7747768 kB
MemFree:         5771240 kB
MemAvailable:    6161328 kB
Buffers:             460 kB
Cached:           559916 kB
SwapCached:            0 kB
Active:          1459048 kB
Inactive:         314992 kB
Active(anon):    1217424 kB
Inactive(anon):    17100 kB
Active(file):     241624 kB
Inactive(file):   297892 kB
Unevictable:        8644 kB
Mlocked:            8644 kB
SwapTotal:       8388604 kB
SwapFree:        8388604 kB
Dirty:              1536 kB
Writeback:             0 kB
AnonPages:       1222356 kB
Mapped:            47524 kB
Shmem:             17264 kB
Slab:              70744 kB
SReclaimable:      39000 kB
SUnreclaim:        31744 kB
KernelStack:        5616 kB
PageTables:         8696 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    12262488 kB
Committed_AS:    3412392 kB
VmallocTotal:   34359738367 kB
VmallocUsed:       27068 kB
VmallocChunk:   34359697144 kB
HardwareCorrupted:     0 kB
AnonHugePages:   1040384 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:       75776 kB
DirectMap2M:     8312832 kB

------------------------
contents of /proc/mounts
------------------------
 # cat /proc/mounts
rootfs / rootfs rw 0 0
sysfs /sys sysfs rw,seclabel,nosuid,nodev,noexec,relatime 0 0
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
devtmpfs /dev devtmpfs rw,seclabel,nosuid,size=3986336k,nr_inodes=996584,mode=755 0 0
securityfs /sys/kernel/security securityfs rw,nosuid,nodev,noexec,relatime 0 0
tmpfs /dev/shm tmpfs rw,seclabel,nosuid,nodev 0 0
devpts /dev/pts devpts rw,seclabel,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0
tmpfs /run tmpfs rw,seclabel,nosuid,nodev,mode=755 0 0
tmpfs /sys/fs/cgroup tmpfs ro,seclabel,nosuid,nodev,noexec,mode=755 0 0
cgroup /sys/fs/cgroup/systemd cgroup rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd 0 0
pstore /sys/fs/pstore pstore rw,nosuid,nodev,noexec,relatime 0 0
cgroup /sys/fs/cgroup/cpu,cpuacct cgroup rw,nosuid,nodev,noexec,relatime,cpuacct,cpu 0 0
cgroup /sys/fs/cgroup/blkio cgroup rw,nosuid,nodev,noexec,relatime,blkio 0 0
cgroup /sys/fs/cgroup/devices cgroup rw,nosuid,nodev,noexec,relatime,devices 0 0
cgroup /sys/fs/cgroup/freezer cgroup rw,nosuid,nodev,noexec,relatime,freezer 0 0
cgroup /sys/fs/cgroup/hugetlb cgroup rw,nosuid,nodev,noexec,relatime,hugetlb 0 0
cgroup /sys/fs/cgroup/net_cls cgroup rw,nosuid,nodev,noexec,relatime,net_cls 0 0
cgroup /sys/fs/cgroup/cpuset cgroup rw,nosuid,nodev,noexec,relatime,cpuset 0 0
cgroup /sys/fs/cgroup/perf_event cgroup rw,nosuid,nodev,noexec,relatime,perf_event 0 0
cgroup /sys/fs/cgroup/memory cgroup rw,nosuid,nodev,noexec,relatime,memory 0 0
configfs /sys/kernel/config configfs rw,relatime 0 0
/dev/xvda1 / xfs rw,seclabel,relatime,attr2,inode64,noquota 0 0
rpc_pipefs /var/lib/nfs/rpc_pipefs rpc_pipefs rw,relatime 0 0
selinuxfs /sys/fs/selinux selinuxfs rw,relatime 0 0
systemd-1 /proc/sys/fs/binfmt_misc autofs rw,relatime,fd=25,pgrp=1,timeout=300,minproto=5,maxproto=5,direct 0 0
hugetlbfs /dev/hugepages hugetlbfs rw,seclabel,relatime 0 0
mqueue /dev/mqueue mqueue rw,seclabel,relatime 0 0
debugfs /sys/kernel/debug debugfs rw,relatime 0 0
nfsd /proc/fs/nfsd nfsd rw,relatime 0 0
/dev/xvdb1 /opt xfs rw,seclabel,relatime,attr2,inode64,noquota 0 0
tmpfs /run/user/0 tmpfs rw,seclabel,nosuid,nodev,relatime,size=774780k,mode=700 0 0

----------------------------
contents of /proc/partitions
----------------------------
 # cat /proc/partitions
major minor  #blocks  name
 202        0   16777216 xvda
 202        1   16770836 xvda1
 202       16 2147483648 xvdb
 202       17 2147483616 xvdb1

--------------------------------------
RAID layout (hardware and/or software)
--------------------------------------
N/A

-----------------
LVM configuration
-----------------
N/A

---------------------------
type of disks you are using
---------------------------
Amazon Elastic Block Storage - General Purpose SSD

----------------------------
write cache status of drives
----------------------------
N/A

--------------------------------------
size of BBWC and mode it is running in
--------------------------------------
N/A

---------------------------------------------
xfs_info output on the filesystem in question
---------------------------------------------
 # xfs_info /dev/xvdb1
meta-data=/dev/xvdb1             isize=256    agcount=8, agsize=67108862 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=0        finobt=0
data     =                       bsize=4096   blocks=536870896, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
log      =internal               bsize=4096   blocks=131071, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

--------------------------------
dmesg output (/var/log/messages)
--------------------------------
Note: I know that the messages below refer to the xvda1 device and not the xvdb1 device whose filesystem had hung.
Jul 18 09:19:29 ip-172-31-18-53 kernel: XFS (xvda1): Metadata corruption detected at xfs_attr3_leaf_read_verify+0x5e/0x110 [xfs], block 0x14277a0
Jul 18 09:19:29 ip-172-31-18-53 kernel: XFS (xvda1): Unmount and run xfs_repair
Jul 18 09:19:29 ip-172-31-18-53 kernel: XFS (xvda1): First 64 bytes of corrupted metadata buffer:
Jul 18 09:19:29 ip-172-31-18-53 kernel: ffff8801f01a9000: 31 34 33 34 39 38 31 34 33 32 34 39 33 30 3d 31  14349814324930=1
Jul 18 09:19:29 ip-172-31-18-53 kernel: ffff8801f01a9010: 34 36 38 36 36 37 37 35 33 0a 31 34 33 34 39 38  468667753.143498
Jul 18 09:19:29 ip-172-31-18-53 kernel: ffff8801f01a9020: 31 34 32 38 34 37 30 37 3d 31 34 36 38 36 36 36  14284707=1468666
Jul 18 09:19:29 ip-172-31-18-53 kernel: ffff8801f01a9030: 38 32 32 0a 00 00 00 00 00 00 00 00 00 00 00 00  822.............
Jul 18 09:19:29 ip-172-31-18-53 kernel: XFS (xvda1): metadata I/O error: block 0x14277a0 ("xfs_trans_read_buf_map") error 117 numblks 8
Jul 18 09:19:29 ip-172-31-18-53 kernel: SELinux: inode_doinit_with_dentry:  getxattr returned 117 for dev=xvda1 ino=42271394
Jul 18 09:19:30 ip-172-31-18-53 kernel: XFS (xvda1): Metadata corruption detected at xfs_da3_node_read_verify+0x5f/0x150 [xfs], block 0x14277a0
Jul 18 09:19:30 ip-172-31-18-53 kernel: XFS (xvda1): Unmount and run xfs_repair
Jul 18 09:19:30 ip-172-31-18-53 kernel: XFS (xvda1): First 64 bytes of corrupted metadata buffer:
Jul 18 09:19:30 ip-172-31-18-53 kernel: ffff8801f0139000: 31 34 33 34 39 38 31 34 33 32 34 39 33 30 3d 31  14349814324930=1
Jul 18 09:19:30 ip-172-31-18-53 kernel: ffff8801f0139010: 34 36 38 36 36 37 37 35 33 0a 31 34 33 34 39 38  468667753.143498
Jul 18 09:19:30 ip-172-31-18-53 kernel: ffff8801f0139020: 31 34 32 38 34 37 30 37 3d 31 34 36 38 36 36 36  14284707=1468666
Jul 18 09:19:30 ip-172-31-18-53 kernel: ffff8801f0139030: 38 32 32 0a 00 00 00 00 00 00 00 00 00 00 00 00  822.............
Jul 18 09:19:30 ip-172-31-18-53 kernel: XFS (xvda1): Metadata corruption detected at xfs_attr3_leaf_read_verify+0x5e/0x110 [xfs], block 0x1b0
Jul 18 09:19:30 ip-172-31-18-53 kernel: XFS (xvda1): Unmount and run xfs_repair
Jul 18 09:19:30 ip-172-31-18-53 kernel: XFS (xvda1): First 64 bytes of corrupted metadata buffer:
Jul 18 09:19:30 ip-172-31-18-53 kernel: ffff8801f013b000: 3c 3f 78 6d 6c 20 76 65 72 73 69 6f 6e 3d 22 31  <?xml version="1
Jul 18 09:19:30 ip-172-31-18-53 kernel: ffff8801f013b010: 2e 30 22 20 65 6e 63 6f 64 69 6e 67 3d 22 55 54  .0" encoding="UT
Jul 18 09:19:30 ip-172-31-18-53 kernel: ffff8801f013b020: 46 2d 38 22 3f 3e 0a 3c 72 65 70 6f 6d 64 20 78  F-8"?>.<repomd x
Jul 18 09:19:30 ip-172-31-18-53 kernel: ffff8801f013b030: 6d 6c 6e 73 3d 22 68 74 74 70 3a 2f 2f 6c 69 6e  mlns="http://lin
Jul 18 09:19:30 ip-172-31-18-53 kernel: XFS (xvda1): metadata I/O error: block 0x1b0 ("xfs_trans_read_buf_map") error 117 numblks 8
Jul 18 09:19:30 ip-172-31-18-53 kernel: SELinux: inode_doinit_with_dentry:  getxattr returned 117 for dev=xvda1 ino=2943
Jul 18 09:19:31 ip-172-31-18-53 kernel: XFS (xvda1): Metadata corruption detected at xfs_da3_node_read_verify+0x5f/0x150 [xfs], block 0x1b0
Jul 18 09:19:31 ip-172-31-18-53 kernel: XFS (xvda1): Unmount and run xfs_repair
Jul 18 09:19:31 ip-172-31-18-53 kernel: XFS (xvda1): First 64 bytes of corrupted metadata buffer:
Jul 18 09:19:31 ip-172-31-18-53 kernel: ffff8800e91fc000: 3c 3f 78 6d 6c 20 76 65 72 73 69 6f 6e 3d 22 31  <?xml version="1
Jul 18 09:19:31 ip-172-31-18-53 kernel: ffff8800e91fc010: 2e 30 22 20 65 6e 63 6f 64 69 6e 67 3d 22 55 54  .0" encoding="UT
Jul 18 09:19:31 ip-172-31-18-53 kernel: ffff8800e91fc020: 46 2d 38 22 3f 3e 0a 3c 72 65 70 6f 6d 64 20 78  F-8"?>.<repomd x
Jul 18 09:19:31 ip-172-31-18-53 kernel: ffff8800e91fc030: 6d 6c 6e 73 3d 22 68 74 74 70 3a 2f 2f 6c 69 6e  mlns="http://lin

-------------------------------------------
Running xfs_repair on the /dev/xvdb1 volume
-------------------------------------------
 # xfs_repair /dev/xvdb1
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
Phase 5 - rebuild AG headers and trees...
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
done

At the moment, I can't run xfs_repair on the xvda1 file system, since it's an AWS EC2 root EBS volume with marketplace codes on it and I can't attach it as a secondary disk to another instance to run xfs_repair on it.
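One way around that restriction might be to check a copy of the volume rather than the live root device. This is only a sketch, not something I have run here; the volume ID, snapshot ID, availability zone, and the device name the helper instance sees are all placeholders:

```shell
# Hypothetical workaround (all IDs, zones, and device names are placeholders):
# snapshot the root EBS volume, build a new volume from the snapshot, and
# inspect that copy from a second instance, leaving the original untouched.
aws ec2 create-snapshot --volume-id vol-xxxxxxxx --description "xvda1 before repair"
aws ec2 create-volume --snapshot-id snap-xxxxxxxx --availability-zone us-east-1a

# After attaching the new volume to a helper instance (say it appears as
# /dev/xvdf), run a read-only check first; -n reports problems without
# modifying the filesystem.
xfs_repair -n /dev/xvdf1
# If the findings look sane, a real repair of the copy can follow:
# xfs_repair /dev/xvdf1
```

Since `xfs_repair -n` is a no-modify mode, the check on the copy should be safe to run while deciding what to do about the original volume.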