
Failing XFS filesystem underlying Ceph OSDs

To: xfs@xxxxxxxxxxx
Subject: Failing XFS filesystem underlying Ceph OSDs
From: Alex Gorbachev <ag@xxxxxxxxxxxxxxxxxxx>
Date: Fri, 3 Jul 2015 05:07:29 -0400
Delivered-to: xfs@xxxxxxxxxxx
Hello, we are seeing this and similar errors on multiple Supermicro nodes running Ceph. The OS is Ubuntu 14.04.2 with kernel 4.1.

Thank you for any info and troubleshooting advice.

Alex Gorbachev

Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.261899] BUG: unable to handle kernel paging request at 000000190000001c
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.261923] IP: [<ffffffff8118e476>] find_get_entries+0x66/0x160
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.261941] PGD 1035954067 PUD 0
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.261955] Oops: 0000 [#1] SMP
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.261969] Modules linked in: xfs libcrc32c ipmi_ssif intel_rapl iosf_mbi x86_pkg_temp_thermal intel_powerclamp coretemp kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd sb_edac edac_core lpc_ich joydev mei_me mei ioatdma wmi 8021q ipmi_si garp 8250_fintek mrp ipmi_msghandler stp llc bonding mac_hid lp parport mlx4_en vxlan ip6_udp_tunnel udp_tunnel hid_generic usbhid hid igb ahci mpt2sas mlx4_core i2c_algo_bit libahci dca raid_class ptp scsi_transport_sas pps_core arcmsr
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.262182] CPU: 10 PID: 8711 Comm: ceph-osd Not tainted 4.1.0-040100-generic #201506220235
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.262197] Hardware name: Supermicro X9DRD-7LN4F(-JBOD)/X9DRD-EF/X9DRD-7LN4F, BIOS 3.0a 12/05/2013
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.262215] task: ffff8800721f1420 ti: ffff880fbad54000 task.ti: ffff880fbad54000
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.262229] RIP: 0010:[<ffffffff8118e476>]  [<ffffffff8118e476>] find_get_entries+0x66/0x160
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.262248] RSP: 0018:ffff880fbad571a8  EFLAGS: 00010246
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.262258] RAX: ffff880004000158 RBX: 000000000000000e RCX: 0000000000000000
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.262303] RDX: ffff880004000158 RSI: ffff880fbad571c0 RDI: 0000001900000000
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.262347] RBP: ffff880fbad57208 R08: 00000000000000c0 R09: 00000000000000ff
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.262391] R10: 0000000000000000 R11: 0000000000000220 R12: 00000000000000b6
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.262435] R13: ffff880fbad57268 R14: 000000000000000a R15: ffff880fbad572d8
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.262479] FS:  00007f98cb0e0700(0000) GS:ffff88103f480000(0000) knlGS:0000000000000000
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.262524] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.262551] CR2: 000000190000001c CR3: 0000001034f0e000 CR4: 00000000000407e0
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.262596] Stack:
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.262618]  ffff880fbad571f8 ffff880cf6076b30 ffff880bdde05da8 00000000000000e6
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.262669]  0000000000000100 ffff880cf6076b28 00000000000000b5 ffff880fbad57258
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.262721]  ffff880fbad57258 ffff880fbad572d8 ffffffffffffffff ffff880cf6076b28
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.262772] Call Trace:
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.262801]  [<ffffffff8119b482>] pagevec_lookup_entries+0x22/0x30
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.262831]  [<ffffffff8119bd84>] truncate_inode_pages_range+0xf4/0x700
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.262862]  [<ffffffff8119c415>] truncate_inode_pages+0x15/0x20
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.262891]  [<ffffffff8119c53f>] truncate_inode_pages_final+0x5f/0xa0
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.262949]  [<ffffffffc0431c2c>] xfs_fs_evict_inode+0x3c/0xe0 [xfs]
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.262981]  [<ffffffff81220558>] evict+0xb8/0x190
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.263009]  [<ffffffff81220671>] dispose_list+0x41/0x50
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.263037]  [<ffffffff8122176f>] prune_icache_sb+0x4f/0x60
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.263067]  [<ffffffff81208ab5>] super_cache_scan+0x155/0x1a0
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.263096]  [<ffffffff8119d26f>] do_shrink_slab+0x13f/0x2c0
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.263126]  [<ffffffff811a22b0>] ? shrink_lruvec+0x330/0x370
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.263157]  [<ffffffff811b4189>] ? isolate_migratepages_block+0x299/0x5c0
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.263188]  [<ffffffff8119d558>] shrink_slab+0xd8/0x110
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.263217]  [<ffffffff811a25bf>] shrink_zone+0x2cf/0x300
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.263246]  [<ffffffff811b4d3d>] ? compact_zone+0x7d/0x4f0
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.263275]  [<ffffffff811a2a64>] shrink_zones+0x104/0x2a0
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.263304]  [<ffffffff811b53ad>] ? compact_zone_order+0x5d/0x70
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.263336]  [<ffffffff810f1666>] ? ktime_get+0x46/0xb0
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.263365]  [<ffffffff811a2cd7>] do_try_to_free_pages+0xd7/0x160
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.263396]  [<ffffffff811a3017>] try_to_free_pages+0xb7/0x170
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.263427]  [<ffffffff8119571a>] __alloc_pages_nodemask+0x5ba/0x9c0
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.263460]  [<ffffffff811dc9bc>] alloc_pages_current+0x9c/0x110
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.263492]  [<ffffffff811e4f2a>] allocate_slab+0x20a/0x2e0
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.263522]  [<ffffffff811e5031>] new_slab+0x31/0x1f0
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.263553]  [<ffffffff817f8dd9>] __slab_alloc+0x18e/0x2a3
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.263584]  [<ffffffff816d7817>] ? __alloc_skb+0x87/0x2b0
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.263614]  [<ffffffff816d77e7>] ? __alloc_skb+0x57/0x2b0
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.263643]  [<ffffffff811e9b7b>] __kmalloc_node_track_caller+0xbb/0x2b0
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.263675]  [<ffffffff816d7817>] ? __alloc_skb+0x87/0x2b0
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.263704]  [<ffffffff816d737c>] __kmalloc_reserve.isra.57+0x3c/0xa0
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.263734]  [<ffffffff816d7817>] __alloc_skb+0x87/0x2b0
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.263766]  [<ffffffff81737de1>] sk_stream_alloc_skb+0x41/0x130
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.263796]  [<ffffffff817388b3>] tcp_sendmsg+0x2d3/0xa90
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.263827]  [<ffffffff81764477>] inet_sendmsg+0x67/0xa0
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.263858]  [<ffffffff816cea54>] ? copy_msghdr_from_user+0x154/0x1b0
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.263891]  [<ffffffff816cdcfd>] sock_sendmsg+0x4d/0x60
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.263920]  [<ffffffff816cef93>] ___sys_sendmsg+0x2b3/0x2c0
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.263950]  [<ffffffff810a853c>] ? ttwu_do_wakeup+0x2c/0x100
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.263979]  [<ffffffff810a8826>] ? ttwu_do_activate.constprop.121+0x66/0x70
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.264011]  [<ffffffff810abef5>] ? try_to_wake_up+0x215/0x2a0
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.264040]  [<ffffffff810abfb0>] ? wake_up_state+0x10/0x20
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.264071]  [<ffffffff810fce86>] ? wake_futex+0x76/0xb0
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.264099]  [<ffffffff810fe192>] ? futex_wake+0x72/0x140
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.264127]  [<ffffffff81222675>] ? __fget_light+0x25/0x70
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.264155]  [<ffffffff816cf9b9>] __sys_sendmsg+0x49/0x90
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.264184]  [<ffffffff816cfa19>] SyS_sendmsg+0x19/0x20
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.264215]  [<ffffffff8180d272>] system_call_fastpath+0x16/0x75
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.264243] Code: 00 4c 89 65 c0 31 d2 e9 86 00 00 00 66 0f 1f 84 00 00 00 00 00 48 8b 3a 48 85 ff 0f 84 ad 00 00 00 40 f6 c7 03 0f 85 a9 00 00 00 <8b> 4f 1c 85 c9 74 e3 8d 71 01 4c 8d 47 1c 89 c8 f0 0f b1 77 1c
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.264467] RIP  [<ffffffff8118e476>] find_get_entries+0x66/0x160
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.264499]  RSP <ffff880fbad571a8>
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.264522] CR2: 000000190000001c
Jul  3 03:42:06 roc-4r-sca020 kernel: [554036.264824] ---[ end trace ae271fe24c8d817e ]---