Hi
a couple of days ago, one of our XFS pools ran full, which we did not
realize due to some broken monitoring.
After deleting files, the FS is now filled to about 80-85%, and file
creation as well as deletion is extremely slow:
$ cd /srv
$ time touch test
touch test 0.00s user 0.17s system 5% cpu 2.993 total
$ time touch test
touch test 0.00s user 0.00s system 0% cpu 0.002 total
$ time rm test
rm test 0.00s user 0.00s system 0% cpu 4.954 total
As we had seen some XFS error messages like XFS: Internal error
XFS_WANT_CORRUPTED_GOTO [1] shortly before the file system was completely
full, we unmounted it and successfully ran xfs_repair on it [2].
We started to dig into the fragmentation levels, but to be honest, I'm
not too sure how to interpret the numbers. They all seem pretty innocent
to me except for the very high directory fragmentation:
# mount
[...]
/dev/sdb1 on /srv type xfs (rw,relatime,attr2,inode64,noquota)
# uname -a
Linux einstein4 3.14.24-atlas #1 SMP Wed Nov 19 08:29:23 UTC 2014 x86_64
GNU/Linux
(after the reboot to add more RAM; before that we ran 3.14.11)
# df -h /dev/sdb1
Filesystem Size Used Avail Use% Mounted on
/dev/sdb1 15T 12T 2.7T 82% /srv
# df -i /dev/sdb1
Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/sdb1 3105468352 554441361 2551026991 18% /srv
# xfs_info /dev/sdb1
meta-data=/dev/sdb1              isize=256    agcount=15, agsize=268435455 blks
         =                       sectsz=512   attr=2
data     =                       bsize=4096   blocks=3881835478, imaxpct=5
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal               bsize=4096   blocks=521728, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
(underneath is an Areca-backed 16-disk RAID 10, and yes, we possibly
should have gone with more AGs; hindsight is great ;) a rough mkfs
sketch for that follows below)
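If I read the mkfs.xfs(8) options right, forcing a higher AG count at
creation time would be something like the following (agcount=32 is just
a placeholder value on my part, not a tested recommendation):

# mkfs.xfs -d agcount=32 /dev/sdb1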
# xfs_db -r -c "frag" /dev/sdb1
actual 562488447, ideal 553668062, fragmentation factor 1.57%
# xfs_db -r -c "freesp" /dev/sdb1
from to extents blocks pct
1 1 9669664 9669664 1.36
2 3 5634086 12956862 1.83
4 7 1720988 8320694 1.17
8 15 314377 3088702 0.44
16 31 231279 4941737 0.70
32 63 10501367 372068430 52.45
64 127 2604359 204737946 28.86
128 255 377595 63623350 8.97
256 511 58744 20198673 2.85
512 1023 8843 5759074 0.81
1024 2047 1044 1418464 0.20
2048 4095 255 685540 0.10
4096 8191 81 453953 0.06
8192 16383 14 126863 0.02
16384 32767 18 373876 0.05
32768 65535 11 435041 0.06
65536 131071 6 458767 0.06
# xfs_db -r -c "frag -d" /dev/sdb1
actual 7987218, ideal 67532, fragmentation factor 99.15%
# xfs_db -r -c "frag -f" /dev/sdb1
actual 555079796, ideal 554171759, fragmentation factor 0.16%
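For my own sanity, the fragmentation factor appears to be simply
(actual - ideal) / actual; checking the directory numbers by hand:

# echo "scale=2; (7987218 - 67532) * 100 / 7987218" | bc
99.15

In other words, the directories occupy roughly 118 times as many extents
as they ideally would, while the file data itself looks fine.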
Finally, we still see XFS error messages in the kernel logs [3].
Questions:
* Would xfs_fsr be able to cure this problem?
* Would moving to xfsprogs 3.2 and a kernel > 3.15 with on-disk format
version 5 plus CRCs help to protect against this kind of "wear"?
(Rough example invocations for both are sketched below.)
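To be concrete, my guess at the invocations (straight from the man
pages, untested on this pool, so please correct me) would be

# xfs_fsr -v /srv

to defragment the files in place, and

# mkfs.xfs -m crc=1 /dev/sdb1

to re-create the filesystem with the v5/CRC format, which of course
would mean migrating the 12T of data off and back again.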
Cheers
Carsten
PS: Please Cc me, as I'm not subscribed ATM.
[1] Jan 7 20:58:47 einstein4 kernel: [14029681.124548] XFS: Internal
error XFS_WANT_CORRUPTED_GOTO at line 1419 of file fs/xfs/xfs_alloc.c.
Caller 0xffffffffa023594d
Jan 7 20:58:47 einstein4 kernel: [14029681.124657] CPU: 5 PID: 13437
Comm: kworker/5:1 Not tainted 3.14.11-atlas #1
Jan 7 20:58:47 einstein4 kernel: [14029681.124745] Hardware name:
Supermicro X7DVL/X7DVL, BIOS 6.00 11/02/2007
Jan 7 20:58:47 einstein4 kernel: [14029681.124860] Workqueue: xfsalloc
xfs_bmapi_allocate_worker [xfs]
Jan 7 20:58:47 einstein4 kernel: [14029681.124977] 0000000000000000
ffff88013b7ebc80 ffffffff814e9555 ffff88017022bd30
Jan 7 20:58:47 einstein4 kernel: [14029681.125247] ffffffffa0234aed
ffff88041877c380 0000000000000000 000000000000000c
Jan 7 20:58:47 einstein4 kernel: [14029681.125514] 0000000000000000
0000000100000000 0ff0b98800000000 0000000100000002
Jan 7 20:58:47 einstein4 kernel: [14029681.125783] Call Trace:
Jan 7 20:58:47 einstein4 kernel: [14029681.125856]
[<ffffffff814e9555>] ? dump_stack+0x41/0x51
Jan 7 20:58:47 einstein4 kernel: [14029681.125937]
[<ffffffffa0234aed>] ? xfs_alloc_ag_vextent_size+0x33d/0x650 [xfs]
Jan 7 20:58:47 einstein4 kernel: [14029681.126035]
[<ffffffffa023594d>] ? xfs_alloc_ag_vextent+0x9d/0x100 [xfs]
Jan 7 20:58:47 einstein4 kernel: [14029681.126118]
[<ffffffffa02366c5>] ? xfs_alloc_vextent+0x485/0x610 [xfs]
Jan 7 20:58:47 einstein4 kernel: [14029681.126202]
[<ffffffffa0245ff4>] ? xfs_bmap_btalloc+0x2d4/0x790 [xfs]
Jan 7 20:58:47 einstein4 kernel: [14029681.126286]
[<ffffffffa0246ab1>] ? __xfs_bmapi_allocate+0xd1/0x2e0 [xfs]
Jan 7 20:58:47 einstein4 kernel: [14029681.126366]
[<ffffffffa0216375>] ? xfs_bmapi_allocate_worker+0x35/0x70 [xfs]
Jan 7 20:58:47 einstein4 kernel: [14029681.126456]
[<ffffffff8107d5db>] ? process_one_work+0x16b/0x400
Jan 7 20:58:47 einstein4 kernel: [14029681.126529]
[<ffffffff8107e9c6>] ? worker_thread+0x116/0x380
Jan 7 20:58:47 einstein4 kernel: [14029681.126601]
[<ffffffff8107e8b0>] ? manage_workers.isra.21+0x290/0x290
Jan 7 20:58:47 einstein4 kernel: [14029681.126675]
[<ffffffff81084bcc>] ? kthread+0xbc/0xe0
Jan 7 20:58:47 einstein4 kernel: [14029681.126745]
[<ffffffff81084b10>] ? flush_kthread_worker+0x80/0x80
Jan 7 20:58:47 einstein4 kernel: [14029681.126819]
[<ffffffff814f724c>] ? ret_from_fork+0x7c/0xb0
Jan 7 20:58:47 einstein4 kernel: [14029681.126890]
[<ffffffff81084b10>] ? flush_kthread_worker+0x80/0x80
Jan 7 20:58:47 einstein4 kernel: [14029681.126973] XFS (sdb1): page
discard on page ffffea000b973650, inode 0x8e56233bd, offset 172167168.
[2] multiple times, until we had enough RAM installed in the machine
(24 GB + swap space finally made the difference)
[3] e.g.
[170369.029932] ------------[ cut here ]------------
[170369.029940] WARNING: CPU: 4 PID: 65 at fs/xfs/xfs_aops.c:1184
xfs_vm_releasepage+0xb5/0xc0 [xfs]()
[170369.029942] Modules linked in: dm_mod nfsv3 tun xt_nat autofs4
ipt_REJECT ipt_MASQUERADE xt_state iptable_nat nf_nat_ipv4 nf_nat
nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack iptable_filter ip_tables
xt_tcpudp x_tables nfsd auth_rpcgss oid_registry nfs_acl nfs lockd
fscache sunrpc ext4 crc16 mbcache jbd2 ipmi_poweroff ipmi_watchdog
ipmi_devintf ipmi_si ipmi_msghandler iTCO_wdt iTCO_vendor_support ttm
drm_kms_helper drm coretemp i5000_edac edac_core lpc_ich i5k_amb
mfd_core ioatdma sr_mod cdrom i2c_algo_bit i2c_i801 rng_core i2c_core
joydev dca psmouse ehci_pci snd_pcm snd_timer snd soundcore pcspkr
serio_raw shpchp processor parport_pc parport evdev kvm thermal_sys
button xfs crc32c libcrc32c md_mod usb_storage hid_generic usbhid hid
sd_mod sg crc_t10dif crct10dif_common floppy ahci libahci libata
uhci_hcd e1000 ehci_hcd arcmsr usbcore usb_common scsi_mod e1000e ptp
pps_core
[170369.029986] CPU: 4 PID: 65 Comm: kswapd0 Tainted: G W
3.14.24-atlas #1
[170369.029986] Hardware name: Supermicro X7DVL/X7DVL, BIOS 6.00 11/02/2007
[170369.029987] 0000000000000009 ffff88060d461a18 ffffffff81529af8
0000000000000000
[170369.029989] 0000000000000000 ffff88060d461a58 ffffffff8106a37c
ffff880610389d40
[170369.029991] ffff88060d461b28 ffffea000f39f8a0 ffff8805f722dd98
ffffea000f39f8c0
[170369.029994] Call Trace:
[170369.029994] [<ffffffff81529af8>] dump_stack+0x46/0x58
[170369.029996] [<ffffffff8106a37c>] warn_slowpath_common+0x8c/0xc0
[170369.029998] [<ffffffff8106a3ca>] warn_slowpath_null+0x1a/0x20
[170369.030000] [<ffffffffa0212225>] xfs_vm_releasepage+0xb5/0xc0 [xfs]
[170369.030009] [<ffffffff81182a10>] ? anon_vma_prepare+0x140/0x140
[170369.030011] [<ffffffff81149902>] try_to_release_page+0x32/0x50
[170369.030014] [<ffffffff8115ed7a>] shrink_active_list+0x3aa/0x3b0
[170369.030016] [<ffffffff8115f154>] shrink_lruvec+0x3d4/0x6a0
[170369.030017] [<ffffffff811aa001>] ?
__mem_cgroup_largest_soft_limit_node+0x21/0xd0
[170369.030019] [<ffffffff811aa001>] ?
__mem_cgroup_largest_soft_limit_node+0x21/0xd0
[170369.030021] [<ffffffff8115f486>] shrink_zone+0x66/0x1a0
[170369.030023] [<ffffffff8116065c>] balance_pgdat+0x3ac/0x600
[170369.030025] [<ffffffff81160a34>] kswapd+0x184/0x470
[170369.030027] [<ffffffff810ae460>] ? finish_wait+0x80/0x80
[170369.030028] [<ffffffff811608b0>] ? balance_pgdat+0x600/0x600
[170369.030030] [<ffffffff8108da79>] kthread+0xc9/0xe0
[170369.030032] [<ffffffff8108d9b0>] ? flush_kthread_worker+0x80/0x80
[170369.030034] [<ffffffff81538cfc>] ret_from_fork+0x7c/0xb0
[170369.030035] [<ffffffff8108d9b0>] ? flush_kthread_worker+0x80/0x80
[170369.030038] ---[ end trace 9f4d705dbe4e039c ]---
--
Dr. Carsten Aulbert - Max Planck Institute for Gravitational Physics
Callinstrasse 38, 30167 Hannover, Germany
phone/fax: +49 511 762-17185 / -17193
https://wiki.atlas.aei.uni-hannover.de/foswiki/bin/view/ATLAS/WebHome