I am looking for more details on the “-n size=65536” option in mkfs.xfs, specifically the memory allocation behavior this option leads to. The system is Red Hat EL 7.0 (3.10.0-229.1.2.el7.x86_64).
We have been getting the memory allocation deadlock message below in the /var/log/messages file. The file system is used for a Ceph OSD and holds about 531,894 files.
Oct 6 07:11:09 abc-ceph1-xyz kernel: XFS: possible memory allocation deadlock in kmem_alloc (mode:0x8250)
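For reference, this is roughly how such an OSD file system is created and how the 64k directory block size can be confirmed afterwards; the device path and mount point below are placeholders, not our exact layout:

    # create the file system with 64k directory (naming) blocks
    mkfs.xfs -n size=65536 /dev/sdb1

    # confirm the directory block size on the mounted OSD;
    # the "naming" line should report bsize=65536
    xfs_info /var/lib/ceph/osd/ceph-0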
The Red Hat article https://access.redhat.com/solutions/532663 discusses the “memory allocation deadlock” message and suggests the root cause is a very fragmented file. Does setting the “-n size=65536” option on the file system cause (directly or indirectly) the “memory allocation deadlock” error?
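In case it is relevant, a quick way to look for the fragmented-file condition that article describes would be something like the following; the device and file paths are examples only:

    # overall fragmentation factor for the file system (read-only query)
    xfs_db -r -c "frag" /dev/sdb1

    # extent map of a suspect file; a very large number of extents
    # would point to the fragmented-file cause described in the article
    xfs_bmap -v /var/lib/ceph/osd/ceph-0/current/some_object_file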
Here is a note on “-n size=65536” from the XFS FAQ; it is more about CPU and IO performance than memory allocation.
http://xfs.org/index.php/XFS_FAQ
“
Q: Performance: mkfs.xfs -n size=64k option
Asking the implications of that mkfs option on the XFS mailing list,
Dave Chinner explained it this way:
Inodes are not stored in the directory structure, only the directory entry name
and the inode number. Hence the amount of space used by a directory entry is
determined by the length of the name.
There is extra overhead to allocate large directory blocks (16 pages instead of
one, to begin with, then there's the vmap overhead, etc), so for small
directories smaller block sizes are faster for create and unlink operations.
For empty directories, operations on 4k block sized directories consume roughly
50% less CPU than 64k block size directories. The 4k block size directories
consume less CPU out to roughly 1.5 million entries where the two are roughly
equal. At directory sizes of 10 million entries, 64k directory block operations
are consuming about 15% of the CPU that 4k directory block operations consume.
In terms of lookups, the 64k block directory will take less IO but consume more
CPU for a given lookup. Hence it depends on your IO latency and whether
directory readahead can hide that latency as to which will be faster. e.g. For
SSDs, CPU usage might be the limiting factor, not the IO. Right now I don't
have any numbers on what the difference might be - I'm getting 1 billion inode
population issues worked out first before I start on measuring cold cache
lookup times on 1 billion files....
“
Thanks,
Al Lau