mkfs.xfs fails with raid5 and smaller chunk sizes
Brian Hemme
bmh at rincon.com
Tue Sep 16 17:47:43 CDT 2014
On 09/16/2014 03:17 PM, Dave Chinner wrote:
> On Tue, Sep 16, 2014 at 03:03:08PM -0700, Brian Hemme wrote:
>> Hello all,
>>
>> I am having some odd problems with mkfs.xfs when used on a raid 5
>> array. The array is built from 6 960GB SSDs all connected to SATA
>> ports on the MB and created with mdadm. If I use a chunk size any
>> smaller than 512K mkfs.xfs just hangs forever. It continues to use
>> CPU and so does the raid array but never completes. If the system
>> is just left running for an extended length of time the whole OS
>> eventually locks up. I have tried this on three different systems
>> with the same results. I have searched all over for someone with
>> similar issues without success. I am hoping I am just doing
>> something clearly wrong and you all can set me straight quickly.
>>
>> Some specifics:
>> Arch linux with 3.14.1 kernel
>> mkfs.xfs version 3.1.11
>> mdadm - v3.3 - 3rd September 2013
>>
>> Commands:
>>   mdadm --create /dev/md0 --chunk=64K --level=5 --raid-devices=6 /dev/sd[a-f]
>>   mkfs.xfs /dev/md0
>> ** This command fails and locks up
>>
>> I have tried specifying the arguments to mkfs.xfs with the same
>> results. Building a 4 drive array seems to require a chunk size of
>> 1M or greater to work. Same results if I make a partition on the
>> array and make the fs there.
> mkfs.xfs really should only take a couple of seconds to complete.
> Seeing as you are using SSDs, my first suspicion is that md or the
> SSDs are having problems with discard. Hence you should first
> try 'mkfs.xfs -K /dev/md0' and see if that completes quickly.
>
> Otherwise, the dmesg output after 'echo w > /proc/sysrq-trigger' would
> be a good start, as would a 'perf top -G -U' snapshot (run for 30s, at
> least a minute after mkfs.xfs starts) to tell us what is burning
> CPU.
>
> Cheers,
>
> Dave.
Thanks for the quick response!
Adding -K did the trick. For my education, though, why is it needed in
this case? mkfs.xfs works without it for larger chunk sizes, or for raid 0
instead of raid 5. It also worked on our old install with a 3.1.6 kernel.
And why would omitting -K cause enough of a problem that the whole machine
hangs? I am just trying to understand this well enough to avoid problems
down the road.
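
For anyone finding this thread later, here is a minimal sketch of how the
discard theory could be checked before rerunning mkfs. It assumes the array
is /dev/md0 as above and that the kernel exposes the usual
discard_granularity and discard_max_bytes files under /sys/block/<dev>/queue.
The function name and the SYSFS_ROOT override are made up for illustration
and are not part of any tool.

```shell
#!/bin/sh
# Sketch: print the discard limits the kernel advertises for a block device.
# SYSFS_ROOT is a hypothetical override so the function can be pointed at a
# scratch directory; on a real system it defaults to /sys/block.
show_discard_limits() {
    dev="$1"
    root="${SYSFS_ROOT:-/sys/block}"
    for attr in discard_granularity discard_max_bytes; do
        file="$root/$dev/queue/$attr"
        if [ -r "$file" ]; then
            printf '%s %s=%s\n' "$dev" "$attr" "$(cat "$file")"
        else
            printf '%s %s=unavailable\n' "$dev" "$attr"
        fi
    done
}

# Typical use (not executed here, because mkfs destroys data on the device):
#   show_discard_limits md0
#   mkfs.xfs -K /dev/md0    # -K: do not attempt to discard blocks at mkfs time
```

Comparing the values reported for a 64K-chunk array against a 512K-chunk
array might show whether the discard geometry differs between the case that
hangs and the case that works, but that is a guess on my part and not
something confirmed in this thread.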
Thanks again,
Brian