very slow file deletion on an SSD
Joe Landman
joe.landman at gmail.com
Sat May 26 18:25:55 CDT 2012
On 05/26/2012 07:18 PM, Dave Chinner wrote:
> On Fri, May 25, 2012 at 06:37:05AM -0400, Joe Landman wrote:
>> Hi folks:
>>
>> Just ran into this (see posted output at bottom). 3.2.14 kernel,
>> MD RAID 5, xfs file system. Not sure (precisely) where the problem
>> is, hence posting to both lists.
>>
>> [root@siFlash ~]# cat /proc/mdstat
>> Personalities : [raid1] [raid6] [raid5] [raid4]
>> md22 : active raid5 sdl[0] sds[7] sdx[6] sdu[5] sdk[4] sdz[3] sdw[2] sdr[1]
>>       1641009216 blocks super 1.2 level 5, 32k chunk, algorithm 2
>>       [8/8] [UUUUUUUU]
>>
>> md20 : active raid5 sdh[0] sdf[7] sdm[6] sdd[5] sdc[4] sde[3] sdi[2] sdg[1]
>>       1641009216 blocks super 1.2 level 5, 32k chunk, algorithm 2
>>       [8/8] [UUUUUUUU]
>>
>> md21 : active raid5 sdy[0] sdq[7] sdp[6] sdo[5] sdn[4] sdj[3] sdv[2] sdt[1]
>>       1641009216 blocks super 1.2 level 5, 32k chunk, algorithm 2
>>       [8/8] [UUUUUUUU]
>>
>> md0 : active raid1 sdb1[1] sda1[0]
>>       93775800 blocks super 1.0 [2/2] [UU]
>>       bitmap: 1/1 pages [4KB], 65536KB chunk
>>
>>
>> md2* are SSD RAID5 arrays we are experimenting with. Xfs file
>> systems atop them:
>>
>> [root@siFlash ~]# mount | grep md2
>> /dev/md20 on /data/1 type xfs (rw)
>> /dev/md21 on /data/2 type xfs (rw)
>> /dev/md22 on /data/3 type xfs (rw)
>>
>> vanilla mount options (following Dave Chinner's long standing advice)
>>
>> meta-data=/dev/md20        isize=2048   agcount=32, agsize=12820392 blks
>>          =                 sectsz=512   attr=2
>> data     =                 bsize=4096   blocks=410252304, imaxpct=5
>>          =                 sunit=8      swidth=56 blks
>> naming   =version 2        bsize=65536  ascii-ci=0
>> log      =internal         bsize=4096   blocks=30720, version=2
>>          =                 sectsz=512   sunit=8 blks, lazy-count=1
>> realtime =none             extsz=4096   blocks=0, rtextents=0
>
> But you haven't followed my advice when it comes to using default
> mkfs options, have you? You're running 2k inodes and 64k directory
> block size, which is not exactly a common config.
We were experimenting. Easy to set it back and demonstrate the problem
again.
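
For reference, the stripe geometry reported by xfs_info does line up with the md layout, so a mismatched sunit/swidth is unlikely to be the cause. A minimal sketch of that arithmetic, using only numbers copied from the /proc/mdstat and xfs_info output quoted above (the variable names are just for illustration):

```python
# Values copied from the /proc/mdstat and xfs_info output quoted above.
md_chunk_kib = 32          # "32k chunk"
md_devices = 8             # "[8/8]"
md_size_kib = 1641009216   # /proc/mdstat reports sizes in 1 KiB blocks
fs_block_bytes = 4096      # data bsize
fs_blocks = 410252304      # data blocks
sunit_blocks = 8           # stripe unit, in filesystem blocks
swidth_blocks = 56         # stripe width, in filesystem blocks

# RAID5 spends one device's worth of each stripe on parity.
data_disks = md_devices - 1

sunit_kib = sunit_blocks * fs_block_bytes // 1024
swidth_kib = swidth_blocks * fs_block_bytes // 1024
fs_size_kib = fs_blocks * fs_block_bytes // 1024

print("sunit matches md chunk:", sunit_kib == md_chunk_kib)
print("swidth matches chunk * data disks:", swidth_kib == md_chunk_kib * data_disks)
print("fs size matches md device size:", fs_size_kib == md_size_kib)
```

All three comparisons hold for md20: the stripe unit is 32 KiB, the stripe width is 7 chunks, and the file system fills the array.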
>
> The question is, why do you have these options configured, and are
> they responsible for things being slow?
>
We saw it before we experimented with some mkfs options. Will rebuild
FS and demo it again.
>> All this said, deletes from this unit are taking 1-2 seconds per file ...
>
> Sounds like you might be hitting the synchronous xattr removal
> problem that was recently fixed (as has been mentioned already), but
> even so 2 IOs don't take 1-2s to do, unless the MD RAID5 barrier
> implementation is really that bad. If you mount -o nobarrier, what
> happens?
[root@siFlash test]# ls -alF | wc -l
59
[root@siFlash test]# /usr/bin/time rm -f *
^C0.00user 8.46system 0:09.55elapsed 88%CPU (0avgtext+0avgdata 2384maxresident)k
25352inputs+0outputs (0major+179minor)pagefaults 0swaps
[root@siFlash test]# ls -alF | wc -l
48
Nope, still an issue:
1338074901.531554 ioctl(0, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig icanon echo ...}) = 0 <0.000021>
1338074901.531701 newfstatat(AT_FDCWD, "1.r.12.0", {st_mode=S_IFREG|0600, st_size=1073741824, ...}, AT_SYMLINK_NOFOLLOW) = 0 <0.000022>
1338074901.531840 unlinkat(AT_FDCWD, "1.r.12.0", 0) = 0 <2.586999>
1338074904.119032 newfstatat(AT_FDCWD, "1.r.13.0", {st_mode=S_IFREG|0600, st_size=1073741824, ...}, AT_SYMLINK_NOFOLLOW) = 0 <0.000033>
2.6 seconds for an unlink.
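
To pick the slow calls out of a longer trace, something like the following could be used. This is only a sketch: it assumes the trace has the shape shown above (epoch timestamp first, elapsed time in angle brackets last, as produced by strace with -ttt and -T), with each call on a single unwrapped line. The function name and the 0.1 s threshold are arbitrary choices for illustration.

```python
import re

# One complete line of trace output in the shape shown above:
#   <epoch timestamp> <syscall>(<args>) = <result> <<elapsed seconds>>
LINE_RE = re.compile(
    r'^(\d+\.\d+)\s+(\w+)\((.*)\)\s+=\s+(\S+).*<(\d+\.\d+)>\s*$'
)

def slow_calls(lines, threshold=0.1):
    """Return (syscall, args, seconds) for calls at or above threshold."""
    found = []
    for line in lines:
        m = LINE_RE.match(line)
        if m is None:
            continue  # wrapped or partial lines are skipped
        _, name, args, _, elapsed = m.groups()
        seconds = float(elapsed)
        if seconds >= threshold:
            found.append((name, args, seconds))
    return found

# The unlinkat line from the trace above.
sample = [
    '1338074901.531840 unlinkat(AT_FDCWD, "1.r.12.0", 0) = 0 <2.586999>',
]

for name, args, seconds in slow_calls(sample):
    print(f"{name}({args}) took {seconds:.3f}s")
```

Run against the full trace, this should flag only the unlinkat calls; the ioctl and newfstatat calls around them complete in tens of microseconds.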
Rebuilding absolutely vanilla file system now, and will rerun checks.
>
> Cheers,
>
> Dave.
--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics Inc.
email: landman at scalableinformatics.com
web : http://scalableinformatics.com
http://scalableinformatics.com/sicluster
phone: +1 734 786 8423 x121
fax : +1 866 888 3112
cell : +1 734 612 4615