|Subject:||concurrent direct IO write in xfs|
|From:||Zheng Da <zhengda1936@xxxxxxxxx>|
|Date:||Sun, 15 Jan 2012 19:01:42 -0500|
I was surprised to find that writing data to a file (not appending) with direct IO from multiple threads performs no better than writing from a single thread. In fact, it looks as if only one core is working at a time. In my case, each write is one page and the offset is always aligned to the page size, so the writes never overlap.
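For reference, the access pattern can be sketched roughly as below. This is only an illustration of the workload, not my actual test program: the function name `write_pages`, the `direct` switch, and the per-thread layout are made up for the example. Each thread gets its own disjoint, page-aligned range and writes it one page at a time with `pwrite()`.

```python
import mmap
import os
import threading

PAGE = mmap.PAGESIZE


def write_pages(path, pages_per_thread, n_threads, direct=True):
    """Write disjoint page-aligned ranges of one file from several threads.

    Returns the total number of bytes written. `direct` is a hypothetical
    switch so the sketch can also run on filesystems that reject O_DIRECT.
    """
    flags = os.O_WRONLY | os.O_CREAT
    if direct:
        flags |= os.O_DIRECT  # Linux-only; bypasses the page cache
    fd = os.open(path, flags, 0o644)
    written = [0] * n_threads
    try:
        # Set the final size up front so that no write extends EOF.
        os.ftruncate(fd, PAGE * pages_per_thread * n_threads)

        def worker(tid):
            # An anonymous mmap is page-aligned, which O_DIRECT requires.
            buf = mmap.mmap(-1, PAGE)
            buf.write(bytes([tid % 256]) * PAGE)
            base = tid * pages_per_thread
            for i in range(pages_per_thread):
                # pwrite() takes an explicit offset, so threads share one fd
                # without sharing a file position.
                written[tid] += os.pwrite(fd, buf, (base + i) * PAGE)
            buf.close()

        threads = [threading.Thread(target=worker, args=(t,))
                   for t in range(n_threads)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
    finally:
        os.close(fd)
    return sum(written)
```

Something like `write_pages("/mnt/xfs/testfile", 1024, 8)` (the path is a placeholder) issues 8 x 1024 one-page writes that never overlap. Note that `ftruncate()` only sets the size; it does not allocate blocks, so writes into the resulting holes still need block allocation.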
According to lockstat, the lock that causes the most waiting time is xfs_inode.i_lock.
&(&ip->i_lock)->mr_lock-W: 31568 36170 0.24 20048.25 7589157.99 130154 3146848 0.00 217.70 1238310.72
&(&ip->i_lock)->mr_lock-R: 11251 11886 0.24 20043.01 2895595.18 46671 526309 0.00 63.80 264097.96
&(&ip->i_lock)->mr_lock 36170 [<ffffffffa03be122>] xfs_ilock+0xb2/0x110 [xfs]
&(&ip->i_lock)->mr_lock 11886 [<ffffffffa03be15a>] xfs_ilock+0xea/0x110 [xfs]
&(&ip->i_lock)->mr_lock 38555 [<ffffffffa03be122>] xfs_ilock+0xb2/0x110 [xfs]
&(&ip->i_lock)->mr_lock 9501 [<ffffffffa03be15a>] xfs_ilock+0xea/0x110 [xfs]
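As a rough way to read the two summary lines, the snippet below computes the average wait per contention. It assumes the ten numeric columns follow the usual /proc/lock_stat layout (con-bounces, contentions, waittime-min/max/total, acq-bounces, acquisitions, holdtime-min/max/total, with times in microseconds); the header row is not shown above, so that mapping is an assumption, and the helper names are made up for the example.

```python
# Assumed column order of a /proc/lock_stat summary line (not shown in the post).
FIELDS = ("con_bounces", "contentions", "wait_min", "wait_max", "wait_total",
          "acq_bounces", "acquisitions", "hold_min", "hold_max", "hold_total")


def parse_lock_stat(line):
    """Split 'name: v1 v2 ...' into (name, {field: value})."""
    name, _, rest = line.rpartition(":")
    values = [float(v) for v in rest.split()]
    if len(values) != len(FIELDS):
        raise ValueError("expected %d columns, got %d" % (len(FIELDS), len(values)))
    return name.strip(), dict(zip(FIELDS, values))


def avg_wait_us(stats):
    """Average wait time per contention, in the same unit as wait_total."""
    return stats["wait_total"] / stats["contentions"]


LINES = [
    "&(&ip->i_lock)->mr_lock-W: 31568 36170 0.24 20048.25 7589157.99 "
    "130154 3146848 0.00 217.70 1238310.72",
    "&(&ip->i_lock)->mr_lock-R: 11251 11886 0.24 20043.01 2895595.18 "
    "46671 526309 0.00 63.80 264097.96",
]

for line in LINES:
    name, stats = parse_lock_stat(line)
    print(name, "avg wait per contention:", round(avg_wait_us(stats), 2))
```

Under that reading, the exclusive (-W) side accumulates far more total wait time than the shared (-R) side, which is why I looked at where the lock is taken exclusively.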
SystemTap shows that xfs_inode.i_lock is taken exclusively in the following functions.
0xffffffff81289235 : xfs_file_aio_write_checks+0x45/0x1d0 [kernel]
0xffffffff812829f4 : __xfs_get_blocks+0x94/0x4a0 [kernel]
0xffffffff81288b6a : xfs_aio_write_newsize_update+0x3a/0x90 [kernel]
0xffffffff8129590a : xfs_log_dirty_inode+0x7a/0xe0 [kernel]
xfs_log_dirty_inode is invoked only 3 times when I write 4G of data to the file, so it can be ignored. I'm not sure which of the other three is the major cause of the poor write performance, or whether they are the cause at all. None of them seems to be one of the main operations in a direct IO write, though.
It seems to me that the lock might not be necessary in my case, and it would be nice if I could disable it. Otherwise, is there any suggestion for achieving better write performance with multiple threads in XFS?
I tried ext4 and it doesn't perform better than XFS. Does this problem exist in all filesystems?