Hi,

About two months ago I asked about an XFS problem on this list, see http://oss.sgi.com/archives/xfs/2015-02/msg00197.html.
Since then I have switched MySQL to direct IO (innodb_flush_method=O_DIRECT), see https://dev.mysql.com/doc/refman/5.5/en/innodb-parameters.html#sysvar_innodb_flush_method.
However, MySQL performance is still poor at times. I traced the kernel with perf-tools (https://github.com/brendangregg/perf-tools) and found the following:
# ./funccount -i 1 "xfs_f*"
Tracing "xfs_f*"... Ctrl-C to end.

FUNC                              COUNT
xfs_file_aio_read                 15591
xfs_flushinval_pages              15591
xfs_find_bdev_for_inode           31182
As you can see, every xfs_file_aio_read call also calls xfs_flushinval_pages. Note that I am using direct IO!
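For context, here is an abridged sketch of how I read the direct IO read path in this tree (paraphrased from fs/xfs/linux-2.6/xfs_file.c, not verbatim, so the exact code may differ):

	/* Abridged sketch of xfs_file_aio_read(), paraphrased from the
	 * 2.6.32 tree; not verbatim. The point: with direct IO, any
	 * cached pages at all (nrpages > 0) trigger a flush/invalidate
	 * under the exclusive iolock before every read. */
	if (unlikely(ioflags & IO_ISDIRECT)) {
		xfs_rw_ilock(ip, XFS_IOLOCK_EXCL);
		if (inode->i_mapping->nrpages) {
			ret = -xfs_flushinval_pages(ip,
					(iocb->ki_pos & PAGE_CACHE_MASK),
					-1, FI_REMAPF_LOCKED);
			if (ret) {
				xfs_rw_iunlock(ip, XFS_IOLOCK_EXCL);
				return ret;
			}
		}
		xfs_rw_ilock_demote(ip, XFS_IOLOCK_EXCL);
	} else {
		xfs_rw_ilock(ip, XFS_IOLOCK_SHARED);
	}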
xfs_flushinval_pages in turn calls truncate_inode_pages_range, see https://bitbucket.org/hustcat/kernel-2.6.32/src/0e5d90ed6f3ef8a3b5fe62a04cc6766a721c70f8/fs/xfs/linux-2.6/xfs_fs_subr.c?at=master#cl-56.
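Abridged from the linked xfs_fs_subr.c (paraphrased from memory, so please check the source): nothing is truncated unless the mapping still holds pages, and the whole range from first onward is handed to the generic truncate:

	void
	xfs_flushinval_pages(
		xfs_inode_t	*ip,
		xfs_off_t	first,
		xfs_off_t	last,
		int		fiopt)
	{
		struct address_space *mapping = VFS_I(ip)->i_mapping;

		if (mapping->nrpages) {
			xfs_iflags_clear(ip, XFS_ITRUNCATED);
			/* hand the whole [first, last] range to the
			 * generic page cache truncate */
			truncate_inode_pages_range(mapping, first,
					last == -1 ? LLONG_MAX : last);
		}
	}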
Indeed:

# ./funccount -i 1 "truncate_inode_page*"
Tracing "truncate_inode_page*"... Ctrl-C to end.

FUNC                              COUNT
truncate_inode_page                   4
truncate_inode_pages                176
truncate_inode_pages_range        15474

FUNC                              COUNT
truncate_inode_page                   1
truncate_inode_pages                  5
truncate_inode_pages_range        15566
As you can see, truncate_inode_pages_range is called about as many times as xfs_flushinval_pages. However, truncate_inode_pages_range never calls truncate_inode_page:
# ./funcgraph truncate_inode_pages_range
Tracing "truncate_inode_pages_range"... Ctrl-C to end.
 2)   1.020 us    |  finish_task_switch();
 2)               |  truncate_inode_pages_range() {
 2)               |    pagevec_lookup() {
 2)   0.413 us    |      find_get_pages();
 2)   1.033 us    |    }
 2)   0.238 us    |    _cond_resched();
 2)               |    pagevec_lookup() {
 2)   0.234 us    |      find_get_pages();
 2)   0.690 us    |    }
 2)   3.362 us    |  }
 2)               |  truncate_inode_pages_range() {
 2)               |    pagevec_lookup() {
 2)   0.266 us    |      find_get_pages();
 2)   0.745 us    |    }
 2)   0.238 us    |    _cond_resched();
 2)               |    pagevec_lookup() {
 2)   0.248 us    |      find_get_pages();
 2)   0.701 us    |    }
 2)   2.844 us    |  }
 2)               |  truncate_inode_pages_range() {
 2)               |    pagevec_lookup() {
 2)   0.262 us    |      find_get_pages();
 2)   0.740 us    |    }
 2)   0.238 us    |    _cond_resched();
 2)               |    pagevec_lookup() {
 2)   0.251 us    |      find_get_pages();
 2)   0.705 us    |    }
 2)   2.767 us    |  }
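So the trace shows pagevec_lookup() finding pages, but truncate_inode_page() never firing. My reading of the scan loop in mm/truncate.c (again a paraphrased sketch, not verbatim) is that every page it finds can be skipped without being truncated:

	/* Paraphrased sketch of the first pass in
	 * truncate_inode_pages_range(), mm/truncate.c (2.6.32-ish).
	 * A page is skipped if it lies beyond the requested range, if
	 * its lock cannot be taken, or if it is under writeback, so
	 * the loop can complete without ever calling
	 * truncate_inode_page(), and mapping->nrpages stays non-zero. */
	while (next <= end &&
	       pagevec_lookup(&pvec, mapping, next, PAGEVEC_SIZE)) {
		for (i = 0; i < pagevec_count(&pvec); i++) {
			struct page *page = pvec.pages[i];

			if (page->index > end)		/* past the range */
				break;
			if (!trylock_page(page))	/* lock contended */
				continue;
			if (PageWriteback(page)) {	/* IO in flight */
				unlock_page(page);
				continue;
			}
			truncate_inode_page(mapping, page);
			unlock_page(page);
		}
		pagevec_release(&pvec);
		cond_resched();
	}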
This causes inode->i_mapping->nrpages to stay above 0 forever, so xfs_file_aio_read/xfs_file_dio_aio_write always call xfs_flushinval_pages. Even worse, xfs_file_dio_aio_write then upgrades to the exclusive iolock:
	/* cached pages force the shared iolock to be upgraded to exclusive */
	if (mapping->nrpages && iolock == XFS_IOLOCK_SHARED) {
		xfs_rw_iunlock(ip, iolock);
		iolock = XFS_IOLOCK_EXCL;
		xfs_rw_ilock(ip, iolock);
	}
See https://bitbucket.org/hustcat/kernel-2.6.32/src/0e5d90ed6f3ef8a3b5fe62a04cc6766a721c70f8/fs/xfs/linux-2.6/xfs_file.c?at=master#cl-659.
This causes bad performance even with direct IO. What I still don't understand is why truncate_inode_page is never called.
Every time I run

# echo 1 > /proc/sys/vm/drop_caches

performance improves immediately.
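That flushes every cache on the box, though. A narrower workaround I have been thinking about (just a sketch, untested on this workload, and the file path is only a placeholder) is evicting the cached pages of the affected file with posix_fadvise():

	/* Sketch: drop one file's cached pages instead of every cache
	 * on the system. Untested here; the path is an example only. */
	#include <fcntl.h>
	#include <stdio.h>
	#include <string.h>
	#include <unistd.h>

	int main(void)
	{
		const char *path = "/path/to/datafile";	/* placeholder */
		int fd = open(path, O_RDONLY);
		int err;

		if (fd < 0) {
			perror("open");
			return 1;
		}
		/* flush dirty pages first so DONTNEED can evict them */
		if (fsync(fd) < 0)
			perror("fsync");
		/* len == 0 means "to end of file" for posix_fadvise() */
		err = posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
		if (err)
			fprintf(stderr, "posix_fadvise: %s\n", strerror(err));
		close(fd);
		return 0;
	}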
Thanks,
Ye