| To: | "Dave Chinner" <david@xxxxxxxxxxxxx> |
|---|---|
| Subject: | Re: Re: XFS direct IO problem |
| From: | "YeYin" <eyniy@xxxxxx> |
| Date: | Thu, 9 Apr 2015 11:48:10 +0800 |
| Cc: | "xfs" <xfs@xxxxxxxxxxx> |
| Delivered-to: | xfs@xxxxxxxxxxx |
| In-reply-to: | <tencent_4C7213C73B62CD3477B4AC31@xxxxxx> |
| References: | <tencent_316A3DE769544D99766FE3F1@xxxxxx> <20150408044955.GE15810@dastard> <tencent_60C0CC90244648E22E374DF9@xxxxxx> <20150408211436.GF15810@dastard> <tencent_4C7213C73B62CD3477B4AC31@xxxxxx> |
|
I have reported this problem to MySQL. See here:
http://bugs.mysql.com/bug.php?id=76627

Thanks,
Ye

------------------ Original Message ------------------
From: "YeYin" <eyniy@xxxxxx>
Sent: Thursday, 9 April 2015, 10:37 AM
To: "Dave Chinner" <david@xxxxxxxxxxxxx>
Cc: "xfs" <xfs@xxxxxxxxxxx>
Subject: Re: Re: XFS direct IO problem

I traced MySQL:

    [pid 13478] open("./test/big_tb.ibd", O_RDONLY) = 37
    [pid 13478] pread(37, "W\346\203@\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\v\37c\225\263\0\10\0\0\0\0\0\0"..., 16384, 0) = 16384
    [pid 13478] close(37) = 0
    [pid 13478] open("./test/big_tb.ibd", O_RDWR) = 37
    [pid 13478] fcntl(37, F_SETFL, O_RDONLY|O_DIRECT) = 0
    [pid 13478] fcntl(37, F_SETLK, {type=F_WRLCK, whence=SEEK_SET, start=0, len=0}) = 0
    [pid 13478] pread(37, "\350\301\270\271\0\0\0\3\377\377\377\377\377\377\377\377\0\0\0\v\37c\225\263E\277\0\0\0\0\0\0"..., 16384, 49152) = 16384
    [pid 13478] pread(37, "e\251|m\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0\0\0\31\245\243\0\5\0\0\0\0\0\0"..., 16384, 16384) = 16384

As we can see, MySQL opens the data file twice when it opens a table. The first open, without the O_DIRECT flag, populates the page cache.

I traced the kernel:

    Tracing "sys_pread64"... Ctrl-C to end.
     3)               |  sys_pread64() {
     3)   0.362 us    |    fget_light();
     3)               |    vfs_read() {
     3)               |      rw_verify_area() {
     3)               |        security_file_permission() {
     3)   0.251 us    |          cap_file_permission();
     3)   0.817 us    |        }
     3)   1.377 us    |      }
     3)               |      do_sync_read() {
     3)               |        xfs_file_aio_read() {
     3)   0.259 us    |          generic_segment_checks();
     3)               |          xfs_rw_ilock() {
     3)               |            xfs_ilock() {
     3)               |              down_read() {
     3)   0.233 us    |                _cond_resched();
     3)   0.713 us    |              }
     3)   1.433 us    |            }
     3)   2.097 us    |          }
     3)               |          generic_file_aio_read() {
     3)   0.229 us    |            generic_segment_checks();
     3)   0.227 us    |            _cond_resched();
     3)   0.261 us    |            find_get_page();
     3)               |            page_cache_sync_readahead() {
     3)               |              ondemand_readahead() {
    ...

I am running MySQL 5.5.24 on CentOS 6.5 with kernel 2.6.32-431. I will test with a newer kernel later.
Thanks,
Ye

------------------ Original Message ------------------
From: "Dave Chinner" <david@xxxxxxxxxxxxx>
Sent: Thursday, 9 April 2015, 5:14 AM
To: "YeYin" <eyniy@xxxxxx>
Cc: "xfs" <xfs@xxxxxxxxxxx>
Subject: Re: Re: XFS direct IO problem

> Dave,
> Thank you for your explanation. I understand the reason, and I wrote some code to simulate MySQL. It reproduces the process:
>
>
> open file without direct flag
> read file //cause kernel readahead 4 pages, and inode->i_mapping->nrpages > 0
> close file
>
>
> open file with direct flag
> lseek 4*4096 // skip 4 readahead pages
> read file //cause xfs_flushinval_pages to do nothing
> ...
>

Yes, you can cause it that way, but any application mixing buffered IO and direct IO like that is broken. I'll point you at the open(2) man page, in the section about O_DIRECT:

    "Applications should avoid mixing O_DIRECT and normal I/O to
    the same file, and especially to overlapping byte regions in
    the same file. Even when the filesystem correctly handles the
    coherency issues in this situation, overall I/O throughput is
    likely to be slower than using either mode alone. Likewise,
    applications should avoid mixing mmap(2) of files with direct
    I/O to the same files."

IOWs, your test program is behaving as documented for a program that mixes buffered and direct IO....

AFAIK, MySQL does not do mixed buffer/direct IO like this and so this is extremely unlikely to be the source of the problem. I need to understand how MySQL is generating cached pages on its database files when it is supposed to be using direct IO, and the reproducer program needs to do what MySQL does to generate cached pages.

Can you please find the location of the cached pages (as I suggested via tracing in my last email) in the MySQL files that are causing the problem?

> I'd like to ask XFS how to resolve this problem?

Applications that need to mix buffered and direct IO can invalidate the cached pages by using POSIX_FADV_DONTNEED before doing direct IO.
FWIW, you must be looking at quite old kernel code if xfs_flushinval_pages() exists in your kernel. Does MySQL on a current upstream kernel have the same problem?

Cheers,

Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx |
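The invalidation Dave suggests, using POSIX_FADV_DONTNEED before doing direct IO, could look roughly like the following. This is a minimal Python sketch and not code from the thread: the helper name `drop_cached_pages` is made up for illustration, and the `fsync` call is an added assumption, since the advice leaves pages that have not yet been written out untouched. The call is only advisory, so the kernel may still keep some pages cached.

```python
import os


def drop_cached_pages(path: str) -> None:
    """Ask the kernel to drop cached pages for the whole file (hypothetical helper)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # Assumption: flush first, because DONTNEED does not discard dirty pages.
        os.fsync(fd)
        # offset=0 with len=0 covers the file from the start to its end.
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fd)
```

An application that mixes the two modes would call this after its last buffered read of the file and before reopening the file with O_DIRECT.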