<div dir="ltr"><div class="gmail_quote"><div dir="ltr">Hi Stan,<br><div class="gmail_extra"><br><br><div class="gmail_quote"><div class="im">On Thu, Dec 5, 2013 at 12:10 AM, Stan Hoeppner <span dir="ltr"><<a href="mailto:stan@hardwarefreak.com" target="_blank">stan@hardwarefreak.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">On 12/4/2013 8:55 PM, Mike Dacre wrote:<br>
...<br>
> I have a 16 2TB drive RAID6 array powered by an LSI 9240-4i. It has an XFS.<br>
<br>
It's a 9260-4i, not a 9240, a huge difference. I went digging through<br>
your dmesg output because I knew the 9240 doesn't support RAID6. A few<br>
questions. What is the LSI RAID configuration?<br></blockquote><div> </div></div><div>You are right, sorry. 9260-4i</div><div class="im"><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
> 1. Level -- confirm RAID6

Definitely RAID6.

> 2. Strip size? (eg 512KB)

64KB

> 3. Stripe size? (eg 7168KB, 14*256)

Not sure how to get this.
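Although, thinking about it, if the full stripe is just the strip size times the number of data drives, that would be 64KB * 14 = 896KB here (16 drives in RAID6, so 14 data plus 2 parity). Please correct me if that is wrong. I believe something like the following should also report the per-drive strip size that calculation starts from (assuming MegaCli64 is in its usual install location; adjust the path as needed):

# print the configured strip size for each logical drive
/opt/MegaRAID/MegaCli/MegaCli64 -LDInfo -Lall -aALL | grep -i 'strip'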
> 4. BBU module?

Yes. iBBU, state optimal, 97% charged.

> 5. Is write cache enabled?

Yes: Cached IO and Write Back with BBU are enabled.

I have also attached an adapter summary (megaraid_adp_info.txt) and a virtual and physical drive summary (megaraid_drive_info.txt).
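For anyone who wants to pull the same information themselves, I believe output equivalent to those attachments can be generated with commands along these lines (again assuming the default MegaCli64 path):

# adapter summary, logical/physical drive summary, and BBU status
/opt/MegaRAID/MegaCli/MegaCli64 -AdpAllInfo -aALL > megaraid_adp_info.txt
/opt/MegaRAID/MegaCli/MegaCli64 -LDPDInfo -aALL > megaraid_drive_info.txt
/opt/MegaRAID/MegaCli/MegaCli64 -AdpBbuCmd -GetBbuStatus -aALL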
<div class="im">
<div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
What is the XFS geometry?<br>
<br>
5. xfs_info /dev/sda<br></blockquote></div><div><div><br></div><div>`xfs_info /dev/sda1`</div><div>meta-data =/dev/sda1 isize=256 agcount=26, agsize=268435455 blks</div><div> = sectsz=512 attr=2</div>
<div>data = bsize=4096 blocks=6835404288, imaxpct=5</div><div> = sunit=0 swidth=0 blks</div><div>naming =version 2 bsize=4096 ascii-ci=0</div>
<div>log =internal bsize=4096 blocks=521728, version=2</div><div> = sectsz=512 sunit=0 blks, lazy-count=1</div><div>realtime =none extsz=4096 blocks=0, rtextents=0</div>
</div><div><br></div><div>This is also attached as xfs_info.txt </div><div class="im"><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
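One thing I notice in that output: sunit=0 and swidth=0, so I gather the filesystem was never told about the RAID geometry. If I have the numbers right for a 64KB strip across 14 data drives, something like this should let me set the alignment at mount time (sunit and swidth are given in 512-byte sectors, so 64KB = 128 and 128 * 14 = 1792; the mount point is a placeholder, and please correct me if I have this wrong):

# remount with explicit stripe alignment (assumes 64KB strip, 14 data drives)
# /export/array is a placeholder for the real mount point
umount /export/array
mount -o sunit=128,swidth=1792 /dev/sda1 /export/array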
> A combination of these being wrong could very well be part of your
> problems.
>
> ...
> > IO errors when any requests were made. This happened while it was being
>
> I didn't see any IO errors in your dmesg output. None.

Good point. These happened while trying to ls. I am not sure why I can't find them in the log; they were printed to the console as 'Input/Output' errors, simply stating that the ls command had failed.
<div class="im">
<div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div>
> accessed by 5 different users, one was doing a very large rm operation (rm<br>
> *sh on thousands on files in a directory). Also, about 30 minutes before<br>
> we had connected the globus connect endpoint to allow easy file transfers<br>
> to SDSC.<br>
<br>
</div>With delaylog enabled, which I believe it is in RHEL/CentOS 6, a single<br>
big rm shouldn't kill the disks. But with the combination of other<br>
workloads it seems you may have been seeking the disks to death.<br></blockquote></div><div>That is possible, workloads can get really high sometimes. I am not sure how to control that without significantly impacting performance - I want a single user to be able to use 98% IO capacity sometimes... but other times I want the load to be split amongst many users. Also, each user can execute jobs simultaneously on 23 different computers, each acessing the same drive via NFS. This is a great system most of the time, but sometimes the workloads on the drive get really high. </div>
<div class="im">
<div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
...<br>
<div>> In the end, I successfully repaired the filesystem with `xfs_repair -L<br>
> /dev/sda1`. However, I am nervous that some files may have been corrupted.<br>
<br>
</div>I'm sure your users will let you know. I'd definitely have a look in<br>
the directory that was targeted by the big rm operation which apparently<br>
didn't finish when XFS shutdown.<br>
<div><br>
> Do any of you have any idea what could have caused this problem?<br>
<br>
</div>Yes. A few things. The first is this, and it's a big one:<br>
<br>
Dec 4 18:15:28 fruster kernel: io scheduler noop registered<br>
Dec 4 18:15:28 fruster kernel: io scheduler anticipatory registered<br>
Dec 4 18:15:28 fruster kernel: io scheduler deadline registered<br>
Dec 4 18:15:28 fruster kernel: io scheduler cfq registered (default)<br>
<br>
<a href="http://xfs.org/index.php/XFS_FAQ#Q:_I_want_to_tune_my_XFS_filesystems_for_.3Csomething.3E" target="_blank">http://xfs.org/index.php/XFS_FAQ#Q:_I_want_to_tune_my_XFS_filesystems_for_.3Csomething.3E</a><br>
<br>
"As of kernel 3.2.12, the default i/o scheduler, CFQ, will defeat much<br>
of the parallelization in XFS."<br>
<br>
*Never* use the CFQ elevator with XFS, and never with a high performance<br>
storage system. In fact, IMHO, never use CFQ period. It was horrible<br>
even before 3.2.12. It is certain that CFQ is playing a big part in<br>
your 120s timeouts, though it may not be solely responsible for your IO<br>
bottleneck. Switch to deadline or noop immediately, deadline if LSI<br>
write cache is disabled, noop if it is enabled. Execute this manually<br>
now, and add it to a startup script and verify it is being set at<br>
startup, as it's not permanent:<br>
<br>
echo deadline > /sys/block/sda/queue/scheduler<br>
<br></blockquote></div><div>Wow, this is huge, I can't believe I missed that. I have switched it to noop now as we use write caching. I have been trying to figure out for a while why I would keep getting timeouts when the NFS load was high. If you have any other suggestions for how I can improve performance, I would greatly appreciate it.</div>
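To make the change stick across reboots, my plan is to drop something like this into /etc/rc.local (assuming the array stays at /dev/sda) and check it after the next boot:

# set the noop elevator for the RAID volume at boot; LSI write cache is enabled
echo noop > /sys/block/sda/queue/scheduler

# quick sanity check that it took effect
cat /sys/block/sda/queue/scheduler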
<div class="im">
<div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
This one simple command line may help pretty dramatically, immediately,<br>
assuming your hardware array parameters aren't horribly wrong for your<br>
workloads, and your XFS alignment correctly matches the hardware geometry.<br>
<span><font color="#888888"><br></font></span></blockquote></div><div>Great, thanks. Our workloads vary considerably as we are a biology research lab, sometimes we do lots of seeks, other times we are almost maxing out read or write speed with massively parallel processes all accessing the disk at the same time.</div>
<div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><span><font color="#888888">
--<br>
Stan<br>
<br>
<br></font></span></blockquote><span class="HOEnZb"><font color="#888888"><div><br></div><div>-Mike </div></font></span></div><br></div></div>
</div><br></div>