<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body bgcolor="#ffffff" text="#000000">
<br>
<pre class="moz-signature" cols="72">Dave Hall
Binghamton University
<a class="moz-txt-link-abbreviated" href="mailto:kdhall@binghamton.edu">kdhall@binghamton.edu</a>
607-760-2328 (Cell)
607-777-4641 (Office)</pre>
<br>
On 03/14/2013 08:55 AM, Stan Hoeppner wrote:
<blockquote cite="mid:5141C8C1.2080903@hardwarefreak.com" type="cite">
<blockquote type="cite" style="color: rgb(0, 0, 0);">
<pre wrap="">Yes, please provide the output of the following commands:
</pre>
</blockquote>
<pre wrap="">~$ uname -a
</pre>
</blockquote>
<tt>Linux decoy 3.2.0-0.bpo.4-amd64 #1 SMP Debian 3.2.35-2~bpo60+1
x86_64 GNU/Linux</tt><br>
<blockquote cite="mid:5141C8C1.2080903@hardwarefreak.com" type="cite">
<blockquote type="cite" style="color: rgb(0, 0, 0);">
<pre wrap=""><span class="moz-txt-citetags">> </span>~$ grep xfs /etc/fstab
</pre>
</blockquote>
</blockquote>
<tt>LABEL=backup /infortrend xfs
inode64,noatime,nodiratime,nobarrier 0 0<br>
(cat /proc/mounts: /dev/sdb1 /infortrend xfs
rw,noatime,nodiratime,attr2,delaylog,nobarrier,inode64,noquota 0 0)</tt><br>
<br>
Note that there is also a second XFS filesystem on a separate 3ware RAID card,
but the I/O traffic on that one is fairly low. It is used as a staging
area for a Debian mirror that is hosted on another server.<br>
<blockquote cite="mid:5141C8C1.2080903@hardwarefreak.com" type="cite">
<blockquote type="cite" style="color: rgb(0, 0, 0);">
<pre wrap=""><span class="moz-txt-citetags">> </span>~$ xfs_info <i
class="moz-txt-slash"><span class="moz-txt-tag">/</span>dev<span
class="moz-txt-tag">/</span></i>[mount-point]
</pre>
</blockquote>
</blockquote>
<pre># xfs_info /dev/sdb1
meta-data=/dev/sdb1              isize=256    agcount=26, agsize=268435455 blks
         =                       sectsz=512   attr=2
data     =                       bsize=4096   blocks=6836364800, imaxpct=5
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal               bsize=4096   blocks=521728, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0</pre>
<blockquote cite="mid:5141C8C1.2080903@hardwarefreak.com" type="cite">
<blockquote type="cite" style="color: rgb(0, 0, 0);">
<pre wrap=""><span class="moz-txt-citetags">> </span>~$ df <i
class="moz-txt-slash"><span class="moz-txt-tag">/</span>dev<span
class="moz-txt-tag">/</span></i>[mount_point]
</pre>
</blockquote>
</blockquote>
<pre># df /dev/sdb1
Filesystem     1K-blocks        Used   Available Use% Mounted on
/dev/sdb1    27343372288 20432618356  6910753932  75% /infortrend</pre>
<blockquote cite="mid:5141C8C1.2080903@hardwarefreak.com" type="cite">
<blockquote type="cite" style="color: rgb(0, 0, 0);">
<pre wrap=""><span class="moz-txt-citetags">> </span>~$ df -i <i
class="moz-txt-slash"><span class="moz-txt-tag">/</span>dev<span
class="moz-txt-tag">/</span></i>[mount_point]
</pre>
</blockquote>
</blockquote>
<pre># df -i /dev/sdb1
Filesystem       Inodes      IUsed      IFree IUse% Mounted on
/dev/sdb1    5469091840 1367746380 4101345460   26% /infortrend</pre>
<blockquote cite="mid:5141C8C1.2080903@hardwarefreak.com" type="cite">
<blockquote type="cite" style="color: rgb(0, 0, 0);">
<pre wrap=""><span class="moz-txt-citetags">> </span>~$ xfs_db -r -c freesp <i
class="moz-txt-slash"><span class="moz-txt-tag">/</span>dev<span
class="moz-txt-tag">/</span></i>[mount-point]
</pre>
</blockquote>
</blockquote>
<pre># xfs_db -r -c freesp /dev/sdb1
   from      to extents     blocks    pct
      1       1  832735     832735   0.05
      2       3  432183    1037663   0.06
      4       7  365573    1903965   0.11
      8      15  352402    3891608   0.23
     16      31  332762    7460486   0.43
     32      63  300571   13597941   0.79
     64     127  233778   20900655   1.21
    128     255  152003   27448751   1.59
    256     511  112673   40941665   2.37
    512    1023   82262   59331126   3.43
   1024    2047   53238   76543454   4.43
   2048    4095   34092   97842752   5.66
   4096    8191   22743  129915842   7.52
   8192   16383   14453  162422155   9.40
  16384   32767    8501  190601554  11.03
  32768   65535    4695  210822119  12.20
  65536  131071    2615  234787546  13.59
 131072  262143    1354  237684818  13.76
 262144  524287     470  160228724   9.27
 524288 1048575      74   47384798   2.74
1048576 2097151       1    2097122   0.12</pre>
<blockquote cite="mid:5141C8C1.2080903@hardwarefreak.com" type="cite">
<blockquote type="cite" style="color: rgb(0, 0, 0);">
<pre wrap=""><span class="moz-txt-citetags">> </span>
<span class="moz-txt-citetags">> </span>Also please provide the make/model of the RAID controller, the write
<span class="moz-txt-citetags">> </span>cache size and if it is indeed enabled and working, as well as any
<span class="moz-txt-citetags">> </span>errors, if any, logged by the controller in dmesg or elsewhere in Linux,
<span class="moz-txt-citetags">> </span>or in the controller firmware.
<span class="moz-txt-citetags">> </span>
</pre>
</blockquote>
</blockquote>
The RAID box is an Infortrend S16S-G1030 with 512 MB of cache and a fully
functional battery. I couldn't find any details about the internal
RAID implementation used by Infortrend. The array is SAS-attached to
an LSI HBA (SAS2008 PCI-Express Fusion-MPT SAS-2). <br>
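<br>
If it helps, one way I could double-check that write caching is actually
reported as enabled on the LUN (a sketch; it assumes /dev/sdb is the
Infortrend array and that sdparm is installed) would be:<br>
<pre># WCE=1 in the caching mode page means write-back caching is enabled
sdparm --get=WCE /dev/sdb

# any controller or transport errors should show up in the kernel log
dmesg | grep -i mpt2sas</pre>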
<br>
The system hardware is a SuperMicro box with four 8-core Xeon E7-4820 CPUs
at 2.0 GHz and 128 GB of RAM, with hyper-threading enabled. (This is something
that I inherited. There is no doubt that it is overkill.)<br>
Another bit of information that you didn't ask about is the I/O
scheduler. I just checked and found it set to 'cfq',
although I thought I had set it to 'noop' via a kernel parameter in GRUB.<br>
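<br>
For reference, here is roughly how I check and switch it (a sketch; it
assumes the array is /dev/sdb, and the GRUB setting is just the
elevator=noop kernel parameter):<br>
<pre># show the available schedulers; the active one is in brackets, e.g. [cfq]
cat /sys/block/sdb/queue/scheduler

# switch to noop at runtime, without a reboot
echo noop > /sys/block/sdb/queue/scheduler

# to make it persistent, add elevator=noop to GRUB_CMDLINE_LINUX_DEFAULT
# in /etc/default/grub and run update-grub</pre>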
<br>
Also, some observations about the cp -al: in parallel with investigating
the hardware/OS/filesystem issues I have done some experiments with cp -al
itself. It hurts to have 64 cores available and see cp -al running the wheels
off just one, with a couple of others only slightly active on system-level
duties. So I tried some experiments where I copied smaller segments of
the file tree in parallel (using make -j). I haven't had the chance to
fully play this out, but these parallel cp invocations completed very
quickly. So it would appear that the cp command itself may bog down
on such a large file tree. I haven't had a chance to tear apart the
source code or do any profiling to see if there are any obvious problems
there.<br>
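<br>
Roughly, the parallel runs amounted to the equivalent of the following (a
hypothetical sketch using xargs rather than the actual make -j setup; the
paths and the 16-way fan-out are made up, and it assumes the top-level
directory names contain no whitespace):<br>
<pre># instead of one big "cp -al SRC DST", hard-link each top-level
# subdirectory as its own job, 16 at a time
SRC=/infortrend/current      # hypothetical source tree
DST=/infortrend/snapshot     # hypothetical destination
mkdir -p "$DST"
ls -1 "$SRC" | xargs -P 16 -I{} cp -al "$SRC/{}" "$DST/{}"</pre>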
<br>
Lastly, I will mention that I see almost 0% wa (I/O wait) when watching top.<br>
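<br>
If it would be useful I can also capture per-device numbers while a cp -al
is running, with something like the following (again assuming the array is sdb):<br>
<pre># extended per-device statistics for the array, every 5 seconds
iostat -dx sdb 5</pre>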
</body>
</html>