
Re: definitions for /proc/fs/xfs/stat

To: Dave Chinner <david@xxxxxxxxxxxxx>
Subject: Re: definitions for /proc/fs/xfs/stat
From: Mark Seger <mjseger@xxxxxxxxx>
Date: Sat, 15 Jun 2013 06:35:02 -0400
Cc: Nathan Scott <nathans@xxxxxxxxxx>, xfs@xxxxxxxxxxx
Basically I do everything with collectl, a tool I wrote and open-sourced almost 10 years ago.  Its numbers are very accurate - I've compared them with iostat on numerous occasions whenever I've had doubts and they always agree.  Since both tools get their data from the same place, /proc/diskstats, it's hard for them not to agree, AND its numbers also agree with /proc/fs/xfs.
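
In case it helps to see why the two can't really disagree, here's a rough sketch of the kind of arithmetic both tools end up doing against /proc/diskstats (field layout as documented in Documentation/iostats.txt; the device name and interval are just placeholders):

#!/usr/bin/env python
# Sample /proc/diskstats twice and diff the counters, the same way
# iostat and collectl do.  Sectors are 512 bytes.
import time

def disk_fields(dev):
    with open('/proc/diskstats') as f:
        for line in f:
            parts = line.split()
            if parts[2] == dev:
                return [int(x) for x in parts[3:]]
    raise ValueError('device %s not found' % dev)

dev = 'sdc1'        # device to watch (placeholder)
interval = 1.0      # seconds between samples

prev = disk_fields(dev)
while True:
    time.sleep(interval)
    cur = disk_fields(dev)
    wr_ios     = cur[4] - prev[4]   # field 8: writes completed
    wr_sectors = cur[6] - prev[6]   # field 10: sectors written
    print('%s %8.0f KB/s %6.0f w/s' % (time.strftime('%H:%M:%S'),
          wr_sectors * 512 / 1024.0 / interval, wr_ios / interval))
    prev = cur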

Here's an example comparing the two on a short run, leaving off iostat's -m option since collectl reports its output in KB.

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdc               0.00     0.00    0.00  494.00     0.00 126464.00   512.00     0.11    0.22    0.00    0.22   0.22  11.00

#                   <---------reads---------><---------writes---------><--------averages--------> Pct
#Time     Name       KBytes Merged  IOs Size  KBytes Merged  IOs Size  RWSize  QLen  Wait SvcTim Util
10:18:32 sdc1             0      0    0    0  127488      0  498  256     256     1     0      0    7
10:18:33 sdc1             0      0    0    0  118784      0  464  256     256     1     0      0    4

For grins I also ran a set of numbers at a monitoring interval of 0.2 seconds just to see if they were steady, and they are:

#                       <---------reads---------><---------writes---------><--------averages--------> Pct
#Time         Name       KBytes Merged  IOs Size  KBytes Merged  IOs Size  RWSize  QLen  Wait SvcTim Util
10:19:50.601 sdc1             0      0    0    0     768      0    3  256     256     0     0      0    0
10:19:50.801 sdc1             0      0    0    0   23296      0   91  256     256     1     0      0   19
10:19:51.001 sdc1             0      0    0    0   32256      0  126  256     256     1     0      0   14
10:19:51.201 sdc1             0      0    0    0   29696      0  116  256     256     1     0      0   19
10:19:51.401 sdc1             0      0    0    0   30464      0  119  256     256     1     0      0    4
10:19:51.601 sdc1             0      0    0    0   32768      0  128  256     256     1     0      0   14

But back to the problem at hand, and that's the question: why is this happening?

To restate what's going on, I have a very simple script that duplicates what OpenStack Swift is doing, namely creating a file with mkstemp and then running fallocate against it.  The files are being created with a size of zero, but it seems that XFS is generating a ton of logging activity.  I had read your post back in 2011 about speculative preallocation and can't help but wonder if that's what's hitting me here.  I also saw where system memory can come into play, and this box has 192GB and 12 hyperthreaded cores.
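
In case the exact pattern matters, here's roughly what the script boils down to (a sketch rather than the exact code; the directory, file count, and size are placeholders, and I'm assuming FALLOC_FL_KEEP_SIZE since the files end up zero-length):

#!/usr/bin/env python
# Create files with mkstemp and immediately reserve space with
# fallocate(2), mimicking what swift does on an object PUT.
import ctypes, ctypes.util, os, tempfile

libc = ctypes.CDLL(ctypes.util.find_library('c'), use_errno=True)
libc.fallocate.argtypes = [ctypes.c_int, ctypes.c_int,
                           ctypes.c_longlong, ctypes.c_longlong]
FALLOC_FL_KEEP_SIZE = 0x01   # keep i_size at 0, just reserve blocks

target_dir = '/srv/node/sdc1/tmp'   # XFS mount under test (placeholder)
num_files  = 10000
alloc_size = 1024                   # bytes reserved per file

for _ in range(num_files):
    fd, path = tempfile.mkstemp(dir=target_dir)
    try:
        if libc.fallocate(fd, FALLOC_FL_KEEP_SIZE, 0, alloc_size) != 0:
            err = ctypes.get_errno()
            raise OSError(err, os.strerror(err))
    finally:
        os.close(fd)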

I also tried one more run without fallocate; this is creating 10000 1K files, which should be about 10MB, and it looks like it's still doing 140MB of I/O, which still feels like a lot, but at least it's less than the run with fallocate:

#                   <---------reads---------><---------writes---------><--------averages--------> Pct
#Time     Name       KBytes Merged  IOs Size  KBytes Merged  IOs Size  RWSize  QLen  Wait SvcTim Util
10:29:20 sdc1             0      0    0    0   89608      0  351  255     255     1     0      0   11
10:29:21 sdc1             0      0    0    0   55296      0  216  256     256     1     0      0    5

And to repeat the full run with fallocate:

# DISK STATISTICS (/sec)
#                   <---------reads---------><---------writes---------><--------averages--------> Pct
#Time     Name       KBytes Merged  IOs Size  KBytes Merged  IOs Size  RWSize  QLen  Wait SvcTim Util
10:30:50 sdc1             0      0    0    0   56064      0  219  256     256     1     0      0    2
10:30:51 sdc1             0      0    0    0  409720    148 1622  253     252     1     0      0   26
10:30:52 sdc1             0      0    0    0  453240    144 1796  252     252     1     0      0   36
10:30:53 sdc1             0      0    0    0  441768    298 1800  245     245     1     0      0   37
10:30:54 sdc1             0      0    0    0  455576    144 1813  251     251     1     0      0   25
10:30:55 sdc1             0      0    0    0  453532    145 1805  251     251     1     0      0   35
10:30:56 sdc1             0      0    0    0  307352    145 1233  249     249     1     0      0   17
10:30:57 sdc1             0      0    0    0       0      0    0    0       0     0     0      0    0

If there is anything more I can provide I'll be happy to do so.  Actually, I should point out that I can easily generate graphs, and if you'd like to see some examples I can provide those too.  Also, if there is anything I can report from /proc/fs/xfs I can relatively easily do that as well and display it side by side with the disk I/O - something along the lines of the sketch below.
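
For instance, a quick sketch of the /proc/fs/xfs/stat side of it (each line of that file is a stat group name followed by integer counters; 'log' here is just an example of a group to watch):

#!/usr/bin/env python
# Print per-interval deltas for one group from /proc/fs/xfs/stat so it
# can be lined up with the per-second disk numbers above.
import time

def xfs_stats():
    stats = {}
    with open('/proc/fs/xfs/stat') as f:
        for line in f:
            parts = line.split()
            stats[parts[0]] = [int(x) for x in parts[1:]]
    return stats

group = 'log'    # any group name that appears in the file
interval = 1.0

prev = xfs_stats()
while True:
    time.sleep(interval)
    cur = xfs_stats()
    deltas = [c - p for c, p in zip(cur[group], prev[group])]
    print(time.strftime('%H:%M:%S'), group, deltas)
    prev = cur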

-mark


On Fri, Jun 14, 2013 at 10:04 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
On Fri, Jun 14, 2013 at 09:55:17PM -0400, Mark Seger wrote:
> I'm doing 1 second samples and the rates are very steady.  The reason I
> ended up at this level of testing was I had done a sustained test for 2
> minutes at about 5MB/sec and was seeing over 500MB/sec going to the disk,
> again sampling at 1-second intervals.  I'd be happy to provide detailed
> output and can even sample more frequently if you like.

Where are you getting your IO throughput numbers from?

How do they compare to, say, the output of `iostat -d -x -m 1`?

Cheers,

Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx
