On Thu, Sep 17, 2015 at 09:37:09PM +0000, Earl, Joshua P wrote:
> Hi Brian,
>
> Sorry about the top posting thing... I'm not sure how to control
> that, is my replying somehow messing with that?
when everything is backwards
to read the thread
it makes it hard
And please wrap your text at 72 columns.
> With good news, I seem to have figured out what was going on. I
> had a cron job which would run every 15 minutes which changed the
> permissions in a directory:
> chmod -R g+rwx /data/shared/homes/bjanto/*
> chmod -R g+rwx /data/shared/homes/lanastor/*
> chgrp -hR ilmn /data/nextseq/*
> chgrp -hR lab /data/shared/homes/*
So you are modifying a large amount of metadata every 15 minutes,
and then you have a problem with your 22-disk wide RAID6 array when
the metadata gets written back. Metadata writeback is, by the nature
of metadata in a filesystem, done in small, isolated IOs that cause
large RAID5/6 arrays to do a stripe-wide RMW cycle on every IO.
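To put rough numbers on that, here is a minimal sketch of the IO count
for one small write. It assumes the 22 disks form a single RAID6 stripe
of 20 data chunks plus 2 parity chunks; how your controller actually
schedules the update is not known, and the function names are made up
for illustration only.

```python
# Rough IO-count sketch for one small (single-chunk) write on RAID6.
# Assumption: 22 disks = 20 data chunks + 2 parity chunks per stripe.
# The real controller behaviour is unknown; this is arithmetic only.

def rmw_ios():
    """Classic read-modify-write: read old data, P and Q, then
    write new data, P and Q."""
    reads = 3
    writes = 3
    return reads + writes


def stripe_wide_ios(total_disks, parity_disks=2):
    """Stripe-wide update: read every untouched data chunk, then
    write the changed data chunk plus recomputed parity chunks."""
    data_disks = total_disks - parity_disks
    reads = data_disks - 1
    writes = 1 + parity_disks
    return reads + writes


if __name__ == "__main__":
    print("sub-stripe RMW IOs:", rmw_ios())
    print("stripe-wide IOs:   ", stripe_wide_ios(22))
```

Either way, a single small metadata write turns into several disk IOs,
and in the stripe-wide case it touches every disk in the array.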
> > Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz
> > avgqu-sz await svctm %util
> > sda 0.29 3.61 5.78 3.58 0.10 0.03 28.27
> > 0.05 5.19 2.39 2.24
> > sdb 1.02 8.66 31.50 3.91 0.33 0.12 26.14
> > 5.94 167.54 27.47 97.25
> >
> > Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz
> > avgqu-sz await svctm %util
> > sda 0.00 1.60 0.00 2.00 0.00 0.01 14.40
> > 0.01 4.30 4.30 0.86
> > sdb 0.00 0.00 0.00 0.80 0.00 0.03 64.00
> > 6.46 6332.75 1250.00 100.00
It's pretty clear that your hardware RAID array is taking over a
second per IO that requires an RMW cycle. So not a filesystem
problem...
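If you want to check that reading of the numbers yourself, the svctm
column is consistent with device busy time divided by completed IOs.
A small sketch using only the sdb figures quoted above (the dictionary
keys and the function name are my own, not iostat's):

```python
# Figures copied from the two sdb samples in the quoted iostat output.
samples = [
    {"r_s": 31.50, "w_s": 3.91, "util_pct": 97.25, "await_ms": 167.54},
    {"r_s": 0.00, "w_s": 0.80, "util_pct": 100.00, "await_ms": 6332.75},
]


def service_time_ms(r_s, w_s, util_pct):
    """Milliseconds the device was busy per second, divided by the
    number of IOs it completed per second."""
    iops = r_s + w_s
    busy_ms_per_sec = (util_pct / 100.0) * 1000.0
    return busy_ms_per_sec / iops


if __name__ == "__main__":
    for s in samples:
        svc = service_time_ms(s["r_s"], s["w_s"], s["util_pct"])
        print(f"{svc:8.2f} ms per IO, "
              f"{s['await_ms']:8.2f} ms including queueing")
```

For the second sample that works out to 1250 ms of device time per IO,
with the rest of the 6332.75 ms await being time spent queued behind
other IOs.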
Cheers,
Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx