On Thu, Jan 29, 2015 at 04:15:54PM -0600, Eric Sandeen wrote:
> On 1/29/15 3:59 PM, Gerard Beekmans wrote:
> I'm sure it's not related to this issue (unless it was very recently grown?
> Was it grown shortly before the failures?)
> Hm, it would have started at 4 AGs by default, and it's the 5th one that
> looks bad; maybe that's a clue. Are agf 6, 7, 8 etc also full of 0s?
Gerard is using the default mount options, so XFS is issuing cache
flushes and FUA with log writes. Hence if the new AG headers are
zero yet the superblock says they are valid, then that's a storage
problem.

In more detail: we force the new AGs to be written to disk
synchronously during the growfs operation before we commit the
transaction. The superblock with the larger AG count can only get on
disk after the transaction has been written to the log. Log writes
trigger a storage device cache flush, which results in the IO
ordering of:
	new AG header IO
	Device cache flush
	(new AG headers guaranteed to be on disk)
	journal write (FUA)
	(journal write guaranteed to be on disk)
	superblock write IO.
Hence if the superblock is showing 25 AGs and the new AGs 4-24
are not found on disk, then either:
	a) if the grow was very recent, the storage is not obeying
	   cache flushes and hence is breaking fundamental IO ordering; or
	b) if the growfs happened long ago, the storage has lost the
	   data that was written to stable media...
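
To make case (a) concrete, here is a rough toy model of the ordering
described above. It is not XFS code: ToyDisk, growfs(), recover() and
simulate() are made-up names for illustration. The model assumes the
device honours FUA even when it ignores cache flushes, and that log
recovery after the crash is what brings the superblock up to the new
AG count.

```python
class ToyDisk:
    """Toy block device with a volatile write cache (illustration only)."""

    def __init__(self, honours_flush=True):
        self.honours_flush = honours_flush
        self.cache = {}    # volatile write cache, lost on power failure
        self.stable = {}   # what is actually on stable media

    def write(self, block, data, fua=False):
        if fua:
            # FUA: the write must reach stable media before completing.
            self.stable[block] = data
        else:
            self.cache[block] = data

    def flush(self):
        # A well-behaved device drains its cache; a broken one ignores this.
        if self.honours_flush:
            self.stable.update(self.cache)
            self.cache.clear()

    def power_fail(self):
        self.cache.clear()


def growfs(disk, old_agcount, new_agcount):
    # 1. new AG header IO
    for agno in range(old_agcount, new_agcount):
        disk.write(("agf", agno), "valid")
    # 2. device cache flush issued ahead of the log write
    disk.flush()
    # 3. journal write (FUA) recording the larger AG count
    disk.write("journal", {"agcount": new_agcount}, fua=True)
    # 4. superblock write IO (ordinary cached write)
    disk.write("sb", {"agcount": new_agcount})


def recover(disk):
    """Replay the journal, then list AGs the superblock claims but are missing."""
    record = disk.stable.get("journal")
    if record is not None:
        disk.stable["sb"] = dict(record)
    agcount = disk.stable["sb"]["agcount"]
    return [agno for agno in range(agcount)
            if disk.stable.get(("agf", agno)) != "valid"]


def simulate(honours_flush):
    disk = ToyDisk(honours_flush)
    # Start from a 4 AG filesystem that is already safely on disk.
    for agno in range(4):
        disk.stable[("agf", agno)] = "valid"
    disk.stable["sb"] = {"agcount": 4}
    growfs(disk, 4, 25)
    disk.power_fail()   # crash right after the grow
    return recover(disk)


if __name__ == "__main__":
    print("flush honoured, missing AGs:", simulate(True))
    print("flush ignored,  missing AGs:", simulate(False))
```

With flushes honoured, no AG headers are missing after the crash. With
flushes ignored, the journal record survives but AGs 4-24 do not, which
is the "superblock says 25 AGs, headers are zero" state in this model.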