op-journaled fs, journal size and storage speeds

To: Linux fs XFS <linux-xfs@xxxxxxxxxxx>, Linux fs JFS <jfs-discussion@xxxxxxxxxxxxxxxxxxxxx>
Subject: op-journaled fs, journal size and storage speeds
From: pg_xf2@xxxxxxxxxxxxxxxxx (Peter Grandi)
Date: Sat, 30 Apr 2011 15:51:43 +0100

I have been thinking about journals, RAID6s and SSDs.

In particular for file system designs like JFS and XFS that do
operation journaling (while ext[34] do block journaling).

The issue is: journal size?

It seems to me that adopting a percentage of the filesystem
size as a guideline is very wrong, so I have been using a rule
of thumb like one second of expected transfer rate, so that "in
flight" updates are never far behind.

But even at a single disk *sequential* transfer rate of say
80MB/s average, a journal that contains operation records could
conceivably hold dozens if not hundreds of thousands of pending
metadata updates, probably targeted at very widely scattered
locations on disk, and playing a journal fully could take a long
time.

So the idea would be that the relevant transfer rate would be
the *random* one, and since that is around 4MB/s per single
disk, journal sizes would end up pretty small. But many people
allocate very large (at least compared to that) journals.
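
To make the arithmetic above concrete, here is a rough sketch of
the "one second of transfer rate" rule applied to both the
sequential and the random rate. The 80MB/s and 4MB/s figures are
the ones quoted above; the 512-byte record size and the 4096-byte
block rewritten per replayed record are purely illustrative
assumptions, as is the worst case that every record lands on a
different block with no coalescing.

```python
MB = 1_000_000  # decimal megabytes, as in "80MB/s"

SEQ_RATE = 80 * MB    # single-disk sequential rate quoted in the post
RAND_RATE = 4 * MB    # single-disk random rate quoted in the post
RECORD_BYTES = 512    # ASSUMPTION: average size of one operation record
BLOCK_BYTES = 4096    # ASSUMPTION: metadata block rewritten per record


def journal_size_bytes(rate_bytes_per_s, seconds=1.0):
    """Rule of thumb: the journal holds `seconds` worth of transfer."""
    return int(rate_bytes_per_s * seconds)


def worst_case_replay_seconds(journal_bytes,
                              record_bytes=RECORD_BYTES,
                              block_bytes=BLOCK_BYTES,
                              random_rate=RAND_RATE):
    """Time to replay a full journal if every record touches a
    distinct block that must be written at the random rate."""
    records = journal_bytes // record_bytes
    return records * block_bytes / random_rate


for label, rate in (("sequential", SEQ_RATE), ("random", RAND_RATE)):
    size = journal_size_bytes(rate)
    records = size // RECORD_BYTES
    replay = worst_case_replay_seconds(size)
    print(f"{label:10s} rule: journal {size / MB:.0f}MB, "
          f"up to {records} records, worst-case replay {replay:.1f}s")
```

Under these assumptions a journal sized from the sequential rate
can hold over a hundred thousand records and take minutes to
replay, while one sized from the random rate stays within
seconds. Real replay will usually be faster, since updates to the
same block can be merged.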

This seems to me a fairly bad idea, because then the journal
becomes a massive hot spot on the disk and draws the disk arm
like a black hole. I suspect that operations should not stay in
the journal for a long time. However, if the journal is too
small, processes that do metadata updates start to hang on it.

So some questions for which I have guesses but not good answers:

  * What should journal size be proportional to?
  * What is the downside of a too small journal?
  * What is the downside of a too large journal other than space?

Again I expect answers to be very different for ext[34] but I am
asking for operation-journaling file system designs like JFS and
XFS.

BTW, another consideration is that for filesystems that are
fairly journal-intensive, putting the journal on a low traffic
storage device can have large benefits.

But if they can be pretty small, I wonder whether putting the
journals of several filesystems on the same storage device then
becomes a sensible option, as the locality will be quite narrow
(e.g. a single physical cylinder), or whether it could be
worthwhile to journal to battery-backed RAM, as the database
people do.
