
Re: Xfs_repair and journalling -- EXT4 journal replay discussion

To: stan@xxxxxxxxxxxxxxxxx
Subject: Re: Xfs_repair and journalling -- EXT4 journal replay discussion
From: Eric Sandeen <sandeen@xxxxxxxxxxx>
Date: Sat, 30 Mar 2013 12:40:10 -0500
Cc: Subranshu Patel <spatel.ml@xxxxxxxxx>, xfs@xxxxxxxxxxx

On 3/30/13 7:49 AM, Stan Hoeppner wrote:
> On 3/19/2013 5:14 AM, Stan Hoeppner wrote:
>> On 3/19/2013 3:24 AM, Martin Steigerwald wrote:
> ...
>>> Heck, even I was confused at first. Cause the manpage of fsck.ext4 IMHO is 
>>> not really clear about that topic to say the least. I tested it out for a 
>>> reason.
>> I already contacted Ted off list hoping he can point me to the relevant
>> kernel documentation, so I don't make such a mistake again with EXT.
> Ok, so here's the skinny on the source of our confusion WRT how/when
> EXT4 replays journals, and it's rather interesting.  Ted Ts'o explained
> the following.

Where was this, out of curiosity?

> The EXT4 kernel module does have code to perform journal replay, but it
> is rarely executed.  The reasons for this are:
> 1.  EXT4 journal replay can take a lot of time (whereas XFS is instant)
> 2.  EXT4 systems tend to have multiple filesystems, often one per drive
>     (whereas XFS systems tend to have few filesystems)

Those are, I think, gross generalizations.  Journal replay takes as
long as it takes to replay all the IO required, which can vary greatly.
And TBH I have no idea where the notion came from that systems have many
ext4 filesystems but few xfs filesystems.

> 3.  Linux mounts filesystems serially during startup

I think that is correct.

> To prevent potentially lengthy boot times, the init scripts run e2fsck
> to replay all EXT4 filesystem journals in parallel, well before the
> mount stage.  

I'd never heard this rationale before, but I could believe that maybe
parallel log replays from userspace are faster, although it probably
depends a lot on how many spindles are available to do the work - fsck
avoids running in parallel for filesystems on the same physical disk,
at least according to the manpage.
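
For illustration, the scheduling rule described in fsck(8) can be modelled
roughly as below.  This is a minimal sketch, not fsck's actual code: the
function name plan_fsck_batches is made up, and the physical disk for each
device is passed in by the caller rather than detected.  It only encodes what
the manpage states: entries with pass number 0 are skipped, lower pass numbers
go first, and within one pass different disks may run in parallel while
filesystems on the same disk are queued one after another.

```python
from collections import defaultdict


def plan_fsck_batches(entries):
    """Group fstab-like entries the way fsck(8) describes for `fsck -A`.

    entries: iterable of (device, physical_disk, passno) tuples.
             physical_disk is supplied by the caller (assumption: this
             sketch does not try to detect it).
    Returns {passno: {physical_disk: [devices in order]}}, with pass
    numbers in ascending order. Within one pass, each disk's list is a
    serial queue; the queues of different disks could run in parallel.
    """
    plan = defaultdict(lambda: defaultdict(list))
    for device, disk, passno in entries:
        if passno == 0:
            continue  # fs_passno 0 means "do not check this filesystem"
        plan[passno][disk].append(device)
    return {p: dict(disks) for p, disks in sorted(plan.items())}


if __name__ == "__main__":
    example = [
        ("/dev/sda1", "sda", 1),  # root filesystem, checked first
        ("/dev/sda2", "sda", 2),  # same disk as root, later pass
        ("/dev/sdb1", "sdb", 2),  # different disk, can overlap with sda2
        ("/dev/sdc1", "sdc", 0),  # never checked
    ]
    for passno, disks in plan_fsck_batches(example).items():
        print(passno, disks)
```

With a single spindle everything in a pass collapses into one serial queue,
which is why any speedup would depend on how many disks are involved.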

> Thus the only case where the EXT4 kernel module performs
> journal replay is when doing a mount while the system is running, e.g.
> USB hard drive.

Or when running xfstests ;)  Technically, the kernel replays the journal
whenever the mount code finds a dirty log.  That's interesting, though; I
hadn't thought about how most systems probably don't get a ton of coverage
of kernelspace ext[34] log replay.
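
As an aside, one way to see whether an ext3/ext4 journal is marked dirty
before mounting is to look for the needs_recovery flag in the "Filesystem
features" line printed by `dumpe2fs -h`.  The sketch below is only an
illustration: the helper names are made up, the device path is whatever the
caller passes in, and reading the superblock normally requires root.  Note
that the flag is also set while a filesystem is mounted, so it is only
meaningful for an unmounted device.

```python
import subprocess


def needs_journal_recovery(dumpe2fs_header: str) -> bool:
    """Return True if `dumpe2fs -h` output lists the needs_recovery flag."""
    for line in dumpe2fs_header.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Filesystem features":
            return "needs_recovery" in value.split()
    return False  # no features line found; treat as "not known to be dirty"


def device_needs_recovery(device: str) -> bool:
    """Run `dumpe2fs -h` on an unmounted device (hypothetical helper)."""
    result = subprocess.run(
        ["dumpe2fs", "-h", device],
        capture_output=True,
        text=True,
        check=True,
    )
    return needs_journal_recovery(result.stdout)
```

Whether the journal then gets replayed by e2fsck or by the kernel depends on
which of the two touches the device first.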

> There are other reasons e2fsck was chosen to perform journal replay at
> boot in addition to the speed issue, but as I understood Ted this is the
> main reason.

Ok, I can see some rationale to parallel userspace log replays; it'd be
interesting to actually measure that result, though.

