
Re: [RFC v2] Unicode/UTF-8 support for XFS

To: Olaf Weber <olaf@xxxxxxx>
Subject: Re: [RFC v2] Unicode/UTF-8 support for XFS
From: Jeremy Allison <jra@xxxxxxxxx>
Date: Fri, 26 Sep 2014 12:46:56 -0700
Cc: Jeremy Allison <jra@xxxxxxxxx>, Christoph Hellwig <hch@xxxxxxxxxxxxx>, Dave Chinner <david@xxxxxxxxxxxxx>, Ben Myers <bpm@xxxxxxx>, linux-fsdevel@xxxxxxxxxxxxxxx, tinguely@xxxxxxx, xfs@xxxxxxxxxxx
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <5425C067.7080904@xxxxxxx>
References: <20140918195650.GI19952@xxxxxxx> <20140922222611.GZ4322@dastard> <5422C540.1060007@xxxxxxx> <20140924231024.GA4758@dastard> <54257D3F.70302@xxxxxxx> <20140926165605.GA25274@xxxxxxxxxxxxx> <20140926170407.GB6012@samba2> <5425C067.7080904@xxxxxxx>
Reply-to: Jeremy Allison <jra@xxxxxxxxx>
User-agent: Mutt/1.5.21 (2010-09-15)
On Fri, Sep 26, 2014 at 09:37:11PM +0200, Olaf Weber wrote:
> My argument against "mount time case-insensitivity" and for "mkfs
> time case-insensitivity" is related to switching from the
> case-sensitive domain to the case-insensitive one.
> For case-sensitive, from "README" to "readme" there are 64 different
> possible filenames.  Let's say you create 63 out of these 64. Now
> remount the filesystem case-insensitive, and try to open by the 64th
> version of "readme". It is not an exact match for any of the 63
> candidate files, and a case-insensitive match to all 63 candidate
> files. Which of these 63 files should be opened, and why that one in
> particular?
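
The count in the quoted paragraph can be checked with a short sketch. It assumes plain ASCII letters, where each letter has exactly two cases; `case_variants` is a hypothetical helper name, not something from XFS or Samba.

```python
from itertools import product


def case_variants(name):
    """Return every upper/lower-case spelling of an ASCII name."""
    # Each cased letter contributes two choices; other characters one.
    choices = [sorted({c.lower(), c.upper()}) for c in name]
    return ["".join(p) for p in product(*choices)]


variants = case_variants("readme")
print(len(variants))                      # 2**6 = 64
print({v.casefold() for v in variants})   # all 64 fold to one key
```

All 64 spellings are distinct names on a case-sensitive filesystem but fold to the same key, which is why a case-insensitive remount has no principled way to choose among the files already created.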

I'm ok with "mkfs time case-insensitivity" - really !
Most of my OEMs would set that and claim victory (few
of them care much about NFS semantics :-).

> >Having CI matching can speed up Samba operations by a
> >factor of 10 on large directories (warning, number made
> >up, depending on the number of entries per dir :-).
> I really want that to be true, but the proof of the pudding...

No, it really *is* true. The reason I can't give
exact numbers is that it depends on the number of entries.

Remember, for every cache *miss*, we have to scan
the entire directory.

So a user asks for README, and we attempt that
and it fails. So now we have to enumerate the
entire directory to see if READMe (or any other
case variant) exists.

Now do that in a directory with 10, 100, 1000,
.... 10000000 existing files (don't laugh, I've
seen an application for music files that did
*exactly* that). On a case-insensitive filesystem
you just request README and you're done.
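
A minimal sketch of that miss path, for illustration only. This is not Samba's actual code; `ci_lookup` is a hypothetical name, and it assumes the underlying filesystem is case-sensitive and that `str.casefold()` is an acceptable stand-in for the real matching rules.

```python
import os
import tempfile


def ci_lookup(dirpath, name):
    """Return (matching entry name or None, number of entries scanned)."""
    # Fast path: an exact-case hit needs no directory enumeration.
    if os.path.lexists(os.path.join(dirpath, name)):
        return name, 0
    # Slow path: on a miss, compare every entry case-insensitively.
    wanted = name.casefold()
    scanned = 0
    with os.scandir(dirpath) as entries:
        for entry in entries:
            scanned += 1
            if entry.name.casefold() == wanted:
                return entry.name, scanned
    # No variant exists: the whole directory was read to learn that.
    return None, scanned


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        for i in range(1000):
            open(os.path.join(d, f"track{i:04d}.mp3"), "w").close()
        print(ci_lookup(d, "track0007.mp3"))  # exact hit, nothing scanned
        print(ci_lookup(d, "README"))         # miss, full directory scan
```

The cost of the slow path grows with the number of entries, and a name that exists in no case variant always pays for the full scan.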

Certain vendors who shall remain nameless :-)
created test cases of just this example to
show how much storage on Linux sucks. Not
a happy camper about that - and telling them
to use ZFS on FreeBSD or Solaris just doesn't
feel right :-).

