| To: | Christoph Hellwig <hch@xxxxxxxxxxxxx> |
|---|---|
| Subject: | Re: RFC: Case-insensitive support for XFS |
| From: | Anton Altaparmakov <aia21@xxxxxxxxx> |
| Date: | Fri, 5 Oct 2007 20:10:23 +0100 |
| Cc: | Barry Naujok <bnaujok@xxxxxxx>, "xfs@xxxxxxxxxxx" <xfs@xxxxxxxxxxx>, linux-fsdevel@xxxxxxxxxxxxxxx, urban@xxxxxxxxxxxxxx |
| In-reply-to: | <20071005154442.GA6432@infradead.org> |
| References: | <op.ty6361ut3jf8g2@pc-bnaujok.melbourne.sgi.com> <op.tzpbqspl3jf8g2@pc-bnaujok.melbourne.sgi.com> <20071005154442.GA6432@infradead.org> |
| Sender: | xfs-bounce@xxxxxxxxxxx |
Hi,

On 5 Oct 2007, at 16:44, Christoph Hellwig wrote:

> [Adding -fsdevel because some of the things touched here might be of broader interest and Urban because his name is on nls_utf8.c]

Where did that come from? NTFS does not have its own dcache! It doesn't have its own dentry operations either... NTFS uses the default ones...

All the case insensitivity handling "cleverness" is done inside ntfs_lookup(), i.e. the NTFS directory inode operation ->lookup.

> stores its unicode case table on disk. This allows the filesystem to migrate to newer forms of Unicode at the time of formatting the filesystem. Eg. Windows Vista now supports Unicode 5.0 while older version would support an earlier version of Unicode. Linux's version of NTFS case table is implemented in fs/ntfs/upcase.c defined as default_upcase.
Windows Vista uses a different table (the content is actually significantly different). My NTFS driver, which I am not yet allowed to release, uses the Vista table by default.

But the default does not matter for NTFS. At mount time, the upcase table stored on the volume is read into memory and compared to the default one. If they match perfectly, the default one is used (it is reference counted and discarded when not in use), and if they do not match, the one from the volume is used. So we support both NT4/2k/XP and Vista style volumes fine no matter what default table we use... The only thing is that for each non-default table we waste 128kiB of vmalloc()ed kernel memory, so if you mount 10 NTFS volumes with non-default tables we are wasting about 1.25MiB of memory...

> Because ntfs uses 16bit wide chars it prefers to use its own tables. I'm not sure that's a good idea.

The upcase table is used during the case insensitive ->lookup, and if you have the wrong table it will make the traversal in the directory b-tree go wrong, so you may not find files that actually exist when doing a ->lookup! So yes, it is not only a good idea but an absolutely essential idea! You have to use the same upcase table for a volume as the upcase table with which the names on the volume were created. Otherwise lookups in your b-trees break for any name containing a character that the table used when writing and the table used when doing the lookup upper-case differently.

> JFS also has wide-char names on

There is nothing efficient about using u16 in memory AFAIK. In fact, most of the time it just means you use twice the memory per string...

FWIW Mac OS X uses utf8 in the kernel and so does HFS(+) and I can't see anything wrong with that. And Windows uses u16 (little endian) and so does NTFS. So there is precedent for doing both internally...

What are the reasons for suggesting that it would be more efficient to use u16 internally?
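To make the two upcase table points concrete, here is a minimal sketch in C. It is not the code from the NTFS driver: the names `select_upcase`, `collate_names` and `fill_ascii_upcase` are hypothetical, the toy table only upper-cases ASCII, and the real collation rules are more involved. It only illustrates (a) picking the default table when the on-volume one is identical and (b) how the same two names can sort in opposite order under two different tables, which is what sends a b-tree traversal down the wrong branch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One u16 entry per 16-bit code unit: 65536 * 2 bytes = 128KiB per table. */
#define UPCASE_LEN 65536u

/* Hypothetical toy table: identity everywhere except ASCII a-z -> A-Z. */
static void fill_ascii_upcase(uint16_t *tbl)
{
    for (uint32_t c = 0; c < UPCASE_LEN; c++)
        tbl[c] = (uint16_t)c;
    for (uint16_t c = 'a'; c <= 'z'; c++)
        tbl[c] = (uint16_t)(c - 'a' + 'A');
}

/*
 * Hypothetical mount-time choice: if the table read from the volume is
 * byte-for-byte identical to the default, share the default; otherwise
 * the volume's own table must be kept and used for that volume.
 */
static const uint16_t *select_upcase(const uint16_t *default_tbl,
                                     const uint16_t *volume_tbl)
{
    assert(default_tbl != NULL && volume_tbl != NULL);
    if (memcmp(default_tbl, volume_tbl,
               UPCASE_LEN * sizeof(uint16_t)) == 0)
        return default_tbl;
    return volume_tbl;
}

/*
 * Case-insensitive ordering of two u16 names: compare code unit by code
 * unit after mapping each through the upcase table; a name that is a
 * prefix of the other sorts first. Returns <0, 0 or >0.
 */
static int collate_names(const uint16_t *a, size_t a_len,
                         const uint16_t *b, size_t b_len,
                         const uint16_t *upcase)
{
    size_t n = a_len < b_len ? a_len : b_len;

    for (size_t i = 0; i < n; i++) {
        uint16_t ca = upcase[a[i]];
        uint16_t cb = upcase[b[i]];
        if (ca != cb)
            return ca < cb ? -1 : 1;
    }
    if (a_len == b_len)
        return 0;
    return a_len < b_len ? -1 : 1;
}
```

As a usage example under the same assumptions: take a second table that is the toy table plus one extra mapping, U+00E9 to U+00C9. The name consisting of U+00E9 sorts after the name consisting of U+00D0 under the toy table (0x00E9 > 0x00D0) but before it under the second table (0x00C9 < 0x00D0). A directory b-tree built with one ordering and searched with the other can therefore miss an entry that exists.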
Best regards,

Anton

--
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
Linux NTFS maintainer, http://www.linux-ntfs.org/