Revision 1.18, Fri Dec 15 23:04:42 2000 UTC (16 years, 10 months ago) by xfs
Branch: MAIN
Changes since 1.17: +685 -154 lines


Modified Files:
 	todos.html

<& xfsTemplate,top=>1,side=>1 &>

<h2><b><FONT FACE="ARIAL NARROW, HELVETICA" SIZE="5">Work item list as of 12/15/2000</FONT></b></h2>

<P>
The current work items for XFS for Linux are listed below.
Many of the items on the list have been classified according to
the type of issue they address and according to priority.
</P>

<p>
The classification types are as follows:
<OL TYPE=A START=1>

<LI>
Missing functionality
</LI>

<LI>
Needed for compatibility with a bigger set of Irix filesystems
</LI>

<LI>
Will help getting into Linus's tree
</LI>

<LI>
CXFS related work
</LI>

<LI>
Linux VM changes may affect implementation
</LI>
</OL>
</p>

<p>
The items on the list have been prioritized.  There are four
prioritization levels: P1, P2, P3, and P4.
</p>

<h3><b>Work Items</b></h3>

<p>
The following work items remain for XFS for Linux.  The
items are described in detail following the summary.
<p>
<OL>

<LI>
<A HREF="#_O_SYNC">O_SYNC</A>
</LI>

<LI>
<A HREF="#_O_DIRECT">O_DIRECT</A>
</LI>

<LI>
<A HREF="#_RikvanRiel">Rik van Riel VM integration - add flush method</A>
</LI>

<LI>
<A HREF="#_xfs-cmds">Break out xfs-cmds into more than one package</A>
</LI>

<LI>
<A HREF="#_multiblocksize">Multiple filesystem blocksize support</A>
</LI>

<LI>
<A HREF="#_pagebuf-tune">pagebuf tuning (write throttling, memory pressure)</A>
</LI>

<LI>
<A HREF="#_ide-kiobuf">IDE kiobuf support + Changes to kiobuf request queue implementation</A>
</LI>

<LI>
<A HREF="#_volmanage">Software volume management via kiobufs</A>
</LI>

<LI>
<A HREF="#_ext-attrib">Finalize a kernel interface for extended attributes</A>
</LI>

<LI>
<A HREF="#_64-bit">64 bit XFS</A>
</LI>

<LI>
<A HREF="#_xfs-installer">Modify an installer for XFS root</A>
</LI>

<LI>
<A HREF="#_unwritten-ext">Unwritten extents</A>
</LI>

<LI>
<A HREF="#_2-tbyte">Address the 2 Tbyte device limitation on IA32 Linux</A>
</LI>

<LI>
<A HREF="#_quotas">Quotas</A>
</LI>

<LI>
<A HREF="#_shutdowndisk">Shutdown on disk error</A>
</LI>

<LI>
<A HREF="#_compiler-ind">Compiler independence</A>
</LI>

<LI>
<A HREF="#_grio">GRIO</A>
</LI>

<LI>
<A HREF="#_dmapi">DMAPI</A>
</LI>

<LI>
<A HREF="#_spinlockbug">Spinlock bug on some Linux platforms (sparc for example)</A>
</LI>

<LI>
<A HREF="#_native-lock">Use Linux native locking primitives</A>
</LI>

<LI>
<A HREF="#_out-of-mem">Better out of memory handling</A>
</LI>

<LI>
<A HREF="#_XFS-dist">Include XFS in standard Linux distributions</A>
</LI>

<LI>
<A HREF="#_XFS-Linus">Get XFS into Linus's kernel</A>
</LI>


</OL>

<p>
<OL>
<b>

<LI>
<A NAME="_O_SYNC">
O_SYNC</b>
</A>
<p>
Classification:	A, C, D, E
<br>
Priority: P1
<p>
O_SYNC I/O currently returns control to user space without ensuring that the
data is on disk. This can break applications that rely on synchronous writes.
<p>
The Irix code will not help much here because the buffering implementation
is so different. The fastest fix is to do an fsync after the write has buffered
its data; ideally, though, we only want to wait for our own data.
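<p>
The fsync-after-write idea can be illustrated from user space. The following
is only a minimal sketch, not the pagebuf or XFS change itself: the helper
names write_then_fsync and demo_roundtrip are hypothetical, and the sketch
shows the drawback noted above, namely that fsync waits for all dirty data
of the file rather than only the data just written.

```c
#define _XOPEN_SOURCE 700   /* for mkstemp() and pread() */
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of XFS): approximate O_SYNC behaviour by
 * writing the whole buffer and then calling fsync() before returning.
 * Returns the number of bytes written, or -1 with errno set.
 */
static ssize_t write_then_fsync(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    size_t done = 0;

    assert(buf != NULL || len == 0);

    while (done < len) {
        ssize_t n = write(fd, p + done, len - done);
        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted before any data moved; retry */
            return -1;
        }
        done += (size_t)n;
    }

    /* fsync() waits for ALL dirty data of the file, not just ours. */
    if (fsync(fd) < 0)
        return -1;

    return (ssize_t)done;
}

/* Demo: round-trip a few bytes through a temporary file. 0 on success. */
static int demo_roundtrip(void)
{
    char path[] = "/tmp/osync_sketch_XXXXXX";
    char back[5];
    int ok;
    int fd = mkstemp(path);

    if (fd < 0)
        return -1;
    ok = write_then_fsync(fd, "hello", 5) == 5 &&
         pread(fd, back, sizeof(back), 0) == 5 &&
         memcmp(back, "hello", 5) == 0;
    close(fd);
    unlink(path);
    return ok ? 0 : -1;
}
```
<p>
A caller would use write_then_fsync wherever it would otherwise rely on a
write to a descriptor opened with O_SYNC.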
<p>
75% pagebuf work, 25% additional XFS code.
<p>
Owner: Eric Sandeen (<i><a href="mailto:sandeen@sgi.com">sandeen@sgi.com</a></i>)
</LI>
<p>
<b>
<LI>
<A NAME="_O_DIRECT">
O_DIRECT</b>
</A>
<p>
Classification: A, D, E
<br>
Priority: P1
<p>
We wrote some basic read code which could also be the core of a write path.
It did not handle cache coherency with buffered I/O. The Irix approach
to this was to flush buffered data to disk before a read and to flush and
invalidate it before a write. Stephen Tweedie has some fairly grandiose
schemes for doing this in an fs independent way using kiobufs. These will
not show up before 2.5, so we need our own implementation. One concept we
can copy from his idea is to detect cached data and use it in the I/O: on
read, copy from the cache rather than the disk; on write, copy from user
memory to the cache and then do the I/O to disk.
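<p>
The "detect cached data and use it in the I/O" concept can be shown with a
toy model. This is a sketch under stated assumptions, not pagebuf or kiobuf
code: the structure toy_file and all toy_ functions are hypothetical, and
blocks are tiny in-memory arrays standing in for the disk and the cache.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_BLOCK_SIZE 16   /* bytes per block in this toy model */
#define TOY_NBLOCKS    4    /* blocks in the toy file */

/* Hypothetical stand-in for a file: on-disk blocks plus a block cache. */
struct toy_file {
    char disk[TOY_NBLOCKS][TOY_BLOCK_SIZE];
    char cache[TOY_NBLOCKS][TOY_BLOCK_SIZE];
    bool cached[TOY_NBLOCKS];   /* true if cache[blk] holds current data */
};

static void toy_init(struct toy_file *f)
{
    memset(f, 0, sizeof(*f));
}

/* Buffered write: data lands in the cache only, not yet on disk. */
static void toy_buffered_write(struct toy_file *f, int blk, const char *src)
{
    assert(blk >= 0 && blk < TOY_NBLOCKS);
    memcpy(f->cache[blk], src, TOY_BLOCK_SIZE);
    f->cached[blk] = true;
}

/* Direct read: prefer cached data, which may be newer than the disk. */
static void toy_direct_read(const struct toy_file *f, int blk, char *dst)
{
    assert(blk >= 0 && blk < TOY_NBLOCKS);
    memcpy(dst, f->cached[blk] ? f->cache[blk] : f->disk[blk],
           TOY_BLOCK_SIZE);
}

/* Direct write: refresh any cached copy first, then write to disk. */
static void toy_direct_write(struct toy_file *f, int blk, const char *src)
{
    assert(blk >= 0 && blk < TOY_NBLOCKS);
    if (f->cached[blk])
        memcpy(f->cache[blk], src, TOY_BLOCK_SIZE);
    memcpy(f->disk[blk], src, TOY_BLOCK_SIZE);
}
```
<p>
With this arrangement a direct read after a buffered write sees the buffered
data, and a direct write leaves cache and disk in agreement, without flushing
or invalidating anything.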
<p>
We also have more locking in place than Irix direct writes have - the
Linux inode semaphore will single thread direct writes as it stands.
Without changing the fs independent code we would have to drop locks and
reobtain them in the xfs code.
<p>
Owner:
Russell Cattelan (<i><a href="mailto:cattelan@sgi.com">cattelan@sgi.com</a></i>)
<p>
</LI>
<b>
<LI>
<A NAME="_RikvanRiel">
Rik van Riel VM integration - add flush method</b>
</A>
<p>
Classification: C
<br>
Priority: P1
<p>
The test9 kernel has a new VM layer - this will affect the XFS code. Previous
discussions about the VM system had talked about abstracting away the use
of buffer heads to manage write ordering, and adding a flush method where
the filesystem gets to choose what gets flushed. This fits in pretty well
with the way we write delalloc data now. We should maybe consider joining
this effort and implementing the flush method. This will help with some
of the performance issues.
<p>
Owner:
Rajagopal Ananthanarayanan (<i><a href="mailto:ananth@engr.sgi.com">ananth@engr.sgi.com</a></i>)
<p>
</LI>
<b>
<LI>
<A NAME="_xfs-cmds">
Break out xfs-cmds into more than one package</b>
</A>
<p>
Priority: P1
<p>
Some of the user tools currently depend on a number of "unstable"
interfaces, and these destabilize the tools (this has happened before
and will happen again).  We should rearrange what we
currently have in xfs-cmds into more than one package.  It was
always known that this _must_ happen for those interfaces which
are not specific to XFS (libattr, libacl, libdm) in order for them
to be more widely accepted, and the way we currently ship these
components in the xfs-cmds package is a recipe for packaging
dependency headaches down the track.
<p>
Owner:
Nathan Scott (<i><a href="mailto:nathans@engr.sgi.com">nathans@engr.sgi.com</a></i>)
<p>
</LI>
<b>
<LI>
<A NAME="_multiblocksize">
Multiple filesystem blocksize support</b>
</A>
<p>
Classification: A, B, C, E
<br>
Priority: P1, P2, P3
<br>
<p>
We can currently only support filesystems with a block size of one page, and
the page size is architecture specific. This is more pagebuf work for the most
part, although once we have metadata chunks bigger than a single system
page size we have some problems to solve.
<p>
We are using pages to cache metadata. Pages are allocated one at a time,
so each page sized chunk of memory is usually not adjacent in the address
space to the page covering the next block of the disk. We do have some
code to do memory remapping in the kernel to get around this. However,
this is code which would never be accepted into Linus's tree. In general
there is major resistance to doing address space remapping: it is
fairly expensive, and it impacts the whole system, not just the thread
doing the remapping.
<p>
We already have some metadata bigger than a page - inode clusters. However,
because these clusters are actually just arrays of fixed sized objects,
we do not make accesses across page boundaries and it was fairly simple
to modify xfs to not need the inode buffers to appear as one chunk of memory.
In the case of directory blocks and other structures, this is not going
to work and another solution must be found.
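<p>
The reason inode clusters work without contiguous memory can be sketched as
follows. This is an illustration only, not the XFS code: cluster_buf and
cluster_inode_ptr are hypothetical names, and the page and inode sizes are
assumed values. Because the object size divides the page size, no object
straddles a page boundary, so each one can be reached through its own page.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE  4096u   /* assumed page size */
#define SKETCH_INODE_SIZE 256u    /* assumed on-disk inode size */

/* The trick only works if objects never cross a page boundary. */
_Static_assert(SKETCH_PAGE_SIZE % SKETCH_INODE_SIZE == 0,
               "inode size must divide page size");

/* Hypothetical cluster buffer backed by separately allocated pages. */
struct cluster_buf {
    void   **pages;    /* pages[i] covers bytes [i * PAGE, (i + 1) * PAGE) */
    size_t   npages;
};

/*
 * Return a pointer to inode number `index` within the cluster, or NULL
 * if the index lies beyond the buffer. The pages need not be adjacent
 * in the address space.
 */
static void *cluster_inode_ptr(const struct cluster_buf *cb, size_t index)
{
    size_t byte_off = index * SKETCH_INODE_SIZE;
    size_t page_no  = byte_off / SKETCH_PAGE_SIZE;
    size_t page_off = byte_off % SKETCH_PAGE_SIZE;

    assert(cb != NULL);
    if (page_no >= cb->npages)
        return NULL;
    return (char *)cb->pages[page_no] + page_off;
}
```
<p>
A directory block has no such fixed-size layout, so a single structure may
span two pages, which is why this approach does not carry over.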
<p>
Block size &lt; page size shouldn't be too hard.  Block size &gt; page size will
require a lot of work.  Could map multiple pages together.  Could kmemalloc
a pool of pages.  Could do a page cache size change using 64K blocks to solve
this.  Key is getting contiguous chunks of memory for this.  Kanoj Sarcar would
be a reference for this.
<p>
5.1  16K block size support (for SN-IA64) [P1]
<p>
5.2  Block size &lt; page size support [P2]
<p>
5.3  Block size &gt; page size support [P3]
<p>
Owners:
Russell Cattelan (<i><a href="mailto:cattelan@sgi.com">cattelan@sgi.com</a></i>),
Glen Overby (<i><a href="mailto:overby@sgi.com">overby@sgi.com</a></i>),
Rajagopal Ananthanarayanan (<i><a href="mailto:ananth@engr.sgi.com">ananth@engr.sgi.com</a></i>)
<p>
</LI>
<b>
<LI>
<A NAME="_pagebuf-tune">
pagebuf tuning (write throttling, memory pressure)</b>
</A>
<p>
Priority: P2
<p>
There are a number of performance improvements we can make to pagebuf.
Some of them will also benefit XVM performance.
<p>
<UL>
<LI TYPE=DISC>
We do read-ahead synchronously - when a page is missing we read
several pages, but the user's read cannot be satisfied until all
the pages are present.
</LI>
<LI>
Only one extent is processed per call to bmap - Irix does several.
</LI>
<LI>
We are probably doing more page locking than is required.
</LI>
<LI>
We cluster writes on delalloc conversion, but not on rewriting data.
</LI>
<LI>
When dirtying new pages we do not obey any reservation scheme until after
we have consumed the pages. We should probably block before dirtying pages.
The chunk cache in Irix blocks requests for new data if a limit on delalloc
memory has been hit.
</LI>
</UL>
<p>
We could use EAGAIN to solve this.  There is a potential deadlock problem.
Ananth is working on a patch.
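<p>
The reservation idea, with EAGAIN as the "try again later" signal, can be
sketched as a simple budget check made before any page is dirtied. This is
a minimal single-threaded illustration, not the patch mentioned above:
dirty_budget, budget_reserve and budget_release are hypothetical names, and
real code would need locking and a way to wait for the limit to drop.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical accounting of pages that are dirty but not yet written. */
struct dirty_budget {
    size_t limit;    /* maximum pages allowed to be dirty at once */
    size_t in_use;   /* pages currently reserved or dirty */
};

/*
 * Reserve room BEFORE dirtying pages. Returns 0 on success, or -EAGAIN
 * if the request would exceed the limit; the budget is then unchanged.
 */
static int budget_reserve(struct dirty_budget *b, size_t npages)
{
    assert(b->in_use <= b->limit);
    if (npages > b->limit - b->in_use)
        return -EAGAIN;
    b->in_use += npages;
    return 0;
}

/* Give pages back once they have been written out to disk. */
static void budget_release(struct dirty_budget *b, size_t npages)
{
    assert(npages <= b->in_use);
    b->in_use -= npages;
}
```
<p>
A writer that receives -EAGAIN would flush or wait and then retry, rather
than consuming pages first and discovering the shortage afterwards.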
<p>
Owner:
Rajagopal Ananthanarayanan (<i><a href="mailto:ananth@engr.sgi.com">ananth@engr.sgi.com</a></i>)
<p>
</LI>
<b>
<LI>
<A NAME="_ide-kiobuf">
IDE kiobuf support + Changes to kiobuf request queue implementation</b>
</A>
<p>
Priority: P2
<p>
Jens Axboe is working on the IDE support.  Almost ready for inclusion in the
tree.
<p>
The generic kiobuf changes to the block I/O request layer are a
temporary solution and will most likely be rewritten for 2.5.
<p>
Owner:
Martin Petersen (<i><a href="mailto:mkp@linuxcare.com">mkp@linuxcare.com</a></i>)
<p>
</LI>
<b>
<LI>
<A NAME="_volmanage">
Software volume management via kiobufs</b>
</A>
<p>
Priority: P2
<p>
Except for RAID-5, both MD and LVM will be ready and kiobuf-aware soon.
<p>
Owner:
Martin Petersen (<i><a href="mailto:mkp@linuxcare.com">mkp@linuxcare.com</a></i>)
<p>
</LI>
<b>
<LI>
<A NAME="_ext-attrib">
Finalize a kernel interface for extended attributes</b>
</A>
<p>
Classification:	C	
<br>
Priority: P2
<p>
Linux has no defined interface for manipulating extended attributes. We
have added all the Irix system calls to our tree, but not reserved system
call numbers for them. This means that we have to change our numbers every
time we move to a new kernel version and someone else has added system
calls.
<p>
At the same time, there is another project, http://acl.bestbits.at/, which
is working on an ACL interface for Linux. They also have extended
attributes, with their own different API. We need to collaborate with
them on a common API (ours has more calls than theirs).
<p>
There is also some discussion about what a suitable API would be, or even
whether an API for extended attributes should be available to user space
at all. There seem to be philosophical differences between some filesystem
developers and GUI developers who want to be able to tag arbitrary data to
a file.
<p>
Another option is to obtain an XFS system call in the interim and push
all the calls through this while the final 'official' position is
worked out.
<p>
Regarding ACLs, merge Danny's ACL patch to provide this functionality in a
similar way to IRIX.  Once an 'official' interface has been decided on, use
that.
<p>
Owners:
Tim Shimmin (<i><a href="mailto:tes@engr.sgi.com">tes@engr.sgi.com</a></i>),
Andrew Gildfind (<i><a href="mailto:ajag@melbourne.sgi.com">ajag@melbourne.sgi.com</a></i>)
<p>
</LI>
<b>
<LI>
<A NAME="_64-bit">
64 bit XFS</b>
</A>
<p>
Priority: P2
<p>
XFS needs to be supported on IA64.  It should work on Alpha, Sparc, and MIPS64
first.  Try it out on 64 bit MIPS first.
<p>
Owners:
Martin Petersen (<i><a href="mailto:mkp@linuxcare.com">mkp@linuxcare.com</a></i>),
Rajagopal Ananthanarayanan (<i><a href="mailto:ananth@engr.sgi.com">ananth@engr.sgi.com</a></i>)
<p>
</LI>
<b>
<LI>
<A NAME="_xfs-installer">
Modify an installer for XFS root</b>
</A>
<p>
Priority: P2
<p>
Need to support XFS as root.  Steve's been running this way for two months -
copied from ext2 (find/cpio), then edited fstab and lilo.conf.
<p>
Thomas Graichen has a mini-root capability.
<p>
Owners:
Tom Duffy (<i><a href="mailto:tduffy@engr.sgi.com">tduffy@engr.sgi.com</a></i>),
Russell Cattelan (<i><a href="mailto:cattelan@sgi.com">cattelan@sgi.com</a></i>),
Eric Sandeen (<i><a href="mailto:sandeen@sgi.com">sandeen@sgi.com</a></i>)
<p>
</LI>
<b>
<LI>
<A NAME="_unwritten-ext">
Unwritten extents</b>
</A>
<p>
Classification:	A, B, D, E
<br>
Priority: P3
<p>
This requires a transaction at I/O completion time for the first write
to an extent. In Irix, writes which flush cache go through the filesystem.
In Linux they do not have to. A proposed extension to the new VM just
introduced into the kernel could help here - a flush call which the VM
makes to the filesystem to tell it to push data out to disk. We really
need clustering of writes for this to be effective: the callback should
be for as large a chunk of data as possible, otherwise lots of transactions
will get executed.
<p>
This includes zeroing allocated unwritten disk space.
<p>
Owners:
Glen Overby (<i><a href="mailto:overby@sgi.com">overby@sgi.com</a></i>),
Russell Cattelan (<i><a href="mailto:cattelan@sgi.com">cattelan@sgi.com</a></i>),
Rajagopal Ananthanarayanan (<i><a href="mailto:ananth@engr.sgi.com">ananth@engr.sgi.com</a></i>)
<p>
</LI>
<b>
<LI>
<A NAME="_2-tbyte">
Address the 2 Tbyte device limitation on IA32 Linux</b>
</A>
<p>
Priority: P3
<p>
Inode numbers are 32 bits and block devices have a 2 Tbyte size limitation.
<p>
Extending the Linux inode number to greater than 32 bits is probably not
an option. Changing inode allocation on xfs to restrict inode clusters to
the lower 2 Tbytes of AGs would fix it for new filesystems on Linux, but
not for moving large filesystems from Irix.
<p>
If NFS gets changed to use opaque file handles then the inode number will
only be visible outside the filesystem in calls such as stat and getdents.
In this case we could use larger inode numbers internally and be left with
the issue of getting them out to user space correctly.  NFS opaque file handle
patches were done by Neil Brown.
<p>
On device addressing there may be some options in the kiobuf changes 
currently being worked on.
<p>
Block addresses are 4 bytes (32 bits) wide and count 512 byte blocks, which
is where the 2 Tbyte limit comes from.  dksc could be a way around the
2 Tbyte device limitation.  Stephen Tweedie may also have some ideas on this.
<p>
This is a 32 bit system problem since this field is 64 bits on 64 bit systems.
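<p>
The arithmetic behind the limit can be checked directly. The sketch below
assumes, as an interpretation of the note above, that a device is addressed
by an unsigned block number counting 512 byte blocks; max_device_bytes is a
hypothetical helper, not a kernel function.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_SECTOR_SIZE 512u   /* bytes per addressable block */

/*
 * Largest device size, in bytes, reachable with an unsigned block
 * address of `addr_bits` bits: 2^addr_bits blocks of 512 bytes each.
 */
static uint64_t max_device_bytes(unsigned addr_bits)
{
    /* 2^addr_bits * 512 = 2^(addr_bits + 9) must fit in 64 bits. */
    assert(addr_bits <= 54);
    return ((uint64_t)1 << addr_bits) * SKETCH_SECTOR_SIZE;
}
```
<p>
For a 32 bit address this gives 2^32 * 512 = 2^41 bytes, that is 2 Tbytes.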
<p>
Owner:
Martin Petersen (<i><a href="mailto:mkp@linuxcare.com">mkp@linuxcare.com</a></i>)
<p>
</LI>
<b>
<LI>
<A NAME="_quotas">
Quotas</b>
</A>
<p>
Classification: A, B, C
<br>
Priority: P3
<p>
There was talk of which quota implementation we should use, the one in
XFS, or the Linux implementation. I think we have to use the XFS on disk
format and the quota code within XFS as you really want to update quota
files transactionally with the filesystem modification which caused
them. I am not sure if it is possible to do this, but being able to use
the existing Linux quota utilities, or a modified version of them,
would be good.
<p>
14.1 User quotas
<p>
14.2 Group (&amp; project?) quotas
<p>
Owner:
Nathan Scott (<i><a href="mailto:nathans@engr.sgi.com">nathans@engr.sgi.com</a></i>)
<p>
</LI>
<b>
<LI>
<A NAME="_shutdowndisk">
Shutdown on disk error</b>
</A>
<p>
Classification: C
<br>
Priority: P3
<p>
XFS has support for shutting down a filesystem when it detects corruption
or I/O failures. This does not work on Linux right now - we need to get to
the point where we can unmount the filesystem without any disk I/O.
There are error injection calls in XFS which can be turned on to simulate
I/O errors, and options on fsstress to exercise them; this probably went
through syssgi, though.
<p>
Owner:
Mark Nordstrand (<i><a href="mailto:mann@sgi.com">mann@sgi.com</a></i>)
<p>
</LI>
<b>
<LI>
<A NAME="_compiler-ind">
Compiler independence</b>
</A>
<p>
Classification: C
<br>
Priority: P4
<p>
We are currently stuck at a specific compiler level. There are parts of
XFS which do not compile with later compilers, and parts which generate
bad code. There are also other parts of the kernel which do not work
correctly with later compilers, but this should not stop us from fixing
our code.
<p>
This is a stretch goal and is very low priority.
<p>
Owner:
Russell Cattelan (<i><a href="mailto:cattelan@sgi.com">cattelan@sgi.com</a></i>)
</LI>
<p>
<b>
<LI>
<A NAME="_grio">
GRIO</b>
</A>
<p>
Priority: P4
<p>
Needs some extensions to the Linux device interface in terms of
request priorities.
<p>
Needs to be looked at more.  This should probably be a low priority feature.

</LI>
<p>
<b>
<LI>
<A NAME="_dmapi">
DMAPI</b>
</A>
<p>
Priority: P4
<p>
The internal DMAPI port for xfs and the external open XDSM project need to sync
up somehow.  XDSM may be in ReiserFS.  This will be a merge effort.  Don't
support this merge until the 2.5 kernel.
<p>
Owner: Dean Roehrich (<i><a href="mailto:roehrich@sgi.com">roehrich@sgi.com</a></i>)
<p>
</LI>
<b>
<LI>
<A NAME="_spinlockbug">
Spinlock bug on some Linux platforms (sparc for example)</b>
</A>
<p>
Classification:	C
<br>
Priority: P4
<p>
The way we pass around the saved interrupt state between function calls
will not work on some Linux platforms.
<p>
May be fixed - appears to be working.
<p>
Owner:
Martin Petersen (<i><a href="mailto:mkp@linuxcare.com">mkp@linuxcare.com</a></i>)
<p>
</LI>
<b>
<LI>
<A NAME="_native-lock">
Use Linux native locking primitives</b>
</A>
<p>
Classification: C
<br>
Priority: P4
<p>
We implemented our own version of mrlocks; they are heavyweight when
compared to the Linux equivalents. The Linux equivalents are missing some
functionality, but code has been written for this. One downside is that
CXFS uses even more locking variants.
<p>
Ben LaHaise (bcrl@redhat) may have code for the additional lock functionality.
Using Linux locks might be important to get accepted into Linus's tree.
<p>
Owner:
Daniel Moore (<i><a href="mailto:dxm@melbourne.sgi.com">dxm@melbourne.sgi.com</a></i>)
<p>
</LI>
<b>
<LI>
<A NAME="_out-of-mem">
Better out of memory handling</b>
</A>
<p>
Priority: P4
<p>
We can still die due to this. Memory allocations in pagebuf are more
of a problem than those in XFS. In some failure cases XFS will attempt to
flush other memory users; pagebuf does not.
<p>
This may not be a problem.  Haven't seen anything in this area in a while.
May come back with file systems with block size &gt; page size.
<p>
</LI>
<b>
<LI>
<A NAME="_XFS-dist">
Include XFS in standard Linux distributions</b>
</A>
<p>
Priority: P4
<p>
Should shoot for having XFS in standard Linux distributions.
<p>
Owners:
Martin Petersen (<i><a href="mailto:mkp@linuxcare.com">mkp@linuxcare.com</a></i>),
Rajagopal Ananthanarayanan (<i><a href="mailto:ananth@engr.sgi.com">ananth@engr.sgi.com</a></i>)
<p>
</LI>
<b>
<LI>
<A NAME="_XFS-Linus">
Get XFS into Linus's kernel</b>
</A>
<p>
Priority: P4
<p>
Should shoot for having XFS in a future kernel distribution.
<p>
Owners:
Martin Petersen (<i><a href="mailto:mkp@linuxcare.com">mkp@linuxcare.com</a></i>),
Rajagopal Ananthanarayanan (<i><a href="mailto:ananth@engr.sgi.com">ananth@engr.sgi.com</a></i>)

</LI>
</OL>
<!-- End Project Content -->



<& xfsTemplate,bottom=>1 &>