On Fri, Aug 08, 2008 at 10:52:50AM +0530, Bhagi rathi wrote:
> On Fri, Aug 8, 2008 at 6:01 AM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> > On Thu, Aug 07, 2008 at 10:53:55PM +0530, Bhagi rathi wrote:
> > > On Thu, Aug 7, 2008 at 1:52 AM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> > >
> > > > On Wed, Aug 06, 2008 at 02:55:28PM -0500, Eric Sandeen wrote:
> > > > > Bhagi rathi wrote:
> > > > > > > Why are we going to block forever? Mounting a file-system
> > > > > > requires in-core log space buffers, reading of other buffers
> > > > > > which needs allocation of memory greater than per ag
> > > > > > structures.
> > > > .....
> > > > > In general KM_MAYFAIL sounds like a good plan when you can handle the
> > > > > failure gracefully, I think.
> > > >
> > > > Yes, and that is the long term plan - to remove all KM_SLEEP
> > > > allocations from XFS and allow them to fail gracefully. There's
> > > > lots of work needed before we get there, though. e.g.
> > > > right now we cannot survive an ENOMEM error in a transaction....
> > >
> > >
> > > I am not sure that we are solving the right problem. Isn't the above a
> > > fall-out of XFS needing memory to clean dirty memory?
> > We can't avoid that. It is inherent in the design of XFS. And the
> > amount of memory is not easily bounded so existing solutions like
> > wrapping slabs in mempools don't work, either.
> Interesting. The only dirty items are xfs inodes, quotas and
> meta-data buffers. We can fix the number of these items and
> ensure pushing of data one after the other in the case of a crunch.
No, we can't.
> What are the objects of dirty data where the memory is not bounded?
e.g. btree (bmbt and alloc bno+cnt) buffers may need to be read from
disk during delayed allocation. Seeing as there is no guarantee that
the memory the buffers use will be released after the I/O is
completed, we don't provide the guarantee mempools require (i.e.
allocations always being freed a short while later so the system
always makes progress). Yes, we can say it is bounded for a single
allocation, but cleaning an inode may require an unbounded number of
allocations (think single block fragmentations)....
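
To make the mempool point concrete, here is a minimal userspace sketch. It
is a toy model, not the kernel mempool API and not XFS code: toy_pool,
toy_pool_get, toy_pool_put and toy_run are hypothetical names made up for
illustration. It only shows why a fixed reserve keeps making progress when
every element is handed back shortly after use, and why it runs dry when
elements stay pinned (as a btree buffer may when it remains cached after
the I/O completes).

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of a fixed reserve pool; NOT the kernel mempool API. */
struct toy_pool {
    size_t free_elems;  /* elements currently available in the reserve */
};

/* Take one element; returns 0 on success, -1 if the reserve is empty. */
static int toy_pool_get(struct toy_pool *p)
{
    if (p->free_elems == 0)
        return -1;
    p->free_elems--;
    return 0;
}

/* Hand one element back to the reserve. */
static void toy_pool_put(struct toy_pool *p)
{
    p->free_elems++;
}

/*
 * Issue `requests` allocations against a reserve of `reserve` elements.
 * If release_after_use is nonzero, each element is returned as soon as
 * its "I/O" completes, which is the contract a mempool relies on.
 * Otherwise the element stays pinned and is never returned.
 * Returns the number of requests that were satisfied.
 */
size_t toy_run(size_t reserve, size_t requests, int release_after_use)
{
    struct toy_pool pool = { .free_elems = reserve };
    size_t done = 0;

    for (size_t i = 0; i < requests; i++) {
        if (toy_pool_get(&pool) != 0)
            break;  /* a sleeping allocator would block here instead */
        done++;
        if (release_after_use)
            toy_pool_put(&pool);
    }

    /* The reserve never grows beyond its initial size. */
    assert(pool.free_elems <= reserve);
    return done;
}
```

With a reserve of 4 elements, toy_run(4, 1000, 1) satisfies all 1000
requests, while toy_run(4, 1000, 0) stops after 4. In the second case a
blocking allocator would wait for a free that is not guaranteed to come.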
And then when you consider write back of thousands of dirty inodes
at a time that all require multiple allocations, the memory required
by the delalloc process for inodes can effectively be considered