Dave Chinner wrote:
On Fri, Oct 24, 2008 at 01:08:55PM +1000, Lachlan McIlroy wrote:
Christoph Hellwig wrote:
On Thu, Oct 23, 2008 at 07:17:30PM +1000, Lachlan McIlroy wrote:
another problem with latest xfs
Is this with the 2.6.27-based ptools/cvs tree or with the 2.6.28 based
git tree? It does look more like a VM issue than an XFS issue to me.
It's with the 2.6.27-rc8 based ptools tree. Prior to checking
in these patches:
Can't lock inodes in radix tree preload region
stop using xfs_itobp in xfs_bulkstat
free partially initialized inodes using destroy_inode
I was able to stress a system for about 4 hours before it ran out
of memory. Now I hit the deadlock within a few minutes. I need
to roll back to find which patch changed the behaviour.
Ok, I think I've found the regression - it's introduced by the AIL
cursor modifications. The patch below has been running for 15
minutes now on my UML box that would have hung in a couple of
minutes otherwise.
Yep, looks good here too. My test system has been up at least an hour
and still chugging.
FYI, the way I found this was:
- put a breakpoint on xfs_create() once the fs hung
- `touch /mnt/xfs2/fred` to trigger the breakpoint.
- look at:
    - mp->m_ail->xa_target
    - mp->m_ail->xa_ail.next->li_lsn
    - mp->m_log->l_tail_lsn
  which indicated the push target was way ahead of the
  tail of the log, so AIL pushing was obviously not
  happening; otherwise we'd be making progress.
- added a breakpoint on xfsaild_push() and continued
- xfsaild_push() bp triggered, looked at *last_lsn
  and found it way behind the tail of the log (like
  3 cycles behind), which meant the lookup would return
  NULL instead of the first object and AIL pushing
  would abort. Confirmed with single stepping.
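
For anyone repeating the comparison above by hand, here is a rough sketch of the LSN arithmetic involved. It assumes the usual XFS encoding of an LSN as a 64-bit value with the cycle number in the upper 32 bits and the block number in the lower 32 bits. The helper names (lsn_t, make_lsn, lsn_cmp, cycles_behind, ail_debug_example) and all the values are made up for illustration; this is not the kernel code and not the fix itself.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for xfs_lsn_t: cycle in the high 32 bits,
 * block in the low 32 bits (assumed encoding, see lead-in). */
typedef int64_t lsn_t;

static uint32_t lsn_cycle(lsn_t lsn) { return (uint32_t)((uint64_t)lsn >> 32); }
static uint32_t lsn_block(lsn_t lsn) { return (uint32_t)lsn; }

static lsn_t make_lsn(uint32_t cycle, uint32_t block)
{
    return (lsn_t)(((uint64_t)cycle << 32) | block);
}

/* Compare two LSNs: cycle first, then block. Returns <0, 0 or >0. */
static int lsn_cmp(lsn_t a, lsn_t b)
{
    if (lsn_cycle(a) != lsn_cycle(b))
        return lsn_cycle(a) < lsn_cycle(b) ? -1 : 1;
    if (lsn_block(a) != lsn_block(b))
        return lsn_block(a) < lsn_block(b) ? -1 : 1;
    return 0;
}

/* How many whole log cycles 'lsn' lags behind 'tail' (0 if not behind). */
static uint32_t cycles_behind(lsn_t lsn, lsn_t tail)
{
    if (lsn_cmp(lsn, tail) >= 0)
        return 0;
    return lsn_cycle(tail) - lsn_cycle(lsn);
}

/* Walk through the two observations with invented numbers. */
void ail_debug_example(void)
{
    lsn_t tail     = make_lsn(10, 500);   /* plays the role of l_tail_lsn */
    lsn_t target   = make_lsn(10, 9000);  /* plays the role of xa_target  */
    lsn_t last_lsn = make_lsn(7, 100);    /* plays the role of *last_lsn  */

    /* Target ahead of the tail: the AIL should be getting pushed. */
    assert(lsn_cmp(target, tail) > 0);

    /* last_lsn is stale: three cycles behind the tail of the log. */
    assert(cycles_behind(last_lsn, tail) == 3);
}
```

Comparing the cycle before the block matters here: a stale LSN with a large block number still sorts behind anything in a later cycle.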
Cheers,
Dave.