On Wed, 2002-06-26 at 12:36, Ian D. Hardy wrote:
Sorry, you dropped through the cracks there, and I am currently
sitting in the back of a talk at the Ottawa Linux Symposium, so
my coding time is a little limited this week. Next week there
will also be no one in the office (except the Australian
contingent).
It seems you have two issues: the file fragmentation itself, and
the fact that fsr appears to misbehave on a live system. Yes, I
agree that running fsr during down time is the best solution
available right now. I do not know whether your system has any
genuinely idle periods, but fsr has an option to run for a fixed
amount of time instead of running to completion, so if you have
known times when activity is low you could run it then.
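If you go that route, something like this from cron might do it
(untested and from memory of the options, so check the xfs_fsr
man page first; the schedule and paths are just examples):

    # 3am Sunday: defragment for at most one hour
    # -t  seconds to run before stopping
    # -f  file where fsr records where it left off, so the
    #     next run resumes from that point
    0 3 * * 0  /usr/sbin/xfs_fsr -t 3600 -f /var/tmp/fsr.leftoff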
The fundamental issue is the amount of contiguous memory one of
these fragmented files needs to hold its extents, and the ideal
solution is to change how that memory is organized. I have
tinkered with the idea, but it is a non-trivial project and I do
not know when I might get to it.
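Just to give you a feel for the numbers (rough arithmetic, and
the 16 bytes per incore extent record - a pair of 64-bit words -
is from memory):

    # a file with 3000 extents, like the ones you reported,
    # needs on the order of this many bytes in one contiguous
    # kmem_realloc()ed buffer:
    echo $((3000 * 16))    # 48000 bytes

and that buffer gets reallocated as the file grows and shrinks,
which is where a loaded system starts to hurt.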
So I don't really have a code solution for you right now. We
need to look into what is happening to fsr under NFS load; there
should be something we can do to fix that faster than reworking
the extent allocation code.
Steve
> Steve ++ Colleagues,
>
> Sorry to bother you (I understand that you're busy and
> short-staffed), but it would be useful to get some feedback on
> the problems/issues I raised a couple of weeks ago (I did note
> that you mentioned continuing problems due to fragmentation in
> another thread a few days ago). Do you have any idea if/when it
> should be possible to fix this problem? I feel bad asking, but
> I'm getting pressure to look again at alternatives ...... which
> I'd rather not do, as I'm sure they have their own problems!
>
> FYI: in the last ~20 days we've had another panic that looked
> like another memory alloc error (I was on leave, so didn't get
> the full details), plus a couple of system lockups (high load
> average and failure to fileserve), possibly not related. We
> reduced the load by introducing another server/filesystem
> (reiserfs !!) and moving some users onto that. Today we had
> some scheduled maintenance time and did an offline defrag of
> the XFS filesystem, bringing fragmentation down from ~28% to <1%.
>
> Is there anything that I can do to help (remember I'm not a
> kernel writer/expert)? Are there any further diagnostics that
> would be useful?
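>
> If it would help, I could sweep the filesystem for the worst
> offenders with something like this (rough and ready - the mount
> point and threshold are placeholders, and the count includes
> holes, so treat the numbers as approximate):
>
>     find /export -xdev -type f 2>/dev/null | while read f; do
>         n=`xfs_bmap "$f" | grep -c '^[[:space:]]*[0-9]'`
>         [ "$n" -gt 100 ] && echo "$n $f"
>     done | sort -rn | head -20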
>
> Again many thanks for your help.
>
> Ian Hardy
>
> "Ian D. Hardy" wrote:
> >
> > Steve, ++++
> >
> > Some bad news: you may remember a couple of months ago I was
> > having problems with an NFS server that kept panicking. You
> > diagnosed a number of potential problems and patches, and
> > these seemed to help (many thanks!) - indeed the server then
> > stayed up for ~45 days - but recently these failures started
> > to recur (again with crashes every ~4-6 days).
> >
> > I then remembered that one of your suggestions/fixes concerned
> > potential problems with high levels of filesystem fragmentation
> > (at the time I ran 'xfs_fsr' to defrag the filesystem, and you
> > also introduced a change to the CVS tree intended to help
> > prevent these crashes). Anyway, I then noticed that by the
> > time my crashes started to recur, the filesystem fragmentation
> > had increased to ~60% (!!!) as reported via 'xfs_db' (some
> > files having >3000 extents). Is it possible that there is
> > still an issue in the XFS kernel (I'm currently using a kernel
> > compiled from the CVS tree as of 17th May)? Oops/ksymoops
> > output below.
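> >
> > (For reference, that figure came from something like the
> > following - /dev/md0 being my guess at the right node for the
> > md(9,0) device in the logs:
> >
> >     # read-only, so safe to run on the mounted filesystem
> >     xfs_db -r -c frag /dev/md0
> >
> > which reports actual extents, ideal extents, and a
> > fragmentation factor.)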
> >
> > I've run xfs_fsr to defrag the filesystem (getting it back
> > down to <1%); however, I've (again) had a couple of instances
> > in which, shortly after running 'xfs_fsr', the filesystem has
> > gone offline. The system log shows:
> >
> > Jun 6 07:53:18 blue00 kernel: xfs_force_shutdown(md(9,0),0x8)
> > called from line 1039 of file xfs_trans.c.  Return address =
> > 0xc01e1fb9
> > Jun 6 07:53:18 blue00 kernel: Corruption of in-memory data
> > detected.  Shutting down filesystem: md(9,0)
> >
> > This was on an active filesystem. Is it possible that there is
> > an interaction between xfs_fsr and NFS? (I'd guess that xfs_fsr
> > may have a harder time detecting that an NFS client is
> > accessing a file than it does for local file access?)
> >
> > Our current intention, now we've defragged the filesystem, is
> > to see how it goes for a couple of weeks; if it seems better,
> > we will then schedule regular (every 2-4 weeks) maintenance
> > periods to run 'xfs_fsr' on the filesystem without other
> > filesystem activity. Does this seem sensible to you?
> >
> > One question: if there is a problem associated with high
> > levels of fragmentation, is it overall high fragmentation
> > within the filesystem that causes the failure, or could it
> > potentially be a single file with a large number of extents?
> > (I believe that we are getting some highly fragmented files,
> > as we have a number of users/applications that frequently
> > open, append to, and then close a log file; these files soon
> > get fragmented, since the close releases any preallocated
> > blocks.)
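> >
> > (A contrived reproduction of that pattern - paths made up -
> > would be something like:
> >
> >     # each append-and-close cycle releases the file's
> >     # preallocation, so the next extension lands wherever
> >     # is free at the time
> >     for i in `seq 1 1000`; do
> >         echo "log entry $i" >> /export/home/someuser/app.log
> >     done
> >     # roughly the extent count (includes header/hole lines)
> >     xfs_bmap /export/home/someuser/app.log | wc -l
> >
> > though I realise real traffic is much messier than this.)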
> >
> > For various reasons I've not managed to capture the Oops
> > output from most of the recent crashes, but here's one (passed
> > through ksymoops) that I did get. It may help to identify the
> > problem (it does look to me as though it is related to
> > filesystem extents/fragmentation, given 'xfs_iext_realloc' in
> > the trace, though I'm not a kernel code expert!).
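> >
> > For what it's worth, the decode below came from something like
> > this (against the running kernel's map; file names are mine):
> >
> >     ksymoops -m /boot/System.map < oops.raw > oops.decoded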
> >
> > Using defaults from ksymoops -t elf32-i386 -a i386
> >
> > invalid operand: 0000
> > CPU: 1
> > EIP: 0010:[<c012ff76>] Not tainted
> > EFLAGS: 00010086
> > eax: 0000001d ebx: 00e350c0 ecx: 0000002e edx: 00000026
> > esi: f8d4b000 edi: f8d43000 ebp: 00010000 esp: f3ba5a44
> > ds: 0018 es: 0018 ss: 0018
> > Process nfsd (pid: 615, stackpage=f3ba5000)
> > Stack: c02b4082 f8d43000 00008000 f8d4b000 c0e20000 00010000 00000286 f8d43000
> >        c01fd146 f8d43000 c01fd1a4 f8d43000 00010000 f4a5224c f8d43000 f8d4b000
> >        00008000 c0e18000 c01d04f1 f8d43000 00008000 00010000 00000001 ffffffff
> > Call Trace: [<c01fd146>] [<c01fd1a4>] [<c01d04f1>] [<c01a6996>] [<c01a5f12>]
> >             [<c01d6474>] [<c01fd2e0>] [<c01aac15>] [<c01cf963>] [<c01e6e23>]
> >             [<c01e6340>] [<c026cf44>] [<c01f696f>] [<c01e6340>] [<c014f45c>]
> >             [<f8d2e973>] [<f8d33f7b>] [<f8d3b4a0>] [<f8d2b5d3>] [<f8d3b4a0>]
> >             [<f8cf6f89>] [<f8d3b400>] [<f8d3aed8>] [<f8d2b349>] [<c01057eb>]
> > Code: 0f 0b 83 c4 08 8b 15 2c 95 3f c0 8b 2c 1a 89 7c 24 14 b8 00
> >
> > >>EIP; c012ff76 <kfree+66/14c> <=====
> > Trace; c01fd146 <kmem_free+22/28>
> > Trace; c01fd1a4 <kmem_realloc+58/68>
> > Trace; c01d04f0 <xfs_iext_realloc+f0/108>
> > Trace; c01a6996 <xfs_bmap_delete_exlist+6a/74>
> > Trace; c01a5f12 <xfs_bmap_del_extent+58a/f68>
> > Trace; c01d6474 <xlog_state_do_callback+2a4/2ec>
> > Trace; c01fd2e0 <kmem_zone_zalloc+44/d0>
> > Trace; c01aac14 <xfs_bunmapi+b78/fd0>
> > Trace; c01cf962 <xfs_itruncate_finish+23e/3e0>
> > Trace; c01e6e22 <xfs_setattr+ae2/f7c>
> > Trace; c01e6340 <xfs_setattr+0/f7c>
> > Trace; c026cf44 <qdisc_restart+14/178>
> > Trace; c01f696e <linvfs_setattr+152/17c>
> > Trace; c01e6340 <xfs_setattr+0/f7c>
> > Trace; c014f45c <notify_change+7c/2a4>
> > Trace; f8d2e972 <[nfsd]nfsd_setattr+3ea/524>
> > Trace; f8d33f7a <[nfsd]nfsd3_proc_setattr+b6/c4>
> > Trace; f8d3b4a0 <[nfsd]nfsd_procedures3+40/2c0>
> > Trace; f8d2b5d2 <[nfsd]nfsd_dispatch+d2/19a>
> > Trace; f8d3b4a0 <[nfsd]nfsd_procedures3+40/2c0>
> > Trace; f8cf6f88 <[sunrpc]svc_process+28c/51c>
> > Trace; f8d3b400 <[nfsd]nfsd_svcstats+0/40>
> > Trace; f8d3aed8 <[nfsd]nfsd_version3+0/10>
> > Trace; f8d2b348 <[nfsd]nfsd+1b8/370>
> > Trace; c01057ea <kernel_thread+22/30>
> > Code; c012ff76 <kfree+66/14c>
> > 00000000 <_EIP>:
> > Code; c012ff76 <kfree+66/14c> <=====
> > 0: 0f 0b ud2a <=====
> > Code; c012ff78 <kfree+68/14c>
> > 2: 83 c4 08 add $0x8,%esp
> > Code; c012ff7a <kfree+6a/14c>
> > 5: 8b 15 2c 95 3f c0 mov 0xc03f952c,%edx
> > Code; c012ff80 <kfree+70/14c>
> > b: 8b 2c 1a mov (%edx,%ebx,1),%ebp
> > Code; c012ff84 <kfree+74/14c>
> > e: 89 7c 24 14 mov %edi,0x14(%esp,1)
> > Code; c012ff88 <kfree+78/14c>
> > 12: b8 00 00 00 00 mov $0x0,%eax
> >
> >
> > Many thanks for your past (and continued) support. I'll be
> > away from the 9th to the 23rd of June; I'd therefore be
> > grateful if you could copy any replies to my colleague Oz
> > Parchment at 'O.G.Parchment@xxxxxxxxxxx'.
> >
> > Thanks again.
> >
> > Ian Hardy
> >
> > Steve Lord wrote:
> > >
> > > On Thu, 2002-04-11 at 11:17, Ian D. Hardy wrote:
> > > > Steve, +++
> > > >
> > > > Good news (though posting this is tempting fate a bit)!
> > > > Since installing the XFS/CVS containing this fix, my server
> > > > has now been up for 21+ days; this is a record (its average
> > > > was ~4 days, though it did once manage 14 days).
> > > >
> > > > I also applied the 'vnode.patch' that you posted in
> > > > response to my problems on 6th March; as far as I'm aware
> > > > that has not gone into the CVS tree? Is this still a valid
> > > > patch? My understanding is that I'd probably seen at least
> > > > these two bugs at various times.
> > >
> > > The vnode code changes should be in the tree as well.
> > >
> > > >
> > > > Many thanks for all your help.
> > >
> > > Thanks for persevering with xfs!
> > >
> > > Steve
> > >
> > > --
> > >
> > > Steve Lord voice: +1-651-683-3511
> > > Principal Engineer, Filesystem Software email: lord@xxxxxxx
> >
> > --
> >
> > /////////////Technical Coordination, Research Services////////////////////
> > Ian Hardy Tel: 023 80 593577
> > Computing Services
> > Southampton University email: idh@xxxxxxxxxxx
> > Southampton S017 1BJ, UK. i.d.hardy@xxxxxxxxxxx
> > \\'BUGS: The notion of errors is ill-defined' (IRIX man page for netstat)\
>
> --
>
> /////////////Technical Coordination, Research Services////////////////////
> Ian Hardy Tel: 023 80 593577
> Computing Services Mobile: 0709 2127503
> Southampton University email: idh@xxxxxxxxxxx
> Southampton S017 1BJ, UK. i.d.hardy@xxxxxxxxxxx
> \\'BUGS: The notion of errors is ill-defined' (IRIX man page for netstat)\