To: Steve Lord <lord@xxxxxxx>
Subject: Re: Timing critical portions of XFS at startup?
From: "Ian S. Nelson" <ian.nelson@xxxxxxxxxxxx>
Date: Fri, 30 Nov 2001 13:36:43 -0700
Cc: Eric Sandeen <sandeen@xxxxxxx>, "linux-xfs@xxxxxxxxxxx" <linux-xfs@xxxxxxxxxxx>
Organization: Echostar
References: <3C07BDD9.B71E5600@xxxxxxxxxxxx> <1007140779.16790.11.camel@xxxxxxxxxxxxxxxxxxxxxx> <3C07C251.273312EB@xxxxxxxxxxxx> <3C07D2D5.65A78599@xxxxxxxxxxxx> <1007145351.4099.13.camel@xxxxxxxxxxxxxxxxxxxx> <3C07D542.E13EB0EF@xxxxxxxxxxxx> <1007148290.4099.16.camel@xxxxxxxxxxxxxxxxxxxx>
Reply-to: ian.nelson@xxxxxxxxxxxx
Sender: owner-linux-xfs@xxxxxxxxxxx
If we're out of memory it would fail in xfs_trans_alloc(); it does an
allocation for tp and then immediately starts to populate it, so a failed
allocation would fault right there.

This is looking more like a stack smash or some other kind of corruption.
I'm trying to get some more debugging in to gather more information.  I'm
thinking it's USB related; it spins off some threads and I've seen them
crash before.


Steve Lord wrote:

> On Fri, 2001-11-30 at 12:51, Ian S. Nelson wrote:
> > That's what I gather.  I disassembled the kernel and it's blowing up when
> > it dereferences *tp in xfs_trans_count_vecs, on line 4 of the function.
> > I have 32MB of RAM in the box.  Is this a kernel memory allocation
> > failure?  There aren't really any user mode apps running.
> OK, I think I want to withdraw the memory allocation failure diagnosis,
> xfs_trans_count_vecs does not look at anything which has not been
> referenced a lot prior to this. You state that it dies the first
> time it dereferences tp, which would actually be in this code:
>
> STATIC uint
> xfs_trans_count_vecs(
>         xfs_trans_t     *tp)
> {
>         int                     nvecs;
>         xfs_log_item_desc_t     *lidp;
>
>         nvecs = 1;
>         lidp = xfs_trans_first_item(tp);
>                ^^^^^^^^^^^^^^^^^^^^^^^
>         ASSERT(lidp != NULL);
>
> This is starting to sound a lot more like a memory corruption than an
> allocation failure. 32M should be ample unless you are doing a lot of
> I/O in parallel.
> Can you decode the oops output - I doubt it will tell us much, but it
> may help.
> Steve
