xfs
[Top] [All Lists]

Re: panic - Attempting to free lock with active block list

To: linux-kernel@xxxxxxxxxxxxxxx, linux-xfs@xxxxxxxxxxx
Subject: Re: panic - Attempting to free lock with active block list
From: Anders Saaby <as@xxxxxxxxxxxx>
Date: Tue, 11 Jan 2005 10:59:20 +0100
Delivered-to: news2mail@xxxxxxxxxxxxxxxxx
Followup-to: cohaesio.lists.linux.kernel
Organization: Cohaesio A/S
References: <20050105195736.GA26989@xxxxxxxxx> <20050105123207.J469@xxxxxxxxxxxxxxxxxx> <1104962043.5717.28.camel@xxxxxxxxxxxxxxxxxx> <20050106151726.GB24796@xxxxxxxxx>
Reply-to: as@xxxxxxxxxxxx
Sender: linux-xfs-bounce@xxxxxxxxxxx
Hi Myklebust(s) :)

I have seen the exact same error on one of my webservers which is serving
from an NFS export and under heavy load. ~2 hours uptime before panic'ing.
I then tried Trond's patch which seems to work. 14 hours of uptime now. :)

Anyways, I have a couple of issues you might be able to clear up for me:

First issue:
New strange message in the kernel log:

"nlmclnt_lock: VFS is out of sync with lock manager!"

- What does this mean? - Is it bad?, What can i do?


Second issue:
my fs/nfs/file.c doesn't look like yours (Vanilla 2.6.10):

<fs/nfs/file.c SNIP>
        status = NFS_PROTO(inode)->lock(filp, cmd, fl);
        /* If we were signalled we still need to ensure that
         * we clean up any state on the server. We therefore
         * record the lock call as having succeeded in order to
         * ensure that locks_remove_posix() cleans it out when
         * the process exits.
         */
        if (status == -EINTR || status == -ERESTARTSYS)
                posix_lock_file_wait(filp, fl);
        unlock_kernel();
        if (status < 0)
                return status;
        /*
         * Make sure we clear the cache whenever we try to get the lock.
         * This makes locking act as a cache coherency point.
         */
        filemap_fdatawrite(filp->f_mapping);
        down(&inode->i_sem);
        nfs_wb_all(inode);      /* we may have slept */
        up(&inode->i_sem);
        filemap_fdatawait(filp->f_mapping);
        nfs_zap_caches(inode);
        return 0;
</SNIP>

So... Am I missing another patch or something else?

Jan-Frode Myklebust wrote:

> On Wed, Jan 05, 2005 at 10:54:03PM +0100, Trond Myklebust wrote:
>> 
>> Looking at the NFS code, I can attempt a wild guess about what may be
>> happening: there may be a race when pressing ^C in the middle of a
>> blocking NFS lock RPC call, and if so, the following patch will fix it.
> 
> 
> A whopping 9 hours of uptime now :) So the one-liner patch seems to have
> fixed it.
> 
> Thanks!
> 
>> -   posix_lock_file(filp, fl);
>> +   posix_lock_file_wait(filp, fl);
> 
> 
>   -jf

-- 
Med venlig hilsen - Best regards - Meilleures salutations

Anders Saaby
Systems Engineer
------------------------------------------------
Cohaesio A/S - Maglebjergvej 5D - DK-2800 Lyngby
Phone: +45 45 880 888 - Fax: +45 45 880 777
Mail: as@xxxxxxxxxxxx - http://www.cohaesio.com
------------------------------------------------


<Prev in Thread] Current Thread [Next in Thread>