Re: deadlock on vmap_area_lock

To: Shawn Bohrer <sbohrer@xxxxxxxxxxxxxxx>
Subject: Re: deadlock on vmap_area_lock
From: David Rientjes <rientjes@xxxxxxxxxx>
Date: Wed, 1 May 2013 08:57:38 -0700 (PDT)
Cc: xfs@xxxxxxxxxxx, linux-mm@xxxxxxxxx, linux-kernel@xxxxxxxxxxxxxxx
In-reply-to: <20130501144341.GA2404@xxxxxxxxxxxxxxxxxxxxxxxxx>
References: <20130501144341.GA2404@xxxxxxxxxxxxxxxxxxxxxxxxx>
User-agent: Alpine 2.02 (DEB 1266 2009-07-14)
On Wed, 1 May 2013, Shawn Bohrer wrote:

> I've got two compute clusters with around 350 machines each which are
> running kernels based off of 3.1.9 (Yes, I realize this is ancient by
> today's standards).  All of the machines run a 'find' command once an
> hour on one of the mounted XFS filesystems.  Occasionally these find
> commands get stuck requiring a reboot of the system.  I took a peek
> today and see this with perf:
> 
>     72.22%          find  [kernel.kallsyms]          [k] _raw_spin_lock
>                     |
>                     --- _raw_spin_lock
>                        |          
>                        |--98.84%-- vm_map_ram
>                        |          _xfs_buf_map_pages
>                        |          xfs_buf_get
>                        |          xfs_buf_read
>                        |          xfs_trans_read_buf
>                        |          xfs_da_do_buf
>                        |          xfs_da_read_buf
>                        |          xfs_dir2_block_getdents
>                        |          xfs_readdir
>                        |          xfs_file_readdir
>                        |          vfs_readdir
>                        |          sys_getdents
>                        |          system_call_fastpath
>                        |          __getdents64
>                        |          
>                        |--1.12%-- _xfs_buf_map_pages
>                        |          xfs_buf_get
>                        |          xfs_buf_read
>                        |          xfs_trans_read_buf
>                        |          xfs_da_do_buf
>                        |          xfs_da_read_buf
>                        |          xfs_dir2_block_getdents
>                        |          xfs_readdir
>                        |          xfs_file_readdir
>                        |          vfs_readdir
>                        |          sys_getdents
>                        |          system_call_fastpath
>                        |          __getdents64
>                         --0.04%-- [...]
> 
> Looking at the code my best guess is that we are spinning on
> vmap_area_lock, but I could be wrong.  This is the only process
> spinning on the machine so I'm assuming either another process has
> blocked while holding the lock, or perhaps this find process has tried
> to acquire the vmap_area_lock twice?
> 

Significant spinlock contention doesn't necessarily mean that there's a 
deadlock, but it doesn't rule one out either.  Depending on your 
definition of "occasionally", would it be possible to run with 
CONFIG_PROVE_LOCKING and CONFIG_LOCKDEP enabled to see if it uncovers any 
real deadlock potential?
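
Before rebuilding, it may be worth checking whether the kernel you are 
running already has those two options set.  The following is only a rough 
sketch, not a definitive tool: the function names are made up for 
illustration, and it assumes the kernel config is exposed at either 
/proc/config.gz or /boot/config-<release>, which depends on how your 
kernels are built and packaged.

```python
import gzip
import os
import platform

# The two options named in the reply above.
LOCK_DEBUG_OPTIONS = ("CONFIG_PROVE_LOCKING", "CONFIG_LOCKDEP")


def parse_lock_debug_options(config_text, options=LOCK_DEBUG_OPTIONS):
    """Return {option: bool} for kernel .config-style text (hypothetical helper)."""
    enabled = {name: False for name in options}
    for line in config_text.splitlines():
        line = line.strip()
        # Disabled options show up as "# CONFIG_FOO is not set" comments.
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if sep and name in enabled:
            enabled[name] = (value == "y")
    return enabled


def read_running_kernel_config():
    """Best-effort read of the running kernel's config; None if not found.

    Assumption: the config lives in one of the two locations below.
    """
    candidates = ["/proc/config.gz", "/boot/config-" + platform.release()]
    for path in candidates:
        if not os.path.exists(path):
            continue
        opener = gzip.open if path.endswith(".gz") else open
        with opener(path, "rt") as handle:
            return handle.read()
    return None


if __name__ == "__main__":
    text = read_running_kernel_config()
    if text is None:
        print("kernel config not found in the assumed locations")
    else:
        for name, is_on in parse_lock_debug_options(text).items():
            print(name, "enabled" if is_on else "not enabled")
```

If either option is reported as not enabled, the kernel would need to be 
rebuilt with it turned on before lockdep can report anything.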
