BUG: workqueue leaked lock or atomic

To: xfs@xxxxxxxxxxx
Subject: BUG: workqueue leaked lock or atomic
From: Alex Elder <elder@xxxxxxxxxxx>
Date: Tue, 18 Dec 2012 08:25:06 -0600
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/17.0 Thunderbird/17.0
I was running xfstests on a 3.6-derived kernel and injecting
some errors.  At some point a few of these surfaced as I/O
errors, which the generic buffer code complained about.
That's all fine (well, I think).  An example:

  Buffer I/O error on device rbd2, logical block 3072
  Buffer I/O error on device rbd2, logical block 3073
  ...

However, after a string of these, I got this:

  BUG: workqueue leaked lock or atomic: kworker/0:1/0x00000000/17554
      last function: xfs_end_io+0x0/0x110 [xfs]

I haven't looked very hard at this yet because I wanted to
see if anyone had some quick info that would avoid me going
off in the wrong direction.

The I/O error messages are generated in two spots (sadly,
identical error messages):

    end_buffer_write_sync()
    end_buffer_async_write()

The workqueue leaked message comes from process_one_work(), so
xfs_end_io() is being called from the ioend workqueue (not from
xfs_finish_ioend_sync()).
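
As a rough aid for reading that BUG line: as far as I can tell,
process_one_work() prints it as comm/preempt_count/pid when, after
the work function returns, either in_atomic() is true or lockdep
still sees a held lock.  The userspace sketch below only splits
the three fields apart.  The struct and function names are
hypothetical (not kernel APIs), and treating a zero preempt count
as "not atomic" is a simplifying assumption.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Fields of the BUG line: <comm>/<preempt_count>/<pid> (hypothetical type). */
struct wq_leak {
    char comm[32];
    unsigned long preempt_count;
    long pid;
};

/*
 * Parse the text after "BUG: workqueue leaked lock or atomic: ",
 * e.g. "kworker/0:1/0x00000000/17554".  The comm can itself contain
 * '/', so the string is split from the right.
 * Returns 0 on success, -1 on malformed input.
 */
int wq_leak_parse(const char *s, struct wq_leak *out)
{
    const char *pid_sep = strrchr(s, '/');
    const char *cnt_sep;
    size_t comm_len;
    char *end;

    if (!pid_sep || pid_sep == s)
        return -1;

    /* Walk back to the '/' that precedes the preempt count. */
    cnt_sep = pid_sep - 1;
    while (cnt_sep > s && *cnt_sep != '/')
        cnt_sep--;
    if (*cnt_sep != '/' || cnt_sep == s)
        return -1;

    comm_len = (size_t)(cnt_sep - s);
    if (comm_len >= sizeof(out->comm))
        return -1;
    memcpy(out->comm, s, comm_len);
    out->comm[comm_len] = '\0';

    /* Base 16 lets strtoul() accept the leading "0x". */
    out->preempt_count = strtoul(cnt_sep + 1, &end, 16);
    if (end != pid_sep || end == cnt_sep + 1)
        return -1;

    out->pid = strtol(pid_sep + 1, &end, 10);
    if (end == pid_sep + 1 || *end != '\0')
        return -1;
    return 0;
}

/*
 * Assumption: with a zero preempt count the "atomic" half of the
 * check did not fire, which leaves a lockdep-tracked held lock.
 */
int wq_leak_implies_held_lock(const struct wq_leak *l)
{
    return l->preempt_count == 0;
}
```

Fed the line above, this gives comm "kworker/0:1", a preempt count
of 0 and pid 17554.  If the assumption holds, that points at a lock
still held when xfs_end_io() returned rather than at a leaked
preempt or atomic count.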

So...  I want to report this in case it's not been seen before.
But I'm also trying to figure out whether the problem is likely
to lie in XFS, the generic buffer code, or in the underlying
block device code.  The last is (of course) my assumption...
Any useful insights or suggestions on how to proceed?

Thanks.

                                        -Alex
