On Fri, 2003-08-01 at 10:38, Dale J. Stephenson wrote:
> I was looking at a misbehaving xfs_freeze command in an older version
> of XFS and found myself stuck in the list_for_each_safe() loop in
> pagebuf_delwri_flush. The head had apparently been removed from the
> list in a case of spectacularly poor timing. I found that the list had
> been protected by the pbd_delwrite_lock, but not everywhere.
>
> Checking the current CVS, I still see that the list is still not
> protected everywhere. Examples:
>
> pagebuf_daemon, line 2083, calls list_del_init(&pb->pb_list) without
> holding the lock.
This is a temporary local queue, we moved the buffers off the global
list prior to processing this list. The whole point of the list
is so we can do sleeping things without going near the global
list and its spinlock.
> pagebuf_delwri_flush, line 2182, calls list_del_init(&pb->pb_list)
> without holding the lock.
Also a temporary list.
> pagebuf_delwri_flush, lines 2151-2166, releases the lock inside a
> list_for_each_safe loop.
>
This one is a little fishy, I think it needs to be doing the same
logic as the daemon thread and pulling everything onto the temp
list before it processes them.
> I don't see any downside to putting locks around the list_del_init
> calls. I'm not sure what to do with the last one. If it were changed
> to list_for_each, would it survive the head getting taken away? I'm
> assuming the lock cannot be held through __pagebuf_iorequest() or
> pagebuf_run_queues() calls.
It cannot - it is a spinlock, and we cannot sleep with it.
Thanks
Steve
--
Steve Lord voice: +1-651-683-3511
Principal Engineer, Filesystem Software email: lord@xxxxxxx
|