
[PATCH][2.5.74] devfs lookup deadlock/stack corruption combined patch

To: Andrew Morton <akpm@xxxxxxxx>
Subject: [PATCH][2.5.74] devfs lookup deadlock/stack corruption combined patch
From: Andrey Borzenkov <arvidjaar@xxxxxxx>
Date: Mon, 7 Jul 2003 23:06:15 +0400
Cc: linux-kernel@xxxxxxxxxxxxxxx, devfs@xxxxxxxxxxx
In-reply-to: <20030706175405.518f680d.akpm@xxxxxxxx>
References: <E198K0q-000Am8-00.arvidjaar-mail-ru@xxxxxxxxxxx> <20030706120315.261732bb.akpm@xxxxxxxx> <20030706175405.518f680d.akpm@xxxxxxxx>
Sender: devfs-bounce@xxxxxxxxxxx
User-agent: KMail/1.5
On Monday 07 July 2003 04:54, you wrote:
> Actually, don't bother.  This idea can be made to work, but
> we already have enough tricky stuff in the wait/wakeup area.
>
> Let's run with your original patch.
>

I finally hit a painfully trivial way to reproduce another long-standing devfs
problem: a deadlock between devfs_lookup and devfs_d_revalidate_wait. When
devfs_lookup releases the directory i_sem, devfs_d_revalidate_wait grabs it
(this does not happen on every path) and goes to sleep waiting to be woken up.
Unfortunately, devfs_lookup then tries to re-acquire the directory i_sem
before it ever wakes the waiter up ...

To reproduce (2.5.74, UP or SMP kernel, it does not matter; single-CPU system):

ls /dev/foo & rm -f /dev/foo &

or possibly run it in a loop, but then it easily fills up the process table.
In my case it hangs 100% reliably, on 2.5 or 2.4.

The current fix is to move the re-acquisition of i_sem to after all
devfs_d_revalidate_wait waiters have been woken up. A much better fix would be
to ensure that ->d_revalidate is either always called under i_sem or always
without it. But that touches the very heart of the VFS and I do not dare to
change it.
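
For readers who want to see the ordering problem in isolation, here is a
minimal user-space sketch. It is not the devfs code and not the attached
patch: dir_sem, wq, revalidate_wait, wake_waiters and simulate_lookup are
hypothetical names, pthread mutexes stand in for i_sem and the kernel wait
queue, and pthread_mutex_trylock is used only so the old ordering can be
detected without actually hanging.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins: dir_sem models the directory i_sem,
 * wq_lock/wq model the wait queue the revalidate path sleeps on. */
static pthread_mutex_t dir_sem = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t wq_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t wq = PTHREAD_COND_INITIALIZER;
static bool waiter_has_sem; /* waiter owns dir_sem and is about to sleep */
static bool lookup_done;    /* lookup has woken its waiters */

/* Models a revalidate path that is entered with i_sem held. */
static void *revalidate_wait(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&dir_sem);
    pthread_mutex_lock(&wq_lock);
    waiter_has_sem = true;
    pthread_cond_broadcast(&wq);
    while (!lookup_done)                 /* sleep until lookup wakes us */
        pthread_cond_wait(&wq, &wq_lock);
    pthread_mutex_unlock(&wq_lock);
    pthread_mutex_unlock(&dir_sem);
    return NULL;
}

static void wake_waiters(void)
{
    pthread_mutex_lock(&wq_lock);
    lookup_done = true;
    pthread_cond_broadcast(&wq);
    pthread_mutex_unlock(&wq_lock);
}

/* Returns true if the lookup side can re-take dir_sem,
 * false if a blocking lock at that point would have deadlocked. */
bool simulate_lookup(bool wake_before_relock)
{
    pthread_t waiter;
    bool ok;
    int rc;

    waiter_has_sem = false;
    lookup_done = false;

    pthread_mutex_lock(&dir_sem);        /* lookup is entered under i_sem */
    rc = pthread_create(&waiter, NULL, revalidate_wait, NULL);
    assert(rc == 0);
    pthread_mutex_unlock(&dir_sem);      /* lookup drops i_sem */

    pthread_mutex_lock(&wq_lock);        /* let the waiter grab dir_sem */
    while (!waiter_has_sem)
        pthread_cond_wait(&wq, &wq_lock);
    pthread_mutex_unlock(&wq_lock);

    if (wake_before_relock) {
        wake_waiters();                  /* fixed order: wake first ... */
        pthread_mutex_lock(&dir_sem);    /* ... then re-acquire */
        ok = true;
    } else {
        /* Old order: the waiter still holds dir_sem and is asleep. */
        ok = (pthread_mutex_trylock(&dir_sem) == 0);
        wake_waiters();                  /* only so the model can exit */
    }
    if (ok)
        pthread_mutex_unlock(&dir_sem);
    pthread_join(waiter, NULL);
    return ok;
}
```

Under these assumptions simulate_lookup(true) completes, while
simulate_lookup(false) reports that the lock could not be taken, which is the
point where a blocking acquire would hang forever.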

The fix has been tested on 2.4 (and is part of the unofficial Mandrake Club
kernel). I expected the same bug to be in 2.5; I was just too stupid to see
how to reproduce it before.

Attached are the combined patch and the fix for the deadlock only (to show it
alone). Andrew, I slightly polished the original stack corruption patch to
look more consistent with the rest of devfs; I also removed the NULL pointer
checks - let it just BUG if that ever happens.

I have already sent the patch for 2.4 twice - please, could somebody finally
either apply it or explain what is wrong with it? Richard is apparently out
of reach, and the bug is real and seen by many people.

regards

-andrey

Attachment: 2.5.74-devfs_combined.patch
Description: Text Data

Attachment: 2.5.74-devfs_lookup_deadlock.patch
Description: Text Data
