Re: [PATCH] fix initlog/minilogd deadlock on /dev/log access

To: Pavel Roskin <proski@xxxxxxx>
Subject: Re: [PATCH] fix initlog/minilogd deadlock on /dev/log access
From: Andrey Borzenkov <arvidjaar@xxxxxxx>
Date: Tue, 6 May 2003 00:47:38 +0400
Cc: devfs@xxxxxxxxxxx
In-reply-to: <Pine.LNX.4.55.0305051420370.1316@marabou.research.att.com>
References: <E19CYOW-00091k-00.arvidjaar-mail-ru@f6.mail.ru> <Pine.LNX.4.55.0305051420370.1316@marabou.research.att.com>
Sender: devfs-bounce@xxxxxxxxxxx
User-agent: KMail/1.5
Please try the attached proof-of-concept patch (untested). It is the best I
can come up with at 0:45 a.m.

On Monday 05 May 2003 22:40, Pavel Roskin wrote:
> On Mon, 5 May 2003, Andrey Borzenkov wrote:
> > > One minilogd calls bind() on /dev/log, and the other calls unlink() on
> > > the same file.  The problem only happens with Linux 2.4.x (not 2.5.x)
> > > and only if devfs is mounted.
>
> Yes.  I could run that system with serial console dump the process list by
> Break-T (the same as Alt-SysRq-T on the local console).  The result is
> available here:
>
> https://bugzilla.redhat.com/bugzilla/attachment.cgi?id=91500&action=view
>
> It is clearly visible that there is a deadlock between sys_unlink() and
> sys_bind(), which confirms my findings based on adding debug information
> to minilogd.
>

Argh! It was not bind/unlink - it was two concurrent lookups on a non-existent
entry that had just been removed by one of the minilogd's.

minilogd1                                                            minilogd2

                                  path_lookup("dev/log", LOOKUP_PARENT, &nd);
                                    -> yields dentry for  </dev>

path_lookup("dev/log", LOOKUP_PARENT, &nd);
   -> yields dentry for  </dev>
down(</dev>->i_sem) holds i_sem

                                              down(</dev>->i_sem) - sleeps

lookup_hash("log", </dev>)                                .
devfs_lookup(</dev>, "log")                               .
   MISS                                                   .
set "log"->d_op to &devfs_wait_dops;                      .
init "log"->wait_queue                                    .
up(</dev>->i_sem)                                         .
                                                          .
                                                  obtains i_sem
                                 lookup_hash("log", </dev>);
                                 cached_lookup(</dev>, "log", 0)
                                 devfs_d_revalidate_wait("log", 0)
                                 wait on "log"->wait_queue
                                ... waits to be woken up by devfs_lookup

try_modload("log", bla bla bla)
down(</dev>->i_sem)
                     deadlock

wakeup("log"->wait_queue) happens only here

[...]

I have to think it over. It is not trivial to fix. It appears that
->d_revalidate is sometimes called with i_sem held and sometimes without it.
Rearranging devfs_lookup might be enough. The whole story sucks.
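
The interleaving in the trace above can be replayed as a small
single-threaded model. This is only a sketch: it is not kernel code and it is
not the attached patch. The names struct sim, try_down, step_minilogd1,
step_minilogd2 and lookup_race_deadlocks are made up for illustration, and the
model assumes minilogd1 wins the first race for i_sem, as in the trace. The
flag revalidate_under_i_sem models the observation that ->d_revalidate is
sometimes called with i_sem held and sometimes without it.

```c
#include <assert.h>
#include <stdbool.h>

enum { FREE = -1, MINILOGD1 = 0, MINILOGD2 = 1 };

/* Hypothetical model state, not a kernel structure. */
struct sim {
    int  i_sem_owner;  /* FREE, or the task holding </dev>->i_sem */
    bool woken;        /* true once "log"->wait_queue has been woken */
    int  pc[2];        /* next step of each task */
    bool done[2];      /* task has finished its lookup */
};

/* Non-blocking stand-in for down(): false means "would sleep". */
static bool try_down(struct sim *s, int task)
{
    if (s->i_sem_owner != FREE)
        return false;
    s->i_sem_owner = task;
    return true;
}

static void up(struct sim *s, int task)
{
    assert(s->i_sem_owner == task);  /* only the holder may release */
    s->i_sem_owner = FREE;
}

/* minilogd1: the devfs_lookup() path. Returns true if it made progress. */
static bool step_minilogd1(struct sim *s)
{
    if (s->done[MINILOGD1])
        return false;
    switch (s->pc[MINILOGD1]) {
    case 0: if (!try_down(s, MINILOGD1)) return false; break; /* down(i_sem) */
    case 1: break;                    /* MISS: set d_op, init wait_queue */
    case 2: up(s, MINILOGD1); break;  /* up(i_sem) before try_modload */
    case 3: break;                    /* try_modload("log", ...) */
    case 4: if (!try_down(s, MINILOGD1)) return false; break; /* down again */
    case 5: s->woken = true; break;   /* wakeup("log"->wait_queue) */
    case 6: up(s, MINILOGD1); s->done[MINILOGD1] = true; break;
    }
    s->pc[MINILOGD1]++;
    return true;
}

/* minilogd2: the cached_lookup() / devfs_d_revalidate_wait() path. */
static bool step_minilogd2(struct sim *s, bool revalidate_under_i_sem)
{
    if (s->done[MINILOGD2])
        return false;
    switch (s->pc[MINILOGD2]) {
    case 0:  /* take i_sem only if revalidation runs under it */
        if (revalidate_under_i_sem && !try_down(s, MINILOGD2)) return false;
        break;
    case 1:  /* wait on "log"->wait_queue */
        if (!s->woken) return false;
        break;
    case 2:
        if (revalidate_under_i_sem) up(s, MINILOGD2);
        s->done[MINILOGD2] = true;
        break;
    }
    s->pc[MINILOGD2]++;
    return true;
}

/* Returns true if both tasks get stuck, false if both lookups complete. */
bool lookup_race_deadlocks(bool revalidate_under_i_sem)
{
    struct sim s = { .i_sem_owner = FREE };

    for (;;) {
        bool p1 = step_minilogd1(&s);
        bool p2 = step_minilogd2(&s, revalidate_under_i_sem);

        if (s.done[MINILOGD1] && s.done[MINILOGD2])
            return false;
        if (!p1 && !p2)
            return true;  /* nobody can run: deadlock */
    }
}
```

Under these assumptions lookup_race_deadlocks(true) ends in the state shown in
the trace: minilogd2 sleeps on the wait queue while holding i_sem, and
minilogd1 sleeps in down() before it can issue the wakeup. With
lookup_race_deadlocks(false) minilogd1 can retake i_sem, the wakeup happens,
and both lookups finish.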

Do you know who maintains devfs in 2.4 now?

I am not sure the problem does not exist in 2.5. It depends on precise timing
and scheduling, so it may still be there, just not exposed yet.

-andrey

A thousand thanks, especially as the problem magically stopped happening here :(

Attachment: devfs.minilogd.patch
Description: Text Data
