
Re: [patch] Minor PAGG attach/detach semantic change for 2.6.11

To: Erik Jacobson <erikj@xxxxxxx>
Subject: Re: [patch] Minor PAGG attach/detach semantic change for 2.6.11
From: "Paul E. McKenney" <paulmck@xxxxxxxxxx>
Date: Wed, 28 Sep 2005 08:04:55 -0700
Cc: Kaigai Kohei <kaigai@xxxxxxxxxxxxx>, Kingsley Cheung <kingsley@xxxxxxxxxx>, pagg@xxxxxxxxxxx, tonyt@xxxxxxxxxx
In-reply-to: <20050928141831.GA24110@xxxxxxx>
References: <20050617014512.GA10285@xxxxxxxxxx> <20050927201020.GA30433@xxxxxxx> <433A7FE4.5040109@xxxxxxxxxxxxx> <20050928141831.GA24110@xxxxxxx>
Reply-to: paulmck@xxxxxxxxxx
Sender: pagg-bounce@xxxxxxxxxxx
User-agent: Mutt/1.4.1i
On Wed, Sep 28, 2005 at 09:18:31AM -0500, Erik Jacobson wrote:
> > In my understanding, _write_ operations cannot be implemented
> > without locking, even if we use RCU.
> > (In addition, an RCU-conscious writing/update style is required.)
> 
> I'm surrounding write operations with the write side of an rwsem (not a
> spinlock, at least not for now, since it is common for pnotify users to use
> semaphores for their own locking).  I think this is similar to your example
> submission, except that you used a spinlock in those places.

Yes, a semaphore works as well.  Whatever it is, there must be something
to coordinate the updaters.
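Just to illustrate the division of labor, here is a minimal sketch (the
identifiers are made up for illustration, not taken from the pnotify patch):
readers need nothing beyond rcu_read_lock(), while updaters serialize only
against each other and publish their changes with the _rcu list primitives:

----------------------------
struct rw_semaphore list_sem;   /* coordinates updaters only */

/* Reader: never touches the semaphore. */
rcu_read_lock();
list_for_each_entry_rcu(sub, &task->subscriber_list, entry) {
        /* examine sub; must not sleep here */
}
rcu_read_unlock();

/* Updater: exclude other updaters, publish with list_add_rcu(). */
down_write(&list_sem);
list_add_rcu(&new_sub->entry, &task->subscriber_list);
up_write(&list_sem);

/* Removal: unlink, wait a grace period, then free. */
down_write(&list_sem);
list_del_rcu(&old_sub->entry);
up_write(&list_sem);
synchronize_rcu();
kfree(old_sub);
----------------------------

The key point is that the semaphore never burdens the read side; it exists
solely so that concurrent list_add_rcu()/list_del_rcu() calls don't corrupt
the list.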

                                                Thanx, Paul

> At this moment, I'm close to having something ready to look at.  I'm just
> tracking down a bug in the new implementation that showed up in the Job
> patch.  I'll post what I have when I'm ready, and maybe we can tear it apart
> then.  This is my first RCU experience, so I'd welcome feedback, including
> "this just won't work with RCU" if that's what it comes down to.
> 
> I also want to be sure the 'stale data' problem isn't actually a problem
> for us.
> 
> > frequently-read paths, such as SELinux's Access Vector Cache (AVC).
> > But it's not a panacea, and it constrains the write methodology.
> 
> My feeling is that most users of pnotify won't be writing very often.
> That's a generalization that may be incorrect.  Taking Job as an example,
> once a process is made part of a job, the data pointer associated with the
> task struct is rarely adjusted until the job is done.
> 
> I can imagine cases where that doesn't hold; there, the writes could
> become a penalty.
> 
> > I'd like to draw attention to another point.  The current pnotify
> > implementation requires holding pnotify_event_list_sem before calling
> > pnotify_get_events().
> 
> As I recall, that code runs at most twice in the life of a kernel module,
> right?  The only time the init function pointer fires, if it's present, is
> at pnotify_register time.  A similar piece of code runs at unregister time,
> I think.  I'm wondering whether this happens often enough to worry about?
> Please let me know if I missed your point entirely.
> 
> > Therefore, we must repeatedly read_lock()/read_unlock() &tasklist_lock in
> > the do_each_thread()/while_each_thread() loop, as follows:
> > 
> > ----------------------------
> > read_lock(&tasklist_lock);
> > do_each_thread(g, p) {
> >     get_task_struct(p);
> >     read_unlock(&tasklist_lock);
> > 
> >     down_read(&p->pnotify_subscriber_list_sem);
> >     subscriber = pnotify_get_subscriber(p, events->name);
> >             :
> >     up_read(&p->pnotify_subscriber_list_sem);
> > 
> >     read_lock(&tasklist_lock);
> >     /* check whether p died while tasklist_lock was dropped */
> >     put_task_struct(p);
> > } while_each_thread(g, p);
> > read_unlock(&tasklist_lock);
> > ----------------------------
> > 
> > I would be happy if pnotify_subscriber_list were protected by an rwlock.
> > 
> > If an rwlock is used, we cannot implement pnotify_subscribe() with the
> > current spec.  But would it be possible to provide pnotify_subscribe_atomic()
> > or pnotify_subscribe_bind(), which would associate a task_struct with a
> > pre-allocated pnotify_events object?  (See the sketch after the example
> > below.)
> > 
> > ---- in rwlock world :-) ---
> > read_lock(&tasklist_lock);
> > do_each_thread(g, p) {
> >     read_lock(&p->pnotify_subscriber_list_rwlock);
> >     subscriber = pnotify_get_subscriber(p, events->name);
> >             :
> >     read_unlock(&p->pnotify_subscriber_list_rwlock);
> > } while_each_thread(g, p);
> > read_unlock(&tasklist_lock);
> > ----------------------------
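> > 
> > As a rough sketch of what I mean by pnotify_subscribe_bind() (the struct
> > members here are only assumptions for illustration): the caller allocates
> > the subscriber outside the lock, so the bind step itself never needs to
> > sleep and is safe under a spinning rwlock:
> > 
> > ----------------------------
> > /* Allocation happens in process context, outside any lock. */
> > subscriber = kmalloc(sizeof(*subscriber), GFP_KERNEL);
> > if (!subscriber)
> >         return -ENOMEM;
> > subscriber->events = events;
> > 
> > /* The bind step only links the pre-allocated object. */
> > write_lock(&p->pnotify_subscriber_list_rwlock);
> > list_add(&subscriber->entry, &p->pnotify_subscriber_list);
> > write_unlock(&p->pnotify_subscriber_list_rwlock);
> > ----------------------------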
> > 
> > Thanks,
> > 
> > 
> > >/**
> > > * __pnotify_fork - Add kernel module subscriber to same subscribers as parent
> > > * @to_task: The child task that will inherit the parent's subscribers
> > > * @from_task: The parent task
> > > *
> > > * Used to attach a new task to the same subscribers the parent has in its
> > > * subscriber list.
> > > *
> > > * The "from" argument is the parent task.  The "to" argument is the child
> > > * task.
> > > *
> > > * See Documentation/pnotify.txt for details on how to handle return codes
> > > * from the attach function pointer.
> > > *
> > > * Locking: The to_task is currently in construction, so we don't need to
> > > * worry about write locks.  We do need to be sure the parent's subscriber
> > > * list, which we copy here, doesn't go away on us.  This is done via RCU.
> > > */
> > >int
> > >__pnotify_fork(struct task_struct *to_task, struct task_struct *from_task)
> > >{
> > >        struct pnotify_subscriber *from_subscriber;
> > >        int ret;
> > >
> > >        /* Be sure the parent's list we copy from doesn't disappear. */
> > >        rcu_read_lock();
> > >
> > >        list_for_each_entry_rcu(from_subscriber,
> > >                                &from_task->pnotify_subscriber_list, entry) {
> > >                struct pnotify_subscriber *to_subscriber = NULL;
> > >
> > >                to_subscriber = pnotify_subscribe(to_task,
> > >                                                  from_subscriber->events);
> > >                if (!to_subscriber) {
> > >                        ret = -ENOMEM;
> > >                        __pnotify_exit(to_task);
> > >                        rcu_read_unlock();
> > >                        return ret;
> > >                }
> > >                ret = to_subscriber->events->fork(to_task, to_subscriber,
> > >                                                  from_subscriber->data);
> > >
> > >                rcu_read_unlock(); /* no more to do with the parent's data */
> > 
> > rcu_read_unlock() should be placed outside of the
> > list_for_each_entry_rcu() { ... } loop (see the sketch after the
> > function below).
> > 
> > >
> > >                if (ret < 0) {
> > >                        /* Propagates to copy_process as a fork failure.
> > >                         * No __pnotify_exit here because there is one in
> > >                         * the failure path for copy_process in fork.c. */
> > >                        return ret;     /* fork failure */
> > >                } else if (ret > 0) {
> > >                        /* Success, but the fork function pointer in the
> > >                         * pnotify_events structure doesn't want the kernel
> > >                         * module subscribed.  Again, this is the
> > >                         * in-construction child, so no write lock. */
> > >                        pnotify_unsubscribe(to_subscriber);
> > >                }
> > >        }
> > >
> > >        return 0;       /* success */
> > >}
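> > 
> > Concretely, the structure I have in mind is something like this (error
> > handling abbreviated; just a sketch of the read-side lock placement):
> > 
> > ----------------------------
> > rcu_read_lock();
> > list_for_each_entry_rcu(from_subscriber,
> >                         &from_task->pnotify_subscriber_list, entry) {
> >         ret = ...;      /* subscribe and call the fork hook */
> >         if (ret < 0) {
> >                 rcu_read_unlock();      /* unlock on every exit path */
> >                 return ret;
> >         }
> > }
> > rcu_read_unlock();      /* only once the traversal is complete */
> > return 0;
> > ----------------------------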
> > 
> > -- 
> > Linux Promotion Center, NEC
> > KaiGai Kohei <kaigai@xxxxxxxxxxxxx>
> --
> Erik Jacobson - Linux System Software - Silicon Graphics - Eagan, Minnesota
> 
