pagg
[Top] [All Lists]

PAGG Threading Issues (was Re: [patch] Registration Check for Compulsory

To: Erik Jacobson <erikj@xxxxxxxxxxxxxxxxxxxxxxx>
Subject: PAGG Threading Issues (was Re: [patch] Registration Check for Compulsory Hooks)
From: Kingsley Cheung <kingsley@xxxxxxxxxx>
Date: Tue, 9 Nov 2004 10:00:39 +1100
Cc: pagg@xxxxxxxxxxx
In-reply-to: <Pine.SGI.4.53.0411072233350.2612495@xxxxxxxxxxxxxxxxxxxxxxx>
References: <20041108033648.GB18308@xxxxxxxxxx> <Pine.SGI.4.53.0411072233350.2612495@xxxxxxxxxxxxxxxxxxxxxxx>
Sender: pagg-bounce@xxxxxxxxxxx
User-agent: Mutt/1.4.1i
Hi Erik,

Some other issues as well.  Please correct me if I'm wrong :)

1) I've noticed that pagg_detach() has been moved from
__put_task_struct() to do_exit() between the pagg patches for Linux
2.6.8.  This occurred as described at
http://oss.sgi.com/archives/pagg/2004-08/msg00002.html

Anyway, I think the move might have raised a problem with the
registration code.  Below is the relevant segment of code I'd like to
consider:

        get_task_struct(p);
        read_unlock(&tasklist_lock);
        down_write(&p->pagg_sem);
        paggp = pagg_get(p, pagg_hook_new->name);
        if (paggp == NULL) {
                paggp = pagg_alloc(p, pagg_hook_new);
                if (paggp != NULL)
                        init_result = pagg_hook_new->init(p, paggp);
                else
                        init_result = -ENOMEM;
        }
        up_write(&p->pagg_sem);
        read_lock(&tasklist_lock);
        /* Like in remove_client_paggs_from_all_tasks, if the task
         * disappeared on us while we were going through the
         * for_each_process loop, we need to start over with that loop.
         * That's why we have the list_empty here */
        task_exited = list_empty(&p->tasks);
        put_task_struct(p);

Now the tasks on the task list include those that are zombies.  This
occurs during exit_notify(), which sets their state to ZOMBIE but
leaves them on the task list.  Therefore the above code will allocate
paggs for zombie tasks as well. [Note that task_exited won't trigger
as the task is still on the task list.]

Now unfortunately, by moving pagg_detach() out of __put_task_struct(),
when the zombie has its task structure released, the pagg that was
allocated during registration is no longer detached.  Thus we'll could
have a memory leak once the task structure is dropped and
detach_pagg() is not invoked.

In fact the leak could happen not just for zombies, but for tasks just
about to exit.  The race occurs in do_exit(), in which the
detach_pagg() could be called before we allocate the pagg.

Perhaps the quickest solution is to place a detach_pagg() call back
into __put_task_struct().  Another suggestion might be to check for
the PF_EXITING flag when allocating the pagg.  


2) Is pagg intended to be a container for Linux tasks or thread
groups?  I'm avoiding the use of the word "process" as that could be
confusing in this context.

The reason I'm asking is because throughout the pagg registration code
the for_each_process() macro is used.  This happens in
remove_client_paggs_from_all_tasks() and pagg_hook_register().  The
problem is, however, since 2.6 for_each_process() only traverses
thread group leader tasks.  Thus the registration and deregistration
code only considers thread group leaders.

In contrast the pagg_attach() and pagg_detach() hooks are placed to
catch all tasks in the fork and exit code.  With the way the
registration and deregistration code is behaving, this means that we
won't catch all existing tasks and we won't clean up after all tasks
properly for multi-threaded applications.

So what is pagg intended for?  To catch every single task?  In that
case the do_each_thread() and while_each_thread() macros should be
used.  Or is pagg a container for thread groups only?  The changes for
that would require much more thought...


3) Is the "module" pointer used anywhere in the pagg_hook struct meant
to be used in the pagg code or in the client code?  It doesn't seem to
be used in the pagg code at all.


Thanks!
-- 
                Kingsley

<Prev in Thread] Current Thread [Next in Thread>