netdev
[Top] [All Lists]

Re: PROBLEM: IProute hangs after running traffic shaping scripts

To: "David S. Miller" <davem@xxxxxxxxxxxxx>
Subject: Re: PROBLEM: IProute hangs after running traffic shaping scripts
From: Patrick McHardy <kaber@xxxxxxxxx>
Date: Wed, 10 Nov 2004 01:55:37 +0100
Cc: tgraf@xxxxxxx, spam@xxxxxxxxxxxxx, netdev@xxxxxxxxxxx
In-reply-to: <41916369.7020901@xxxxxxxxx>
References: <418B4C7C.8000402@xxxxxxxxxxxxx> <418EA032.7050507@xxxxxxxxx> <418ECE85.9090203@xxxxxxxxx> <20041108135431.GE31969@xxxxxxxxxxxxxx> <418F9AD0.1040701@xxxxxxxxx> <20041108183300.GF31969@xxxxxxxxxxxxxx> <418FCD0A.4040202@xxxxxxxxx> <20041109161816.425ad7d6.davem@xxxxxxxxxxxxx> <41916369.7020901@xxxxxxxxx>
Sender: netdev-bounce@xxxxxxxxxxx
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.7.3) Gecko/20041008 Debian/1.7.3-5
Patrick McHardy wrote:

David S. Miller wrote:

How do these child qdiscs get destroyed at all if you just
remove them from the lists they are on?  How will the rest
of destroy processing find them and clean them up?


The RCU-callback calls ops->destroy. The qdisc knows about it's inner
structure and destroys all classes and the inner qdiscs. dev->qdisc_list
is just a flat list containing all qdiscs of the tree for lookups.


This is the final patch:

#   [PKT_SCHED]: Unlink inner qdiscs immediately in qdisc_destroy
#
#   Before the RCU change distruction of the qdisc and all inner
#   qdiscs happend immediately and under the rtnl semaphore. This
#   made sure nothing holding the rtnl semaphore could end up with
#   invalid memory. This is not true anymore, inner qdiscs found on
#   dev->qdisc_list can be suddenly destroyed by the RCU callback.
#   Unlink them immediately when the outer qdisc is destroyed so
#   nothing can find them until they get destroyed.

This also makes semantics sane again, an inner qdiscs should not
be user-visible once the containing qdisc has been destroyed. The
second part (locking in qdisc_lookup) is not really required, but
currently the only purpose of qdisc_tree_lock seems to be to protect
dev->qdisc_list, which is also protected by the rtnl. The rtnl is
especially relied on for making sure nobody frees a qdisc while it
is used in user-context, so qdisc_tree_lock looks unnecessary. I'm
currently reviewing all qdisc locking, if this turns out to be right
I will remove qdisc_tree_lock entirely in a follow-up patch, but for
now I left it in for consistency.

Regards
Patrick

# This is a BitKeeper generated diff -Nru style patch.
#
# ChangeSet
#   2004/11/09 07:46:48+01:00 kaber@xxxxxxxxxxxx 
#   [PKT_SCHED]: Unlink inner qdiscs immediately in qdisc_destroy
#   
#   Before the RCU change distruction of the qdisc and all inner
#   qdiscs happend immediately and under the rtnl semaphore. This
#   made sure nothing holding the rtnl semaphore could end up with
#   invalid memory. This is not true anymore, inner qdiscs found on
#   dev->qdisc_list can be suddenly destroyed by the RCU callback.
#   Unlink them immediately when the outer qdisc is destroyed so
#   nothing can find them until they get destroyed.
#   
#   With help from Thomas Graf <tgraf@xxxxxxx>
#   
#   Signed-off-by: Patrick McHardy <kaber@xxxxxxxxx>
# 
# net/sched/sch_generic.c
#   2004/11/09 07:46:39+01:00 kaber@xxxxxxxxxxxx +23 -1
#   [PKT_SCHED]: Unlink inner qdiscs immediately in qdisc_destroy
#   
#   Before the RCU change distruction of the qdisc and all inner
#   qdiscs happend immediately and under the rtnl semaphore. This
#   made sure nothing holding the rtnl semaphore could end up with
#   invalid memory. This is not true anymore, inner qdiscs found on
#   dev->qdisc_list can be suddenly destroyed by the RCU callback.
#   Unlink them immediately when the outer qdisc is destroyed so
#   nothing can find them until they get destroyed.
#   
#   With help from Thomas Graf <tgraf@xxxxxxx>
#   
#   Signed-off-by: Patrick McHardy <kaber@xxxxxxxxx>
# 
# net/sched/sch_api.c
#   2004/11/09 07:46:39+01:00 kaber@xxxxxxxxxxxx +5 -1
#   [PKT_SCHED]: Unlink inner qdiscs immediately in qdisc_destroy
#   
#   Before the RCU change distruction of the qdisc and all inner
#   qdiscs happend immediately and under the rtnl semaphore. This
#   made sure nothing holding the rtnl semaphore could end up with
#   invalid memory. This is not true anymore, inner qdiscs found on
#   dev->qdisc_list can be suddenly destroyed by the RCU callback.
#   Unlink them immediately when the outer qdisc is destroyed so
#   nothing can find them until they get destroyed.
#   
#   With help from Thomas Graf <tgraf@xxxxxxx>
#   
#   Signed-off-by: Patrick McHardy <kaber@xxxxxxxxx>
# 
diff -Nru a/net/sched/sch_api.c b/net/sched/sch_api.c
--- a/net/sched/sch_api.c       2004-11-10 01:45:31 +01:00
+++ b/net/sched/sch_api.c       2004-11-10 01:45:31 +01:00
@@ -196,10 +196,14 @@
 {
        struct Qdisc *q;
 
+       read_lock_bh(&qdisc_tree_lock);
        list_for_each_entry(q, &dev->qdisc_list, list) {
-               if (q->handle == handle)
+               if (q->handle == handle) {
+                       read_unlock_bh(&qdisc_tree_lock);
                        return q;
+               }
        }
+       read_unlock_bh(&qdisc_tree_lock);
        return NULL;
 }
 
diff -Nru a/net/sched/sch_generic.c b/net/sched/sch_generic.c
--- a/net/sched/sch_generic.c   2004-11-10 01:45:31 +01:00
+++ b/net/sched/sch_generic.c   2004-11-10 01:45:31 +01:00
@@ -483,10 +483,32 @@
 
 void qdisc_destroy(struct Qdisc *qdisc)
 {
+       struct list_head cql = LIST_HEAD_INIT(cql);
+       struct Qdisc *cq, *q, *n;
+
        if (qdisc->flags & TCQ_F_BUILTIN ||
                !atomic_dec_and_test(&qdisc->refcnt))
                return;
-       list_del(&qdisc->list);
+
+       if (!list_empty(&qdisc->list)) {
+               if (qdisc->ops->cl_ops == NULL)
+                       list_del(&qdisc->list);
+               else
+                       list_move(&qdisc->list, &cql);
+       }
+
+       /* unlink inner qdiscs from dev->qdisc_list immediately */
+       list_for_each_entry(cq, &cql, list)
+               list_for_each_entry_safe(q, n, &qdisc->dev->qdisc_list, list)
+                       if (TC_H_MAJ(q->parent) == TC_H_MAJ(cq->handle)) {
+                               if (q->ops->cl_ops == NULL)
+                                       list_del_init(&q->list);
+                               else
+                                       list_move_tail(&q->list, &cql);
+                       }
+       list_for_each_entry_safe(cq, n, &cql, list)
+               list_del_init(&cq->list);
+
        call_rcu(&qdisc->q_rcu, __qdisc_destroy);
 }
 
<Prev in Thread] Current Thread [Next in Thread>