netdev
[Top] [All Lists]

Re: [PATCH] PKT_SCHED: Initialize list field in dummy qdiscs

To: Thomas Graf <tgraf@xxxxxxx>
Subject: Re: [PATCH] PKT_SCHED: Initialize list field in dummy qdiscs
From: Patrick McHardy <kaber@xxxxxxxxx>
Date: Fri, 05 Nov 2004 18:26:42 +0100
Cc: davem@xxxxxxxxxxxxx, netdev@xxxxxxxxxxx, spam@xxxxxxxxxxxxx, kuznet@xxxxxxxxxxxxx, jmorris@xxxxxxxxxx
In-reply-to: <20041105163951.GY12289@postel.suug.ch>
References: <418B4C7C.8000402@crocom.com.pl> <20041105115430.GP19714@rei.reeler.org> <418B4C7C.8000402@crocom.com.pl> <20041105141640.GQ19714@rei.reeler.org> <418BA66A.60804@trash.net> <20041105163951.GY12289@postel.suug.ch>
Sender: netdev-bounce@xxxxxxxxxxx
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.7.3) Gecko/20041008 Debian/1.7.3-5
Thomas Graf wrote:

* Patrick McHardy <418BA66A.60804@xxxxxxxxx> 2004-11-05 17:12


Nice catch. I can't understand how you triggered that oops,
though. The noop- and noqueue-qdiscs used without qdisc_create_*
are not refcounted, so I would expect:

void qdisc_destroy(struct Qdisc *qdisc)
{
      if (!atomic_dec_and_test(&qdisc->refcnt))
              return;

to underflow and return until refcnt finally reaches 0 again.
Can you explain, please ?



Right, this is indeed the case. This doesn't fix the oops reported
but will prevent the oops you are referring to which was triggerd
after 2h stress testing on my machine. I haven't had the time to
check if incrementing the refcnt of builtin qdiscs causes
any problems but initializing the list is good in any case.


Either refcnt them or add add some kind of flag to qdiscs created
by qdisc_create/qdisc_create_default and check for that flag.
Initializing the lists doesn't fix all problems, directly using
noop/noqueue doesn't increment the device refcnt, so is must not
be dropped it __qdisc_destroy.

I found huge locking problems from qidsc_destroy calling contexts though.
Almost all calling paths to qdisc_destroy invoked from qdiscs are messed up.
I am resolving those now. I have a theoretical path that could cause the
reported oops which is htb_put -> htb_destroy_class -> qdisc_destroy
not bh locking dev->queue_lock and thus the list unlinking could
take place during a walk/lookup and thus lead to POISON value in the
next pointer. I could not reproduce this so far though.


ops->put seems to be safe even without holding dev->queue_lock.
The class refcnt is only changed from userspace, and always under
the rtnl semaphore. get/put are always balanced, so pratically a
class can never get destroyed by put.

Regards
Patrick


<Prev in Thread] Current Thread [Next in Thread>