| To: | Fengguang Wu <fengguang.wu@xxxxxxxxx> |
|---|---|
| Subject: | Re: [PATCH 6/6] workqueue: reimplement WQ_HIGHPRI using a separate worker_pool |
| From: | Tejun Heo <tj@xxxxxxxxxx> |
| Date: | Fri, 13 Jul 2012 20:41:22 -0700 |
| Cc: | linux-kernel@xxxxxxxxxxxxxxx, torvalds@xxxxxxxxxxxxxxxxxxxx, joshhunt00@xxxxxxxxx, axboe@xxxxxxxxx, rni@xxxxxxxxxx, vgoyal@xxxxxxxxxx, vwadekar@xxxxxxxxxx, herbert@xxxxxxxxxxxxxxxxxxxx, davem@xxxxxxxxxxxxx, linux-crypto@xxxxxxxxxxxxxxx, swhiteho@xxxxxxxxxx, bpm@xxxxxxx, elder@xxxxxxxxxx, xfs@xxxxxxxxxxx, marcel@xxxxxxxxxxxx, gustavo@xxxxxxxxxxx, johan.hedberg@xxxxxxxxx, linux-bluetooth@xxxxxxxxxxxxxxx, martin.petersen@xxxxxxxxxx, Tony Luck <tony.luck@xxxxxxxxx> |
| Dkim-signature: | v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=sender:date:from:to:cc:subject:message-id:references:mime-version :content-type:content-disposition:in-reply-to:user-agent; bh=Juo0WPkg9REX5Pa6wd9j6nj6JUE0ghRoiJdrujtenYQ=; b=ByMsHAMewArlneafwbN8CfHDpeeUsI7JJfN9zFXLTtNWs+5bQBZi1rc+9EcJA8BYz5 vflgD260fRJqbBNNP6vOhDdj+s6NoPorDRSEkWjF55Q/Au0XnAi4s43Hv0LDlPpE5RmW o0Jfg+A8YybkUwhbDoH8XtzuyP/dbR5C5WwBLU+9huFNcSfNcOntJht6Bk/T2WRiecyR jOdEnxDE6rjICX60PNQ6yefUFrddv4b/fT2Gq7Wvj/zXDxyDfFhCZJWanMB0/doVU04t 0Ke0CjrxvZOu7vI5fviEb1lHkCBmn4D8KyFVzMS60bj7s8cJi2zx0V46uy2Tsim+ThCb N5ZQ== |
| In-reply-to: | <20120713020800.GA14026@localhost> |
| References: | <1341859315-17759-7-git-send-email-tj@xxxxxxxxxx> <20120712130648.GA19214@localhost> <20120712170519.GA20167@xxxxxxxxxx> <20120712214514.GD20167@xxxxxxxxxx> <20120713020800.GA14026@localhost> |
| Sender: | Tejun Heo <htejun@xxxxxxxxx> |
| User-agent: | Mutt/1.5.20 (2009-06-14) |
Hello,
On Fri, Jul 13, 2012 at 10:08:00AM +0800, Fengguang Wu wrote:
> [ 0.165669] Performance Events: unsupported Netburst CPU model 6 no PMU
> driver, software events only.
> [ 0.167001] XXX cpu=0 gcwq=ffff88000dc0cfc0 base=ffff88000dc11e80
> [ 0.167989] XXX cpu=0 nr_running=0 @ ffff88000dc11e80
> [ 0.168988] XXX cpu=0 nr_running=0 @ ffff88000dc11e88
> [ 0.169988] XXX cpu=1 gcwq=ffff88000dd0cfc0 base=ffff88000dd11e80
> [ 0.170988] XXX cpu=1 nr_running=0 @ ffff88000dd11e80
> [ 0.171987] XXX cpu=1 nr_running=0 @ ffff88000dd11e88
> [ 0.172988] XXX cpu=8 nr_running=0 @ ffffffff81d7c430
> [ 0.173987] XXX cpu=8 nr_running=12 @ ffffffff81d7c438
Heh, I found it. get_pool_nr_running() stores the nr_running array to
use in a local pointer to array and then returns pointer to the
specific element from there depending on the priority.
atomic_t (*nr_running)[NR_WORKER_POOLS];
/* set @nr_running to the array to use */
return nr_running[worker_pool_pri(pool)];
The [] operator in the return statement is indexing to the arrays
instead of the array elements, so if the index is 1, the above
statement offsets nr_running by sizeof(atomic_t [NR_WORKER_POOLS])
instead of sizeof(atomic_t). This should have been
&(*nr_running)[worker_pool_pri(pool)] instead.
So, highpri ends up dereferencing out-of-bounds and depending on
variable layout, it may see garbage value from the beginning (what you
were seeing) or get interfered afterwards (what Tony was seeing).
This also explains why I didn't see it and Tony can no longer
reproduce it after debug patch.
Will post updated patches.
Thank you.
--
tejun
|
| Previous by Date: | [PATCH] xfs: fix comment typo of struct xfs_da_blkinfo., Chen Baozi |
|---|---|
| Next by Date: | Re: [PATCH 5/6] workqueue: introduce NR_WORKER_POOLS and for_each_worker_pool(), Tejun Heo |
| Previous by Thread: | Re: [PATCH 6/6] workqueue: reimplement WQ_HIGHPRI using a separate worker_pool, Fengguang Wu |
| Next by Thread: | [PATCH UPDATED 6/6] workqueue: reimplement WQ_HIGHPRI using a separate worker_pool, Tejun Heo |
| Indexes: | [Date] [Thread] [Top] [All Lists] |