
Re: BUG: soft lockup detected on CPU#5 / xfsdatad

To: linux-xfs@xxxxxxxxxxx
Subject: Re: BUG: soft lockup detected on CPU#5 / xfsdatad
From: Christian Røsnes <christian.rosnes@xxxxxxxxx>
Date: Mon, 10 Apr 2006 15:47:11 +0200
Cc: David Chinner <dgc@xxxxxxx>
In-reply-to: <20060410014701.GJ2732@xxxxxxxxxxxxxxxxx>
References: <443916EC.10809@xxxxxxxxx> <20060410014701.GJ2732@xxxxxxxxxxxxxxxxx>
Sender: linux-xfs-bounce@xxxxxxxxxxx
User-agent: Thunderbird 1.5 (X11/20051201)
David Chinner wrote:
FWIW:

=====================================================
Running: cat /proc/interrupts
           CPU0       CPU1       CPU2       CPU3       CPU4       CPU5       CPU6       CPU7
  0:        226          0          0          0          0  277717661          0          0    IO-APIC-edge  timer
  1:          0          0          0          0          0          9          0          0    IO-APIC-edge  i8042
  9:          0          0          0          0          0          0          0          0   IO-APIC-level  acpi
 14:          0          0          0          0          0         13          0          0    IO-APIC-edge  ide0
169:          0          0          0          0          0    5649134          0          0   IO-APIC-level  uhci_hcd:usb1, qla2400
185:          0          0          0          0          0  545423724          0          0   IO-APIC-level  eth0
201:          0          0          0          0          0  215363073          0          0   IO-APIC-level  megaraid
209:          0          0          0          0          0         63          0          0   IO-APIC-level  ide2, ehci_hcd:usb4
217:          0          0          0          0          0          0          0          0   IO-APIC-level  uhci_hcd:usb2
225:          0          0          0          0          0          0          0          0   IO-APIC-level  uhci_hcd:usb3
NMI:      62452     249384      63225      62699      62361     249362      63203      62677
LOC:  277687412  277301786  277700824  277701522  277687319  277301693  277700731  277701429
ERR:          7
MIS:          0

CPU 5 is doing all your interrupt work for timers, usb, ethernet and
your RAID. You might want to try to spread these interrupts to
different CPUs to reduce the interrupt load on this one CPU. That
may improve the situation.

Cheers,

Dave.
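
For reference, the observation above (CPU 5 handling nearly all device
interrupts) can be checked mechanically by summing the per-CPU columns of
/proc/interrupts. What follows is only a minimal sketch, not anything posted
in this thread: the function name per_cpu_totals and the SAMPLE text are
hypothetical, and SAMPLE holds just three rows copied from the Xeon listing
above. On a live system one would read /proc/interrupts instead.

```python
def per_cpu_totals(text):
    """Sum interrupt counts per CPU from /proc/interrupts-style text."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    cpus = lines[0].split()              # header row: CPU0 CPU1 ...
    totals = [0] * len(cpus)
    for ln in lines[1:]:
        label, _, rest = ln.partition(":")
        fields = rest.split()
        # Count only numbered IRQ rows that carry one value per CPU;
        # this skips the NMI, LOC, ERR and MIS summary rows.
        if not label.strip().isdigit() or len(fields) < len(cpus):
            continue
        for i, value in enumerate(fields[:len(cpus)]):
            totals[i] += int(value)
    return dict(zip(cpus, totals))


# Three rows copied from the Xeon listing above (illustrative subset only).
SAMPLE = """\
     CPU0 CPU1 CPU2 CPU3 CPU4 CPU5 CPU6 CPU7
  0: 226 0 0 0 0 277717661 0 0 IO-APIC-edge timer
185: 0 0 0 0 0 545423724 0 0 IO-APIC-level eth0
201: 0 0 0 0 0 215363073 0 0 IO-APIC-level megaraid
ERR: 7
"""

if __name__ == "__main__":
    totals = per_cpu_totals(SAMPLE)
    grand_total = sum(totals.values()) or 1
    for cpu, count in totals.items():
        print(f"{cpu}: {count:>12} ({100.0 * count / grand_total:.1f}%)")
```

To run it against a real machine, pass open("/proc/interrupts").read() in
place of SAMPLE.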

Thank you for the information.

Is there a preferred way of distributing interrupts across several Xeon CPUs?

Also, does anyone know whether Intel Xeons behave differently from AMD Opterons in this regard
(interrupt CPU distribution on 64-bit kernels)?

The reason I ask is that, unlike the interrupt distribution shown in the table
above, where most interrupts on that 2-CPU dual-core hyperthreaded Xeon are
handled by CPU 5, my 2-CPU dual-core Opteron rig (kernel 2.6.15, 64-bit) shows
a more even distribution of interrupts across all its CPUs:

opteron# cat /proc/interrupts
         CPU0       CPU1       CPU2       CPU3
0:   33072714   38007277   38475123   37457443    IO-APIC-edge  timer
1:      29837      43445      34925      36416    IO-APIC-edge  i8042
8:  265978968  264597467  333282019  333292177    IO-APIC-edge  rtc
9:          0          0          0          0   IO-APIC-level  acpi
12:     525534     951163     698370     667647    IO-APIC-edge  i8042
14:     982569    1558280    1241299    1200781    IO-APIC-edge  ide0
177:    4015581    6102627    5084144    5415943   IO-APIC-level  libata, ehci_hcd:usb2
185:    5141100    8979803    6598927    6391688   IO-APIC-level  libata, NVidia CK804
193:     134874     261894     280336     234794   IO-APIC-level  megaraid
201:          0          0          0          3   IO-APIC-level  ohci1394
209:          0          0          0          0   IO-APIC-level  ohci_hcd:usb1
217:  192852160          1          3        810   IO-APIC-level  eth0
225:    6879196   11640568    8902256    8644706   IO-APIC-level  nvidia
233:     573425     874257     849441    1650900   IO-APIC-level  arcmsr
NMI:        113         61         79         62
LOC:  147019199  147019177  147019155  147019113
ERR:          0
MIS:          0

I would think that the Xeon could automatically distribute interrupts across
all CPUs, similar to the Opteron. Maybe some specific kernel compile option or
proc parameter setting is necessary on the Xeon for this to happen?
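
One proc interface that exists for this is /proc/irq/<N>/smp_affinity, which
takes a hexadecimal bitmask of the CPUs allowed to service IRQ N (bit 0 is
CPU0). As a rough sketch only, here is how such masks could be computed for
the busy IRQs in the Xeon listing. The helper names and the choice of target
CPUs (1, 3, 5, 7) are assumptions made for illustration; the thread does not
say which logical CPUs are hyperthread siblings, so a real assignment should
be checked against the machine's topology. The script only prints the
commands; applying them needs root.

```python
def affinity_mask(cpus):
    """Return the hex bitmask string selecting the given CPU numbers."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu                 # bit n set means CPU n is allowed
    return format(mask, "x")


def round_robin_plan(irqs, cpu_ids):
    """Assign each IRQ to a single CPU, cycling through cpu_ids in order."""
    return {
        irq: affinity_mask([cpu_ids[i % len(cpu_ids)]])
        for i, irq in enumerate(irqs)
    }


if __name__ == "__main__":
    # IRQ numbers taken from the Xeon listing: qla2400/usb1, eth0, megaraid.
    # The target CPU list is a hypothetical choice, not a recommendation.
    plan = round_robin_plan([169, 185, 201], [1, 3, 5, 7])
    for irq, mask in plan.items():
        print(f"echo {mask} > /proc/irq/{irq}/smp_affinity")
```

Whether a mask naming several CPUs actually spreads one IRQ across them
depends on the chipset and APIC mode, which is why this sketch pins each IRQ
to a single CPU.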

Thanks
Christian

