Kernel Spinlock Metering for Linux

The Linux SMP kernel uses spinlocks to protect data structures from concurrent, potentially conflicting accesses. This patch allows you to build an i386, ia64, Alpha, Sparc64, or mips64 kernel that can perform simple "metering" (record-keeping) of spinlock usage. Also available is source for an associated new command, lockstat, that is used to instruct the kernel to turn this lock metering on or off, and to retrieve the metering data from the kernel and display it in a human-readable format.

The data displayed include, per spinlock and per caller: the number of lock attempts; the number of those attempts that succeeded immediately versus those that had to wait for the current lock holder to release the lock; the mean and maximum hold time; and the mean, maximum, and cumulative wait time. Whenever possible, the locking caller and the spinlocks are identified by their symbolic names rather than by their virtual addresses.
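
The per-spinlock, per-caller record described above can be modeled in a few lines. This is a minimal sketch in Python, not the kernel's actual data layout: the class name LockStats, its field names, the microsecond unit, and the choice to average wait time over only the attempts that waited are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class LockStats:
    """Hypothetical metering record for one (spinlock, caller) pair."""
    attempts: int = 0        # total lock attempts
    immediate: int = 0       # attempts that succeeded without waiting
    hold_total: float = 0.0  # cumulative hold time (assumed microseconds)
    hold_max: float = 0.0    # longest single hold
    wait_total: float = 0.0  # cumulative wait time (assumed microseconds)
    wait_max: float = 0.0    # longest single wait

    def record(self, wait_us: float, hold_us: float) -> None:
        """Account for one acquire/release cycle."""
        self.attempts += 1
        if wait_us == 0:
            self.immediate += 1
        self.hold_total += hold_us
        self.hold_max = max(self.hold_max, hold_us)
        self.wait_total += wait_us
        self.wait_max = max(self.wait_max, wait_us)

    @property
    def waited(self) -> int:
        """Number of attempts that had to wait for the lock holder."""
        return self.attempts - self.immediate

    @property
    def hold_mean(self) -> float:
        return self.hold_total / self.attempts if self.attempts else 0.0

    @property
    def wait_mean(self) -> float:
        # Assumption: the mean is taken over the attempts that waited.
        return self.wait_total / self.waited if self.waited else 0.0
```

Calling record(0, 2.0) counts an uncontended acquire with a 2-microsecond hold, while record(4.0, 6.0) counts an acquire that spun for 4 microseconds first.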

Various patch sets are available. Version 1.1.4 patches the 2.2.14 kernel and reflects a relatively old flavor of Lockmeter. Version 1.4.11 patches the 2.4.16, 2.4.17, 2.5.3, and 2.5.5 kernels, and the previous release, v1.4.9, patches various other releases of the 2.4.x kernel. Version 1.4 supports i386, alpha, ia64, mips64, and sparc64. The most recent version, 1.5, is available as a patch against the 2.4.18 and various 2.5.x kernels, and it additionally supports mips (32-bit mips). Each patch is approximately 22 KB when gzipped. (Patches against a few older kernel versions are also available in the old subdirectory.)

After applying the appropriate patch, make oldconfig presents a new Kernel lock metering option in the Kernel hacking subsection, but only if CONFIG_SMP (Symmetric multi-processing support) has been enabled. The spinlock metering code is compiled into the kernel only when this new option is turned on.

Compiling the spinlock metering code into the kernel does not materially affect the kernel size, because the additional code is roughly offset by the shrinkage that results when the normally in-line locking routines become procedure calls. A metering-capable kernel (i.e., one with the patch applied but data collection turned off) is negligibly slower than a non-metering-capable kernel. A metering-capable kernel does slow down when metering data collection is turned on with the lockstat command (typically by 8% for a workload with 25% system time). Care has been taken to minimize performance degradation, and further improvements are in progress.

The lockstat command must also be downloaded, compiled, and installed. lockstat is a privileged command that requires root access. It reads and writes to the node /proc/lockmeter to control the kernel's metering as follows:
    lockstat on enables the kernel's metering data collection,
    lockstat options displays the collected data, and
    lockstat off disables the metering data collection.
Run lockstat with no arguments to see a verbose description of the command arguments and options.

When metering is enabled, count and time data are collected in malloc'ed arrays that are private to each CPU, thereby avoiding the costly cache coherency operations that would be required if all CPUs updated the same count and time fields. The lockstat command accumulates and sorts the per-CPU data at display time.
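
The accumulate-at-display-time step can be sketched as follows. This is an illustrative model in Python, not lockstat's actual code: the function names merge_per_cpu and sorted_by_wait, the dictionary layout, and the lock and caller names in the example are hypothetical.

```python
from collections import defaultdict


def merge_per_cpu(per_cpu):
    """Combine per-CPU tables into one table keyed by (lock, caller).

    Each CPU only ever writes its own table, so no table is shared
    while collecting; counts and totals are summed, and maxima are
    compared, only when the data is read out for display.
    """
    merged = defaultdict(
        lambda: {"attempts": 0, "wait_total": 0.0, "wait_max": 0.0}
    )
    for table in per_cpu:
        for key, rec in table.items():
            out = merged[key]
            out["attempts"] += rec["attempts"]
            out["wait_total"] += rec["wait_total"]
            out["wait_max"] = max(out["wait_max"], rec["wait_max"])
    return dict(merged)


def sorted_by_wait(merged):
    """Return (key, record) pairs, largest cumulative wait first."""
    return sorted(
        merged.items(), key=lambda kv: kv[1]["wait_total"], reverse=True
    )


# Hypothetical data from a two-CPU system.
cpu0 = {("lock_a", "caller_x"): {"attempts": 10, "wait_total": 5.0, "wait_max": 2.0}}
cpu1 = {("lock_a", "caller_x"): {"attempts": 4, "wait_total": 3.0, "wait_max": 2.5}}
for key, rec in sorted_by_wait(merge_per_cpu([cpu0, cpu1])):
    print(key, rec)
```

The design point is that the expensive step (combining) runs once, in the reader, rather than on every lock operation in the writers.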

Lockmetering attempts to provide both "cause" and "effect" information about spinlock usage. The "hold time" metering exposes which spinlocks are being held and for how long, identified by where they are held inside the kernel. The "wait-time" metering exposes the effects of these hold-times when multiple CPUs concurrently contend for the same lock.

A longer description of lockmetering can be found in a paper co-authored by Ray Bryant (IBM) and John Hawkes (SGI), to appear in the Proceedings of the 4th Annual Atlanta Linux Showcase & Conference, October 2000. This paper, "Lockmeter: Highly-Informative Instrumentation for Spin Locks in the Linux Kernel", is available in both PostScript and HTML. The conference presentation is also available as a slideshow.

Additionally, Rick Lindsley of IBM has contributed a Perl script, which he calls locksort, that sorts lockmeter output by the numeric values of a specified field. A typical use of this script is to sort lockmeter output on Contention or Utilization, which makes it quicker to identify the hot locks.
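
The idea of sorting report rows by one numeric field can be sketched as below. This is not the locksort script itself, and it does not assume lockstat's real output format: the function name sort_by_field, the whitespace-separated column layout, and the sample rows are all made up for illustration.

```python
def sort_by_field(lines, field, descending=True):
    """Sort text rows by the numeric value of one whitespace-separated field.

    `field` is a 0-based column index. A trailing '%' is ignored.
    Rows whose field is missing or not numeric (such as a header
    line) keep their original relative order and are emitted first.
    """
    def numeric(line):
        parts = line.split()
        try:
            return float(parts[field].rstrip("%"))
        except (IndexError, ValueError):
            return None

    headers = [line for line in lines if numeric(line) is None]
    rows = [line for line in lines if numeric(line) is not None]
    rows.sort(key=numeric, reverse=descending)
    return headers + rows


# Hypothetical report: utilization, contention, lock name.
report = [
    "UTIL   CON    NAME",
    "1.5%   0.2%   lock_a",
    "12.0%  3.1%   lock_b",
    "0.3%   7.9%   lock_c",
]
for line in sort_by_field(report, 1):  # hottest by contention first
    print(line)
```

Sorting on column 0 instead would rank the same rows by utilization.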