From owner-lockmeter@oss.sgi.com Tue Sep 12 11:19:22 2000
Date: Tue, 12 Sep 2000 11:17:38 -0700 (PDT)
From: John Hawkes
Message-Id: <200009121817.LAA91081@babylon.engr.sgi.com>
To: lockmeter@oss.sgi.com
Subject: lockmeter/lockstat version 1.4 available

Linux Lockmeter/Lockstat version 1.4 is now available, supporting the
i386 and mips64 architectures.  Unfortunately, the Alpha support is
currently broken (hopefully temporarily).  The patchset is against
kernel version 2.4.0-test7.  See http://oss.sgi.com/projects/lockmeter
for more information.

John Hawkes (hawkes@engr.sgi.com)


From owner-lockmeter@oss.sgi.com Fri Sep 22 16:00:05 2000
Message-ID: <014301c024e8$8dfdb180$6401a8c0@marin1.sfba.home.com>
From: "John Hawkes"
To: lockmeter@oss.sgi.com
Subject: Lockmeter version 1.4.2 adds ia64 support
Date: Fri, 22 Sep 2000 15:57:59 -0700

Lockmeter version 1.4.2 is now available at
http://oss.sgi.com/projects/lockmeter as a patch against the 2.4.0-test7
kernel.  It adds tentative support for the ia64 architecture (after the
large
http://www.kernel.org/pub/linux/ports/ia64/linux-2.4.0-test7-ia64/000823.diff.gz
patch has been applied to 2.4.0-test7), in addition to the existing
support for the i386 and mips64 architectures.  The Alpha support is
probably still regrettably broken (and likely does not even compile).

If the lockmeter-1.4.2 patch is applied against unadulterated
2.4.0-test7 kernel sources (i.e., without first applying that large
ia64 patch), the patch fails against include/asm-ia64/spinlock.h.  This
failure can be ignored for the non-ia64 architectures.  (If this patch
failure upsets too many people, I could break the lockmeter-1.4.2 patch
into two parts, although I prefer to keep it as a single unified
patchfile.)

I term the ia64 support "tentative" in the sense that I have not yet
tested the resulting ia64 SMP kernel on SMP hardware (although I would
be surprised if it didn't work), and there is still a small timing
window in the lockmetering rwlock code that may cause slightly
inaccurate statistics for read locks.  The kernel functionality is
nonetheless correct.

John Hawkes
hawkes@engr.sgi.com


From owner-lockmeter@oss.sgi.com Wed Sep 27 08:40:03 2000
Message-ID: <39D2147B.42BADA99@zk3.dec.com>
Date: Wed, 27 Sep 2000 11:38:36 -0400
From: Peter Rival
Organization: Tru64 QMG Performance Engineering
MIME-Version: 1.0
To: John Hawkes, lockmeter@oss.sgi.com
Cc: ezolt@perf.zko.dec.com
Subject: Re: Lockmeter 1.4.2 for i386, ia64, mips64, and Alpha
References: <014e01c024e9$105bb780$6401a8c0@marin1.sfba.home.com>
Content-Type: multipart/mixed; boundary="------------018661F9373187292F74591D"

This is a multi-part message in MIME format.
--------------018661F9373187292F74591D
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

I'm attaching a diff that gets everything working on my 8-CPU
Wildfire^WAlpha GS system (sans some things I may not have noticed in
the initial runs).  This is _just_ the Alpha-specific stuff, and it's
against a clean 2.4.0-test8 tree.  I'll send out a diff relative to the
1.4.2 patch Real Soon Now (TM).  Just remove any changes the original
1.4.2 patch makes to the files listed in this diff and then apply this
one (hopefully that'll work ;).  And yes, this finally stops adding
that stupid change in pgtable.h. :)

 - Pete

John Hawkes wrote:

> Lockmeter version 1.4.2 is now available at
> http://oss.sgi.com/projects/lockmeter as a patch against the 2.4.0-test7
> kernel.  It adds tentative support for the ia64 architecture (after the
> large
> http://www.kernel.org/pub/linux/ports/ia64/linux-2.4.0-test7-ia64/000823.diff.gz
> patch has been applied to 2.4.0-test7), in addition to the existing
> support for the i386 and mips64 architectures.  The Alpha support is
> probably still regrettably broken (and likely does not even compile).
>
> If the lockmeter-1.4.2 patch is applied against unadulterated
> 2.4.0-test7 kernel sources (i.e., without first applying that large
> ia64 patch), the patch fails against include/asm-ia64/spinlock.h.  This
> failure can be ignored for the non-ia64 architectures.  (If this patch
> failure upsets too many people, I could break the lockmeter-1.4.2 patch
> into two parts, although I prefer to keep it as a single unified
> patchfile.)
>
> I term the ia64 support "tentative" in the sense that I have not yet
> tested the resulting ia64 SMP kernel on SMP hardware (although I would
> be surprised if it didn't work), and there is still a small timing
> window in the lockmetering rwlock code that may cause slightly
> inaccurate statistics for read locks.  The kernel functionality is
> nonetheless correct.
>
> John Hawkes
> hawkes@engr.sgi.com

--------------018661F9373187292F74591D
Content-Type: text/plain; charset=us-ascii; name="diff"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="diff"

diff -Naur linux-base/arch/alpha/config.in linux-lockmeter/arch/alpha/config.in
--- linux-base/arch/alpha/config.in     Wed Sep 27 15:12:19 2000
+++ linux-lockmeter/arch/alpha/config.in        Mon Sep 25 18:21:44 2000
@@ -350,4 +350,8 @@
   bool 'Legacy kernel start address' CONFIG_ALPHA_LEGACY_START_ADDRESS
 
+if [ "$CONFIG_SMP" = "y" ]; then
+   bool 'Kernel Lock Metering' CONFIG_LOCKMETER
+fi
+
 endmenu
 
diff -Naur linux-base/include/asm-alpha/lockmeter.h linux-lockmeter/include/asm-alpha/lockmeter.h
--- linux-base/include/asm-alpha/lockmeter.h    Wed Dec 31 19:00:00 1969
+++ linux-lockmeter/include/asm-alpha/lockmeter.h       Wed Sep 27 13:51:42 2000
@@ -0,0 +1,90 @@
+/*
+ *  Written by John Hawkes (hawkes@sgi.com)
+ *  Based on klstat.h by Jack Steiner (steiner@sgi.com)
+ *
+ *  Modified by Peter Rival (frival@zk3.dec.com)
+ */
+
+#ifndef _ALPHA_LOCKMETER_H
+#define _ALPHA_LOCKMETER_H
+
+#include <asm/hwrpb.h>
+#define CPU_CYCLE_FREQUENCY    hwrpb->cycle_freq
+
+#define get_cycles64()         get_cycles()
+
+#include <linux/version.h>
+#if LINUX_VERSION_CODE < KERNEL_VERSION(2,3,0)
+#define local_irq_save(x) \
+       __save_and_cli(x)
+#define local_irq_restore(x) \
+       __restore_flags(x)
+#endif /* Linux version 2.2.x */
+
+#define SPINLOCK_MAGIC_INIT    /**/
+
+/*
+ * Macros to cache and retrieve an index value inside of a lock.
+ * These macros assume that there are fewer than 65536 simultaneous
+ * (read-mode) holders of a rwlock.
+ * We also assume that the hash table has fewer than 32767 entries.
+ * The high-order bit is used for write-locking a rw_lock.
+ * Note: although these defines and macros are the same as those used
+ * in include/asm-i386/lockmeter.h, they are repeated here to easily
+ * allow an alternate Alpha implementation.
+ */
+/*
+ * instrumented spinlock structure -- never used to allocate storage
+ * only used in macros below to overlay a spinlock_t
+ */
+typedef struct inst_spinlock_s {
+       /* remember, Alpha is little endian */
+       unsigned short lock;
+       unsigned short index;
+} inst_spinlock_t;
+#define PUT_INDEX(lock_ptr,indexv)  ((inst_spinlock_t *)(lock_ptr))->index = indexv
+#define GET_INDEX(lock_ptr)         ((inst_spinlock_t *)(lock_ptr))->index
+
+/*
+ * macros to cache and retrieve an index value in a read/write lock
+ * as well as the cpu where a reader busy period started
+ * we use the 2nd word (the debug word) for this, so require the
+ * debug word to be present
+ */
+/*
+ * instrumented rwlock structure -- never used to allocate storage
+ * only used in macros below to overlay a rwlock_t
+ */
+typedef struct inst_rwlock_s {
+       volatile int lock;
+       unsigned short index;
+       unsigned short cpu;
+} inst_rwlock_t;
+#define PUT_RWINDEX(rwlock_ptr,indexv) ((inst_rwlock_t *)(rwlock_ptr))->index = indexv
+#define GET_RWINDEX(rwlock_ptr)        ((inst_rwlock_t *)(rwlock_ptr))->index
+#define PUT_RW_CPU(rwlock_ptr,cpuv)    ((inst_rwlock_t *)(rwlock_ptr))->cpu = cpuv
+#define GET_RW_CPU(rwlock_ptr)         ((inst_rwlock_t *)(rwlock_ptr))->cpu
+
+/*
+ * return true if rwlock is write locked
+ * (note that other lock attempts can cause the lock value to be negative)
+ */
+#define RWLOCK_IS_WRITE_LOCKED(rwlock_ptr) (((inst_rwlock_t *)rwlock_ptr)->lock & 1)
+#define IABS(x)                            ((x) > 0 ? (x) : -(x))
+
+#define RWLOCK_READERS(rwlock_ptr)  rwlock_readers(rwlock_ptr)
+extern inline int rwlock_readers(rwlock_t *rwlock_ptr)
+{
+       int tmp = (int) ((inst_rwlock_t *)rwlock_ptr)->lock;
+       /* readers subtract 2, so we have to:           */
+       /*      - andnot off a possible writer (bit 0)  */
+       /*      - get the absolute value                */
+       /*      - divide by 2 (right shift by one)      */
+       /* to find the number of readers                */
+       if (tmp == 0)
+               return(0);
+       else
+               return(IABS(tmp & ~1)>>1);
+}
+
+#endif /* _ALPHA_LOCKMETER_H */
+
diff -Naur linux-base/include/asm-alpha/spinlock.h linux-lockmeter/include/asm-alpha/spinlock.h
--- linux-base/include/asm-alpha/spinlock.h     Fri Feb 25 01:36:05 2000
+++ linux-lockmeter/include/asm-alpha/spinlock.h        Wed Sep 27 14:43:13 2000
@@ -5,8 +5,8 @@
 #include
 #include
 
-#define DEBUG_SPINLOCK 1
-#define DEBUG_RWLOCK 1
+#define DEBUG_SPINLOCK 0
+#define DEBUG_RWLOCK 0
 
 /*
  * Simple spin lock operations.  There are two variants, one clears IRQ's
@@ -58,6 +58,7 @@
               (LOCK)->lock ? "taken" : "freed", (LOCK)->on_cpu); \
       } while (0)
 #else
+#ifndef CONFIG_LOCKMETER
 static inline void spin_unlock(spinlock_t * lock)
 {
       mb();
@@ -89,20 +90,88 @@
 #define spin_trylock(lock) (!test_and_set_bit(0,(lock)))
 #define spin_lock_own(LOCK, LOCATION)  ((void)0)
-#endif /* DEBUG_SPINLOCK */
+
+#else  /* CONFIG_LOCKMETER */
+extern void _spin_lock_(spinlock_t *lock_ptr);
+# define spin_lock(lock)       _spin_lock_(lock)
+static inline void nonmetered_spin_lock(spinlock_t *lock) {
+       long tmp;
+       /* Use sub-sections to put the actual loop at the end
+          of this object file's text section so as to perfect
+          branch prediction.
+       */
+       __asm__ __volatile__(
+       "1:     ldl_l   %0,%1\n"
+       "       blbs    %0,2f\n"
+       "       or      %0,1,%0\n"
+       "       stl_c   %0,%1\n"
+       "       beq     %0,2f\n"
+       "       mb\n"
+       ".section .text2,\"ax\"\n"
+       "2:     ldl     %0,%1\n"
+       "       blbs    %0,2b\n"
+       "       br      1b\n"
+       ".previous"
+       : "=r" (tmp), "=m" (__dummy_lock(lock))
+       : "m"(__dummy_lock(lock)));
+}
+
+extern void _spin_unlock_(spinlock_t *lock_ptr);
+#define spin_unlock(lock)      _spin_unlock_(lock)
+static inline void nonmetered_spin_unlock(spinlock_t *lock) {
+       mb();
+       lock->lock = 0;
+}
+extern __inline__ int _spin_trylock_(spinlock_t *);
+#define spin_trylock(lock)              _spin_trylock_(lock)
+#define nonmetered_spin_trylock(lock)   (!test_and_set_bit(0,(lock)))
+#define nonmetered_spin_testlock(lock)  test_bit(0,(lock))
+#endif /* !CONFIG_LOCKMETER */
+#endif /* DEBUG_SPINLOCK */
 
 /***********************************************************/
 
 typedef struct {
       volatile int write_lock:1, read_counter:31;
+#if defined(CONFIG_LOCKMETER)
+       /* required for LOCKMETER since all bits in lock are used */
+       /* need this storage for CPU and lock INDEX ............. */
+       unsigned magic;
+#endif
 } /*__attribute__((aligned(32)))*/ rwlock_t;
 
-#define RW_LOCK_UNLOCKED (rwlock_t) { 0, 0 }
+#if defined CONFIG_LOCKMETER
+#define RWLOCK_MAGIC_INIT      , 0
+#else
+#define RWLOCK_MAGIC_INIT      /* */
+#endif
+
+#define RW_LOCK_UNLOCKED (rwlock_t) { 0, 0 RWLOCK_MAGIC_INIT }
 
 #if DEBUG_RWLOCK
 extern void write_lock(rwlock_t * lock);
 extern void read_lock(rwlock_t * lock);
-#else
+static inline void write_unlock(rwlock_t * lock)
+{
+       mb();
+       *(volatile int *)lock = 0;
+}
+
+static inline void read_unlock(rwlock_t * lock)
+{
+       long regx;
+       __asm__ __volatile__(
+       "1:     ldl_l   %1,%0\n"
+       "       addl    %1,2,%1\n"
+       "       stl_c   %1,%0\n"
+       "       beq     %1,6f\n"
+       ".subsection 2\n"
+       "6:     br      1b\n"
+       ".previous"
+       : "=m" (__dummy_lock(lock)), "=&r" (regx)
+       : "m" (__dummy_lock(lock)));
+}
+#else  /* DEBUG_RWLOCK */
+#ifndef CONFIG_LOCKMETER
 static inline void write_lock(rwlock_t * lock)
 {
       long regx;
@@ -144,7 +213,6 @@
       : "m" (__dummy_lock(lock))
       );
 }
-#endif /* DEBUG_RWLOCK */
 
 static inline void write_unlock(rwlock_t * lock)
 {
@@ -166,5 +234,120 @@
       : "=m" (__dummy_lock(lock)), "=&r" (regx)
       : "m" (__dummy_lock(lock)));
 }
+# else /* CONFIG_LOCKMETER */
+
+extern void _read_lock_(rwlock_t *rw);
+#define read_lock(rw)          _read_lock_(rw)
+extern void _read_unlock_(rwlock_t *rw);
+#define read_unlock(rw)        _read_unlock_(rw)
+extern void _write_lock_(rwlock_t *rw);
+#define write_lock(rw)         _write_lock_(rw)
+extern void _write_unlock_(rwlock_t *rw);
+#define write_unlock(rw)       _write_unlock_(rw)
+extern int _write_trylock_(rwlock_t *rw);
+#define write_trylock(rw)      _write_trylock_(rw)
+
+static inline void nonmetered_write_lock(rwlock_t * lock)
+{
+       long regx;
+
+       __asm__ __volatile__(
+       "1:     ldl_l   %1,%0\n"
+       "       bne     %1,6f\n"
+       "       or      $31,1,%1\n"
+       "       stl_c   %1,%0\n"
+       "       beq     %1,6f\n"
+       "       mb\n"
+       ".subsection 2\n"
+       "6:     ldl     %1,%0\n"
+       "       bne     %1,6b\n"
+       "       br      1b\n"
+       ".previous"
+       : "=m" (__dummy_lock(lock)), "=&r" (regx)
+       : "0" (__dummy_lock(lock))
+       );
+}
+
+static inline void nonmetered_read_lock(rwlock_t * lock)
+{
+       long regx;
+
+       __asm__ __volatile__(
+       "1:     ldl_l   %1,%0\n"
+       "       blbs    %1,6f\n"
+       "       subl    %1,2,%1\n"
+       "       stl_c   %1,%0\n"
+       "       beq     %1,6f\n"
+       "4:     mb\n"
+       ".subsection 2\n"
+       "6:     ldl     %1,%0\n"
+       "       blbs    %1,6b\n"
+       "       br      1b\n"
+       ".previous"
+       : "=m" (__dummy_lock(lock)), "=&r" (regx)
+       : "m" (__dummy_lock(lock))
+       );
+}
+
+static inline void nonmetered_write_unlock(rwlock_t * lock)
+{
+       mb();
+       *(volatile int *)lock = 0;
+}
+
+static inline void nonmetered_read_unlock(rwlock_t * lock)
+{
+       long regx;
+       __asm__ __volatile__(
+       "1:     ldl_l   %1,%0\n"
+       "       addl    %1,2,%1\n"
+       "       stl_c   %1,%0\n"
+       "       beq     %1,6f\n"
+       ".subsection 2\n"
+       "6:     br      1b\n"
+       ".previous"
+       : "=m" (__dummy_lock(lock)), "=&r" (regx)
+       : "m" (__dummy_lock(lock)));
+}
+
+static __inline__ int nonmetered_write_trylock(rwlock_t *lock)
+{
+       long temp, result;
+
+       __asm__ __volatile__(
+       "       ldl_l   %1,%0\n"
+       "       mov     $31,%2\n"
+       "       bne     %1,1f\n"
+       "       or      $31,1,%2\n"
+       "       stl_c   %2,%0\n"
+       "1:     mb\n"
+       : "=m" (__dummy_lock(lock)), "=&r" (temp), "=&r" (result)
+       : "m" (__dummy_lock(lock))
+       );
+
+       return (result);
+}
+
+extern __inline__ long nonmetered_read_trylock(rwlock_t *lock)
+{
+       long temp, result;
+
+       __asm__ __volatile__(
+       "       ldl_l   %1,%0\n"
+       "       mov     $31,%2\n"
+       "       blbs    %1,1f\n"
+       "       subl    %1,2,%2\n"
+       "       stl_c   %2,%0\n"
+       "1:     mb\n"
+       : "=m" (__dummy_lock(lock)), "=&r" (temp), "=&r" (result)
+       : "m" (__dummy_lock(lock))
+       );
+
+       return (result);
+}
+
+#endif /* !CONFIG_LOCKMETER */
+#endif /* DEBUG_RWLOCK */
 
 #endif /* _ALPHA_SPINLOCK_H */

--------------018661F9373187292F74591D--
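
The lockmeter.h half of the diff above is mostly bookkeeping glue: it
overlays a small struct on each lock word so that a lockstat hash-table
index can be cached inside the lock itself, and it derives the current
reader count from the rwlock value (bit 0 marks a writer, each reader
subtracts 2, so the value goes negative while readers hold the lock).
The following stand-alone sketch is an illustration only -- user-space C
with invented demo_* names, not part of the patch -- but it mirrors the
PUT_INDEX/GET_INDEX overlay and the rwlock_readers() arithmetic so they
can be checked in isolation:

/*
 * Editor's stand-alone sketch (invented demo_* names; not lockmeter
 * source) of the index-overlay and reader-count logic defined in
 * include/asm-alpha/lockmeter.h above.  Builds with any C compiler.
 */
#include <assert.h>
#include <stdio.h>

/* Overlay view of a spinlock word: the low halfword is the lock itself,
 * the high halfword caches a lockstat hash-table index. */
typedef struct {
        unsigned short lock;
        unsigned short index;
} demo_spinlock_t;

#define DEMO_PUT_INDEX(p, i)    ((p)->index = (i))
#define DEMO_GET_INDEX(p)       ((p)->index)

/* rwlock word convention: bit 0 marks a writer, each reader subtracts 2. */
typedef struct {
        volatile int lock;
} demo_rwlock_t;

#define DEMO_IS_WRITE_LOCKED(rw)        ((rw)->lock & 1)
#define IABS(x)                         ((x) > 0 ? (x) : -(x))

static int demo_rwlock_readers(demo_rwlock_t *rw)
{
        int tmp = rw->lock;
        /* mask off a possible writer bit, take the absolute value,
         * and divide by two -- exactly as rwlock_readers() does */
        return tmp == 0 ? 0 : IABS(tmp & ~1) >> 1;
}

int main(void)
{
        demo_spinlock_t sl = { 0, 0 };
        demo_rwlock_t rw = { 0 };

        DEMO_PUT_INDEX(&sl, 42);        /* cache a hash index in the lock word */
        assert(DEMO_GET_INDEX(&sl) == 42);

        rw.lock = -6;                   /* three readers: 3 * (-2) */
        assert(demo_rwlock_readers(&rw) == 3);
        assert(!DEMO_IS_WRITE_LOCKED(&rw));

        rw.lock = 1;                    /* write-locked: bit 0 set, no readers */
        assert(demo_rwlock_readers(&rw) == 0);
        assert(DEMO_IS_WRITE_LOCKED(&rw));

        printf("overlay and reader-count checks passed\n");
        return 0;
}

Running it should simply print "overlay and reader-count checks passed".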
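
The spinlock.h hunks all follow the same pattern: the original inline
primitives are renamed nonmetered_*, and spin_lock(), read_lock(),
write_lock(), and friends become macros that call out-of-line
_spin_lock_()-style routines supplied by the generic lockmeter code
(which is not part of this Alpha-only diff).  Below is a minimal
user-space sketch of that wrapper shape; demo_cycles() and total_wait
are invented stand-ins for get_cycles64() and the real lockstat
bookkeeping, so treat it as an illustration of the pattern rather than
the actual implementation:

/*
 * Editor's sketch of the metered-wrapper pattern: time the acquire,
 * call the renamed non-metered primitive, then attribute the wait.
 * All demo_* names and total_wait are invented; user-space C only.
 */
#include <stdio.h>
#include <time.h>

typedef struct { volatile int lock; } demo_spinlock_t;

static unsigned long long demo_cycles(void)     /* stand-in for get_cycles64() */
{
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        return (unsigned long long)ts.tv_sec * 1000000000ull + ts.tv_nsec;
}

/* the "renamed original" primitives (trivially uncontended here) */
static void nonmetered_spin_lock(demo_spinlock_t *l)
{
        while (__sync_lock_test_and_set(&l->lock, 1))
                ;       /* spin */
}
static void nonmetered_spin_unlock(demo_spinlock_t *l)
{
        __sync_lock_release(&l->lock);
}

static unsigned long long total_wait;   /* stand-in for the lockstat tables */

/* metered wrapper, analogous to the extern _spin_lock_() in the patch */
static void _spin_lock_(demo_spinlock_t *l)
{
        unsigned long long t0 = demo_cycles();
        nonmetered_spin_lock(l);
        total_wait += demo_cycles() - t0;       /* attribute the spin time */
}

#define spin_lock(l)    _spin_lock_(l)
#define spin_unlock(l)  nonmetered_spin_unlock(l)

int main(void)
{
        demo_spinlock_t l = { 0 };

        spin_lock(&l);
        spin_unlock(&l);
        printf("recorded %llu ns of lock wait\n", total_wait);
        return 0;
}

Compiling and running it prints a (tiny) accumulated wait time, which is
all the pattern needs to demonstrate: the macro indirection lets the
metering be switched on or off per-architecture without touching callers.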