From: Pavel Mayer (pavel++at++ARTCOM.DE)
Date: 12/10/2001 02:58:09
Hi there,
we are seeing this problem too. It is a bug in the Performer libraries
that causes a semaphore undo buffer overflow: SGI Performer manipulates
System V semaphores with incorrectly paired SEM_UNDO flags. The problem
is only reported on kernel versions after 2.4.10, because Alan Cox added
code in 2.4.11 that enforces the SEM_UNDO range limit.
My wild guess is that they try a non-blocking lock acquisition with only
the IPC_NOWAIT flag set, but do the general unlocking with the SEM_UNDO
flag set, so that after 32768 lock/unlock cycles the undo counter
overflows.
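
The suspected pattern can be reproduced outside Performer. The following
is only a minimal sketch of the guess above, not Performer's actual code:
the function name mismatched_cycles and the union name sem_arg are
hypothetical, and it assumes a system with System V semaphores. It
acquires with IPC_NOWAIT only and releases with SEM_UNDO, so each cycle
moves the per-process undo adjustment one step further from zero. On a
kernel that enforces the range limit, semop() should eventually fail
with ERANGE, which is the errno 34 "Numerical result out of range" seen
in the Performer output.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* Hypothetical name; semctl() takes a union like this as 4th argument. */
union sem_arg {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/*
 * Runs up to max_cycles lock/unlock cycles with mismatched flags:
 * acquire with IPC_NOWAIT only, release with SEM_UNDO.
 * Returns the number of complete cycles before semop() failed
 * (max_cycles if it never failed), or -1 if setup failed.
 * *err receives the errno of the failure, or 0 if there was none.
 */
long mismatched_cycles(long max_cycles, int *err)
{
    struct sembuf lock   = { .sem_num = 0, .sem_op = -1, .sem_flg = IPC_NOWAIT };
    struct sembuf unlock = { .sem_num = 0, .sem_op = 1,  .sem_flg = SEM_UNDO };
    union sem_arg arg;
    long done = 0;
    int id;

    *err = 0;

    /* Private set with a single semaphore, used as a binary lock. */
    id = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
    if (id == -1) {
        *err = errno;
        return -1;
    }

    /* Start unlocked (value 1). */
    arg.val = 1;
    if (semctl(id, 0, SETVAL, arg) == -1) {
        *err = errno;
        semctl(id, 0, IPC_RMID);
        return -1;
    }

    while (done < max_cycles) {
        /* Only the release is recorded in the undo adjustment. */
        if (semop(id, &lock, 1) == -1 || semop(id, &unlock, 1) == -1) {
            *err = errno;
            break;
        }
        done++;
    }

    semctl(id, 0, IPC_RMID);   /* always remove the semaphore set */
    return done;
}
```

Calling mismatched_cycles(100000, &err) on a 2.4.11 or later kernel
should stop well before 100000 cycles with err set to ERANGE; on a kernel
that does not enforce the limit the loop would simply run to completion.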
This problem might not affect the actual locking mechanism, apart from
possible deadlocks on unexpected process termination, but Performer's
error handling mechanism may be harmful or is at least annoying.
You can use an older kernel, or compile a new kernel with an old version
of the file kernel/ipc/sem.c, to avoid the error being reported; however,
that only cures the symptom, not the cause.
If I am right, SGI will have to fix this problem by using
SEM_UNDO{|IPC_NOWAIT} when acquiring a lock and SEM_UNDO when releasing
it, and generally make sure they always use a matching SEM_UNDO flag
(either always set or always clear) on the same semaphore.
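
For comparison, here is a minimal sketch of the paired usage proposed
above, again with hypothetical names (paired_cycles, sem_arg) and no
claim about how Performer is structured internally. Because both the
acquire and the release carry SEM_UNDO, the two adjustments cancel out on
every cycle and the undo counter stays within range no matter how long
the program runs.

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* Hypothetical name; semctl() takes a union like this as 4th argument. */
union sem_arg {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/*
 * Runs lock/unlock cycles with SEM_UNDO set on BOTH operations.
 * Returns the number of complete cycles (equal to `cycles` when every
 * semop() succeeded), or -1 if the semaphore could not be set up.
 */
long paired_cycles(long cycles)
{
    struct sembuf lock   = { .sem_num = 0, .sem_op = -1,
                             .sem_flg = SEM_UNDO | IPC_NOWAIT };
    struct sembuf unlock = { .sem_num = 0, .sem_op = 1,
                             .sem_flg = SEM_UNDO };
    union sem_arg arg;
    long done = 0;
    int id;

    id = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
    if (id == -1)
        return -1;

    /* Start unlocked (value 1). */
    arg.val = 1;
    if (semctl(id, 0, SETVAL, arg) == -1) {
        semctl(id, 0, IPC_RMID);
        return -1;
    }

    while (done < cycles) {
        /* Acquire records an undo of +1, release cancels it again. */
        if (semop(id, &lock, 1) == -1 || semop(id, &unlock, 1) == -1)
            break;
        done++;
    }

    semctl(id, 0, IPC_RMID);   /* always remove the semaphore set */
    return done;
}
```

With matched flags, paired_cycles(100000) should complete all 100000
cycles, well past the point where the mismatched variant would overflow.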
It sometimes really sucks not to have the Performer source code
available: currently I am stripping the Performer 2.5 libraries of
superfluous symbols that cause exception handling to malfunction with
new compiler/libgcc versions, where a simple recompile would fix it, and
tracking down memory leaks or debugging is also often a real pain
without the sources.
Pavel
In reply to:
-----------------
From: Jason Daly [mailto:jdaly++at++ist.ucf.edu]
Sent: Saturday, 8 December 2001 01:17
To: info-performer++at++sgi.com
Subject: semaphore error
Hi, when running Perfly (or any of the Performer programs we've written)
for a certain length of time (about 10 minutes, give or take), the
program crashes with this error:
PF Fatal/Internal: : Numerical result out of range
PF Warning/SysErr(34): Error posting semaphore 248011 [9] in pid 2013
PF Fatal/Internal: PF_USPSEMA() errno = 0
(the semaphore and pid numbers are obviously different each time). This
happens whether or not I have set the _PF_SEMA_NEW_PARADIGM variable to
0.
[more text deleted]