I just wanted people to know that the version of Job I plan to post
using the new pnotify version of pagg is not the jobfs variant.
The last time Job got a bunch of community feedback, people suggested using
a jobfs implementation instead of the /proc/job ioctl interface.
We implemented that.
It does work, but for certain customer situations, the overhead of the
inode operations used to control job is quite costly. Although most
customers wouldn't hit this, at least one big customer would have.
In one of the test suite tests, we fork 40,000 or more processes
to see if job suffers from a duplicate JID issue that a customer reported.
In that test case, where job controls are issued for each process at
least once, the test takes 10 minutes or more to run, compared
to less than 20 seconds with the old version. The hold-up was due to
inode operations in jobfs.
We were trying to decide which way to go -- to try to figure out if there
is a way to speed up the inode operations or just go with the tried-and-true
During this time, we found a couple other bugs that I didn't fix
because I didn't know which way we were going - jobfs or the old way.
Some bugs that will be fixed in the version of job I'm planning to post:
- Duplicate JIDs possible when process table wraps - we changed JID
computation to be based on a counter instead of a PID
- Some code that never executes was purged from job_sys_create
- A hang (locking logic error) was possible in rare situations in
- send_sig_info doesn't check for signal zero (status check) any more, so
we changed to use group_send_sig_info which requires the tasklist to be
locked during the call. The bug here was that an invalid signal
ended up being passed, which could wake up things that didn't expect
to be woken up.
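
To illustrate the duplicate JID item above, here is a minimal user-space
sketch, not the actual job code. It assumes the old scheme derived the JID
directly from the PID, and it uses a deliberately tiny PID space so the wrap
is easy to see. All names (demo_state, demo_jid_from_pid, and so on) are
hypothetical and exist only for this example. It also ignores whether the
earlier job is still alive when the PID is reused.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tiny PID space so that wraparound shows up quickly. */
#define DEMO_PID_MAX 4u

struct demo_state {
    uint32_t next_pid;    /* next PID to hand out; wraps at DEMO_PID_MAX */
    uint64_t jid_counter; /* monotonically increasing source for JIDs */
};

/* Hand out PIDs 1..DEMO_PID_MAX and then wrap, like a recycled process table. */
static uint32_t demo_alloc_pid(struct demo_state *s)
{
    uint32_t pid = s->next_pid;

    s->next_pid = (s->next_pid % DEMO_PID_MAX) + 1;
    return pid;
}

/* Assumed old scheme: the JID is taken from the creating process's PID. */
static uint64_t demo_jid_from_pid(struct demo_state *s)
{
    return (uint64_t)demo_alloc_pid(s);
}

/* Counter scheme: the JID no longer depends on which PID gets reused. */
static uint64_t demo_jid_from_counter(struct demo_state *s)
{
    (void)demo_alloc_pid(s); /* the process still receives a PID */
    return ++s->jid_counter;
}

/* Return true if any two entries in jids[0..n) are equal. */
static bool demo_has_duplicate(const uint64_t *jids, size_t n)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = i + 1; j < n; j++)
            if (jids[i] == jids[j])
                return true;
    return false;
}
```

Creating six jobs against a four-entry PID space gives a repeated JID with
the PID-based function and none with the counter-based one.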
I just wanted folks to know what was going on with the job patch.
Erik Jacobson - Linux System Software - Silicon Graphics - Eagan, Minnesota