[PATCH] xfsdump: allow system() to obtain exit status

To: xfs@xxxxxxxxxxx
Subject: [PATCH] xfsdump: allow system() to obtain exit status
From: Bill Kendall <wkendall@xxxxxxx>
Date: Wed, 11 Jan 2012 15:07:53 -0600
Cc: Bill Kendall <wkendall@xxxxxxx>
In-reply-to: <CAGdb-8fyseE+ARMUWaQYcx1VCnmph=xL7XeTBUXg2wSSdF_7hw@xxxxxxxxxxxxxx>
References: <CAGdb-8fyseE+ARMUWaQYcx1VCnmph=xL7XeTBUXg2wSSdF_7hw@xxxxxxxxxxxxxx>

xfsdump explicitly ignores SIGCHLD in order to prevent librmt rsh
processes from becoming zombies. However, doing so prevents system()
from determining a command's exit status.

Setting up a handler for SIGCHLD will not work either, since xfsdump is
now multi-threaded and the main thread (which handles signals) might
handle a child exit before the thread running system() can.

I also attempted to use waitpid() when tearing down a librmt session,
but this can block indefinitely if there is a problem on the remote
side. (Using WNOHANG instead tended never to catch the exit.)

In the end, I settled on just not touching SIGCHLD at all. There may be
a zombie rsh when librmt is used, but typically it will be alive until
the end of the backup and in any case will be cleaned up when
xfsdump/restore exits.

Signed-off-by: Bill Kendall <wkendall@xxxxxxx>
---
 common/main.c |   11 +++++++----
 1 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/common/main.c b/common/main.c
index 5880723..c9a311b 100644
--- a/common/main.c
+++ b/common/main.c
@@ -507,6 +507,13 @@ main( int argc, char *argv[] )
         * want to exit when a signal is received. otherwise, hold signals so
         * they don't interfere with sys calls; they will be released at
         * pre-emption points and upon pausing in the main loop.
+        *
+        * note that since we're multi-threaded, handling SIGCHLD causes
+        * problems with system()'s ability to obtain a child's exit status
+        * (because the main thread may process SIGCHLD before the thread
+        * running system() calls waitpid()). likewise explicitly ignoring
+        * SIGCHLD also prevents system() from getting an exit status.
+        * therefore we don't do anything with SIGCHLD.
         */
 
        sigfillset(&sa.sa_mask);
@@ -514,13 +521,9 @@ main( int argc, char *argv[] )
 
        /* always ignore SIGPIPE, instead handle EPIPE as part
         * of normal sys call error handling.
-        *
-        * explicitly ignore SIGCHLD so that if librmt rsh sessions
-        * exit early they do not become zombies.
         */
        sa.sa_handler = SIG_IGN;
        sigaction( SIGPIPE, &sa, NULL );
-       sigaction( SIGCHLD, &sa, NULL );
 
        if ( ! pipeline ) {
                sigset_t blocked_set;
-- 
1.7.0.4
