
re[2]: Snapshot regression test [WAS: re[6]: Summary - Snapshot Effort]

To: Michael Best <mbest@xxxxxxxxxxxxx>
Subject: re[2]: Snapshot regression test [WAS: re[6]: Summary - Snapshot Effort]
From: Greg Freemyer <freemyer@xxxxxxxxxxxxxxxxx>
Date: Tue, 27 Aug 2002 18:42:24 -0400
Cc: Nathan Scott <nathans@xxxxxxx>, <linux-xfs@xxxxxxxxxxx>
Organization: The NorcrossGroup
Sender: linux-xfs-bounce@xxxxxxxxxxx
Michael & Danny (& Nathan when the sun starts shining down there :),

Thanks for the ideas, but this seems to be way more complicated than it should 
be.  I just don't really know where to put in the logic you each recommended.

FYI: I have been writing shell scripts for many years on an as-needed basis,
but this one is at the very edge of my skills.

First, if the "xfs test harness" ran the test scripts in the standard
non-interactive mode, this issue would not come up at all.  Unfortunately, from
my perspective, it somehow invokes the scripts in interactive mode, i.e. the
shell notification messages I'm having problems with only occur in interactive
shells.  Non-interactive shells simply don't have these messages.
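
One way to check that diagnosis from inside a script is to look at $-, which
POSIX shells populate with an "i" when they are interactive.  A minimal sketch
(is_interactive is a hypothetical helper name, not part of the xfs test
harness):

```shell
#!/bin/sh
# Report whether the current shell is interactive.
# POSIX shells include "i" in $- when they are running interactively.
is_interactive()
{
    case $- in
        *i*) return 0 ;;
        *)   return 1 ;;
    esac
}

if is_interactive; then
    echo "interactive shell: job notifications may be printed"
else
    echo "non-interactive shell"
fi
```

Sourcing this with ". file" from a login prompt should take the first branch,
while running it as "sh file" should take the second, which would at least
show which mode a given invocation really uses.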

If there is a way to have my script run as non-interactive, then the whole 
problem goes away.  (I just figured that out in the last hour or so.)

Ignoring that possibility, I have created as simple a script as I could to
show the problem.

It is below my signature.  (If anyone has a better way to do a timeout, I'm all 
ears.  I've never done one in shell code before.)
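
On the timeout question, one common pattern is a watchdog subshell that sleeps
and then kills the command if it is still running.  The following is only a
sketch under assumptions: it targets a POSIX sh, run_with_timeout is a
hypothetical helper name, and the 124 return value is an arbitrary choice made
here.  As noted in the script comments, a plain kill may have no effect on a
genuinely hung process, so this only models the simulated case.

```shell
#!/bin/sh
# run_with_timeout SECONDS COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND in the background and kills it if it
# has not finished after SECONDS.  Returns the command's exit status, or
# 124 if the watchdog fired (124 is an arbitrary choice for this sketch).
run_with_timeout()
{
    limit=$1
    shift

    "$@" &
    cmd_pid=$!

    # Watchdog: sleep, then kill the command if it is still there.
    # The TERM trap lets the parent cancel the watchdog and its sleep.
    (
        trap 'kill "$sleep_pid" 2>/dev/null; exit' TERM
        sleep "$limit" &
        sleep_pid=$!
        wait "$sleep_pid"
        kill "$cmd_pid" 2>/dev/null
    ) &
    watchdog_pid=$!

    wait "$cmd_pid" 2>/dev/null
    rc=$?

    # Cancel the watchdog; harmless if it has already exited.
    kill "$watchdog_pid" 2>/dev/null
    wait "$watchdog_pid" 2>/dev/null

    # A status above 128 means the command died from a signal, which in
    # this sketch is taken to mean the watchdog's kill.
    if [ "$rc" -gt 128 ]; then
        return 124
    fi
    return "$rc"
}
```

For example, "run_with_timeout 10 sleep 5000" would model the lockup case and
"run_with_timeout 10 sleep 5" the success case.  Note that the pid of the
inner sleep never has to leave the watchdog subshell, because the trap cleans
it up there.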

Details:

I have resolved the specific dd loop issue by changing it from a "while true"
to "while $RUNNING" and resetting the RUNNING variable in my cleanup logic.

I now only have problems with the timeout subshell I'm creating, but the
problem is very similar.

In the script, you will see the first thing you have to do is choose the
scenario you are trying to model: snapshot success or hang.

To run the script use the ". test_script" syntax.  This runs it interactively 
like the xfs test harness does.  

Warning: This script kills your current shell, so start a fresh subshell each
time you want to run test_script.

In my normal shell, with a success I get output like:
[1]-  Done                    ( sleep $SIMULATED_SNAPSHOT_DELAY )
Snapshot success
cleanup occurs here

With a simulated lockup I get:
[1]+  Done                    sleep 10
snapshot creation lockup
cleanup occurs here

Unfortunately, the shell notifications that occur inside the xfs test
harness have the pid instead of the subshell instance #.

What I need to do is get rid of those Done messages, or get them to be
consistent, i.e. without a pid that changes on every invocation.
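
One commonly used way to keep such notices out of the output is to reap the
background job explicitly and discard stderr while doing so.  This is a sketch
under assumptions, not a confirmed fix: it assumes the harness shell prints
the notice at the moment it reaps the job, and quiet_kill is a hypothetical
helper name.

```shell
#!/bin/sh
# quiet_kill PID...
# Hypothetical helper: terminate background children and reap them with
# stderr discarded, so that any notice the shell prints while reaping
# does not reach the test output.
quiet_kill()
{
    for pid in "$@"; do
        kill "$pid" 2>/dev/null
        wait "$pid" 2>/dev/null
    done
    return 0
}

# Stand-in for the timeout timer in the sample script.
sleep 60 &
timer_pid=$!

quiet_kill "$timer_pid"
echo "timer cancelled"
```

If the notice still shows up, the shell is probably reporting it somewhere
other than at the wait, and a different approach would be needed.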


Greg Freemyer
Internet Engineer
Deployment and Integration Specialist
Compaq ASE - Tru64 v4, v5
Compaq Master ASE - SAN Architect
The Norcross Group
www.NorcrossGroup.com

==== Sample script to show problem
#!/bin/sh

# Choose one of the below based on whether you are testing snapshot success
# or lockup

#SIMULATED_SNAPSHOT_DELAY=5                # simulated success
SIMULATED_SNAPSHOT_DELAY=5000              # simulated lockup

status=1    #default to failure

_cleanup()
{
echo cleanup occurs here
trap 0 1 2 3 15
exit $status
}

trap "_cleanup" 0 1 2 3 15

# Start of real code

sleep $SIMULATED_SNAPSHOT_DELAY &
SNAPSHOT_pid=$!

(
sleep 10 &               #This is my timeout for lvcreate to finish
TIMERpid=$!              # Save my pid, so I can be cancelled
wait $TIMERpid
echo snapshot creation lockup
# xfs_freeze -u /scratch # This will allow the lvcreate to run to completion
kill $SNAPSHOT_pid       # For this test script, just kill the sleep, but the
                         # kill has no effect on the real hung process.
kill $$                  # Terminate this whole test
) &
TIMER_shell_pid=$!       # Save the whole subshell's pid, so it can be cancelled

wait $SNAPSHOT_pid

kill $TIMER_shell_pid    # cancel the timeout; TIMERpid is set only inside
                         # the subshell, so it is empty out here

echo Snapshot success
status=0     # success
exit

