OSR505 Signal trapping in shell scripts
Can't tell if it is or isn't, but certainly it's a _programming_ question...
I tested this in the same way: exiting the telnet client. This might be
significant.
I learned a bunch of things, can't say I know _exactly_ what's going on,
but at least there are workarounds.
The proximate cause of not removing trapint.log is that your `rm`
command is receiving a SIGHUP. Oddly, if I change it to `rm -f` then it
does not receive a SIGHUP. I verified this with both `trace` and
`truss`.
One workaround is to put at the top of exithandler():
trap "" 1 # ignore SIGHUP from now on
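A minimal sketch of where that line would sit, assuming an exit handler
shaped like the one in your test script. The file names are the ones
from this thread; the rest of the layout is my guess, not your actual
code:

```shell
#!/bin/sh
# Sketch only: exithandler() and the file names follow the thread,
# everything else about the script's layout is assumed.
exithandler() {
    trap "" 1    # ignore SIGHUP from now on; children inherit the ignore
    rm trapint.pid >> trapint.log 2>&1
}
```

Because an ignored signal stays ignored across exec, the `rm` started
from the handler runs with SIGHUP ignored as well.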
I thought this might be due to ksh's complex process handling, so I
tried with the Bourne shell. Besides changing the "#!/bin/ksh" line, I
had to change the function declarations to the "exithandler() { ..."
style. Your real program probably has other Korn-isms in it, but the
test code worked under Bourne. It also behaved exactly the same, which
was a surprise to me.
I then changed the kernel parameter SECSTOPIO to 0 on the running
kernel (live kernel brain surgery, albeit very simple):
# scodb -w
scodb> sec_stopio=0
scodb> q
With that change the problem went away. So this is an interaction with
stopio(S). `telnetd` calls stopio() on the tty you were logged in on.
This is a more persistent source of SIGHUPs than a mere hangup. If you
were running the script on a serial terminal on which you had _not_
logged in, merely opened the modem-control port and ran the script --
then the problem wouldn't exist. If you had logged in then it probably
would, because `login` or `init` would stopio() the port.
You didn't say what release of OSR5 you're using. I'm on 507. This
behavior was actually much worse in the past (502 and earlier); on those
releases, you would get a SIGHUP _every_ time you tried to access a file
descriptor on which stopio() had been called. Starting in 504, a
process only gets one SIGHUP per fd.
The question remains, what was `rm` doing to that fd which provoked a
SIGHUP? Checking the source... I see this:
if ((!fflag && isatty(0)) || iflag) {
/* then look up the localized regular expression for "yes" */
Kind of a silly code path. It knows that `rm -i`, and `rm` without "-f"
_when on an interactive terminal_, is potentially going to ask
questions. Not that it _is_, only potentially. isatty(0) is going to
end up doing ioctl()s on fd 0, which is your stopio()'d telnetd pty.
Thus, doom.
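A shell-level analogue of that isatty(0) check, purely for illustration.
`[ -t 0 ]` is the shell's own "is fd 0 a terminal" test, not what `rm`
runs internally, and stdin_is_tty is a made-up helper name:

```shell
#!/bin/sh
# Hypothetical helper: reports whether fd 0 is a terminal, the same
# question isatty(0) asks.
stdin_is_tty() {
    if [ -t 0 ]; then
        echo "tty"
    else
        echo "not a tty"
    fi
}

# With fd 0 pointed at /dev/null the answer no longer depends on the
# login tty at all.
stdin_is_tty < /dev/null    # prints "not a tty"
```

The point is only that the answer is decided by whatever fd 0 happens to
be at the time, so a command started with stdin on /dev/null never pokes
at the tty for this check.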
The workaround of doing "trap '' 1" will protect _anything_ you run out
of the exit function. You could also specifically protect `rm` by doing
rm trapint.pid >> trapint.log 2>&1 < /dev/null
... but that's much shakier: it depends on information from inside SCO's
`rm` source code that could change without your knowledge.
So...
Various workarounds (choose 1):
# at top of exit function: ignore SIGHUP
trap "" 1
# at top of exit function: close dangerous descriptors
exec </dev/null >/dev/null 2>&1
# protect just the rm command itself:
rm trapint.pid >> trapint.log 2>&1 < /dev/null
# disable stopio(S) system-wide:
cd /etc/conf/cf.d
./configure
# choose "8" ... and set SECSTOPIO to 0
# then relink, reboot
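If you want to convince yourself what the "close dangerous descriptors"
line does, here is a small sketch. detached_fds_report is a hypothetical
helper of mine; it runs the same `exec` line in a subshell (so the
caller's descriptors survive) and writes what it finds to the file you
name:

```shell
#!/bin/sh
# Hypothetical helper: apply the workaround's exec line in a subshell,
# then record whether fds 0, 1 and 2 still refer to a terminal.
detached_fds_report() {
    (
        exec 3>"$1"                        # report goes out on fd 3
        exec </dev/null >/dev/null 2>&1    # same line as the workaround
        for fd in 0 1 2; do
            if [ -t "$fd" ]; then
                echo "fd $fd: tty" >&3
            else
                echo "fd $fd: not a tty" >&3
            fi
        done
    )
}
```

Usage would be something like `detached_fds_report /tmp/fds.txt`
followed by `cat /tmp/fds.txt`. After the `exec`, nothing run from the
exit function inherits the tty on 0, 1 or 2.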
So, this turned out to be very OpenServer-specific. I understand that HP-UX
also has SecureWare stuff in it, but knowing how many years of drift we
have between the two implementations, I doubt the issues are similar.
Plus, it was enhanced by a specific silliness in OSR5 `rm`.
But... regarding that specific silliness: it's only silly because you
might have expected `rm` not to touch the tty. Contrast to some other
code you might have in your exit function:
echo Bailing! # the shell writes to stdout and is SIGHUP'd
cat exit.msg # `cat` writes to stdout and is SIGHUP'd
stty -a > trapint.exit-stty-settings # ioctl(stdin) and is SIGHUP'd
I still don't know precisely why the _shell_ saw SIGHUP twice (thus
entered traphandler() twice). I can speculate a bit, but haven't set up
the detailed tests to confirm this. I'm guessing that the shell might
not have received a SIGHUP from telnetd directly; `perl` did because it
was in the middle of an I/O when telnetd exited. So the perl statement
ended, and the shell tried to do:
echo "Shell sees result as $result"
which would write to stdout, so it gets a SIGHUP relating to the
stopio'd fd 1. Then it got into the trap handler and then the exit
handler. The exit handler ran `rm` as:
rm trapint.pid >> trapint.log 2>&1
This involves _closing_ fds 1 and 2 (in the process of reopening them to
"trapint.log"). I'm pretty sure the close(S) system call isn't
sensitive to stopio(), but suppose the shell does something else on the
way out. Maybe it flushes a stdio buffer; maybe it attempts an ioctl()
to see what sort of file it's closing. When it does these things to fd
1, nothing special happens (it's already been SIGHUP'd for fd 1). When
it does them to fd 2, kaboom.
... untested theory.