
The subshell created with parentheses does not use an execve() call for the new process; calling
the script does. At this point the variables from the parent shell are handled differently:
execve() passes a deliberate set of variables (the script-calling case), while not calling execve()
(the parentheses case) leaves the complete set of variables intact.
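
The difference can also be reproduced without strace. The following is a minimal sketch, assuming a POSIX sh is on the PATH; sh -c stands in here for the executable script x.sh, since both are started through execve(). The variable names in_subshell and in_new_shell are only illustrative.

```shell
# v is a plain shell variable: set, but not exported.
unset v
v=15

# Parentheses: the child is a fork of this shell, no execve(), so v survives.
in_subshell=$( (echo "${v-unset}") )

# New program: fork plus execve(); only exported variables are passed on.
in_new_shell=$(sh -c 'echo "${v-unset}"')

echo "subshell:  $in_subshell"
echo "new shell: $in_new_shell"
```

The first line reports 15, the second reports unset, which matches what the two traces show.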

Your probing using strace should have shown exactly that difference; if you did not see it, I can
only assume that you made one of several possible mistakes. I will just strip down what I did to
show the difference, then you can decide for yourself where your error was.

I created two traces. The first was done using


strace -f -o bash-mystery-1.strace bash -c 'v=15; (echo $v)'
and the second was done using
strace -f -o bash-mystery-2.strace bash -c 'v=15; ./x.sh'
(with x.sh being an executable script.)

Option -f was necessary to trace the children of the parent shell (the bash in the command line).

These traces I compared using diff -y -W 300 after equalizing all typical and frequent differences
like addresses and the PIDs:
q() {
sed -e 's/0x[0-9a-f]*/ADDR/g' \
-e 's/12923\|12927/PARENT/g' \
-e 's/12924\|12928/CHILD/g'
}
diff -y -W 300 <(q < bash-mystery-1.strace) <(q < bash-mystery-2.strace) | less -S
12923 and 12927 were my parent PIDs and 12924 and 12928 were my child PIDs (which I found
out by scanning through the trace files). You will of course have different numbers, so adjust
these. And you will need a very wide terminal (more than 200 characters) to view the diff output
properly. So make your window wide ;-)

Around line 140 I find a clone() call which is more or less a fork(), so it splits the current process
into two. Around there the CHILD starts doing things, as we see in the following lines in the trace.
Around line 165 I then see the call of execve(), but only in the trace of the case which calls the
script, so there the child voluntarily gives up a lot of its environment and sets a deliberate one. The
parentheses case does not change its environment (it does not call execve()), so the child process
continues to have the full set.

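You have to export your variable for it to reach a child process:


export var=15
Once exported, the variable is passed to every child process with the value it has at launch time
(not at export time). So
var=15
export var
is the same as
export var
var=15
which is the same as
export var=15
The export can be cancelled with unset, which removes the variable altogether. Example: unset var.
(In bash, export -n var drops only the export attribute and keeps the variable.)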
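
A short sketch of these rules, assuming a POSIX sh is available as the child process. The names before, after, latest, and gone are illustrative only.

```shell
unset var
var=15
before=$(sh -c 'echo "${var-unset}"')   # not exported yet: child sees nothing

export var
after=$(sh -c 'echo "${var-unset}"')    # exported: child sees 15

var=16                                   # changed after the export
latest=$(sh -c 'echo "${var-unset}"')   # child gets the value at launch time: 16

unset var                                # removes the variable and its export
gone=$(sh -c 'echo "${var-unset}"')

echo "$before $after $latest $gone"
```

The child started after var=16 sees 16, not 15, because the environment is assembled when the child is launched.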

The solution to this mystery is that subshells inherit everything from the parent shell, including all
shell variables, because they are simply created with fork or clone, so they start out with a copy of
the parent shell's memory. That's why this will work:
$ var=15
$ (echo $var)
15

But in the ./file case, the fork is followed by an exec (execve) system call, which discards all the
parent's shell variables; only the environment variables are still there. You can check this
with strace, using -f to follow the child, and you will find a call to execve:
[pid 26387] execve("./file", ["./file"], [/* 75 vars */]) = -1 ENOEXEC (Exec format error)

A subshell is not a completely new process, but a fork of the existing process.


(echo "$$ $(date)" >> $HOME/.debug.zshenv)
For that matter, the expression $(date) also entails creating a subshell.

If you call zsh explicitly e.g.


zsh -c 'echo "$$ $(date)" >> $HOME/.debug.zshenv'
then the shell forks, calls execve() and by that starts a completely new shell which does the
initialization again.

Since, according to zshell(1), $ZDOTDIR/.zshenv gets sourced whenever a new instance of zsh
starts, any command in $ZDOTDIR/.zshenv that results in the creation of "a completely
new [zsh] process" would result in an infinite regress. On the other hand, including either of the
following lines in a $ZDOTDIR/.zshenv file does not result in an infinite regress:

echo $(date; printenv; echo $$) > /dev/null #1


(date; printenv; echo $$) #2
The only way I found to induce an infinite regress by the mechanism described above was to
include a line like the following (see footnote 1) in the $ZDOTDIR/.zshenv file:
$SHELL -c 'date; printenv; echo $$' #3

My questions are:
1. What difference between the commands marked #1, #2 above and the one marked #3 accounts
for this difference in behavior?
2. If the shells that get created in #1 and #2 are called "subshells", what are those like the one
generated by #3 called?
3. Is it possible to rationalize (and maybe generalize) the empirical/anecdotal findings described
above in terms of the "theory" (for lack of a better word) of Unix processes?

The motivation for the last question is to be able to determine ahead of time (i.e. without resorting
to experimentation), what commands would lead to an infinite regress if they were included in
$ZDOTDIR/.zshenv?

1 The particular sequence of commands date; printenv; echo $$ that I used in the various examples
above is not too important. They happen to be commands whose output was potentially helpful
towards interpreting the results of my "experiments". (I did, however, want these sequences to
consist of more than one command, for the reason explained here.)

Note that echo $$ in a subshell explicitly prints the PID of the parent shell, per POSIX. It probably
isn't showing you what you wanted. Note also that zsh is well-known for aggressively optimising
subshell execution, so you probably need still more care to get the effect you want. strace -e
trace=process -f zsh -c ' ... ' is a good way to check your intuitions.
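
The behaviour of $$ can be sketched as follows. This example assumes bash, because BASHPID is a bash-specific variable and is not part of POSIX or zsh; in other shells the third value will simply be empty. The names pid_main, pid_dollar, and pid_real are illustrative.

```shell
# $$ keeps the PID of the main shell, even inside a subshell.
pid_main=$$
pid_dollar=$( echo "$$" )

# BASHPID (bash only) reports the process that is actually running.
pid_real=$( echo "${BASHPID-}" )

echo "main shell:          $pid_main"
echo "\$\$ in subshell:      $pid_dollar"
echo "BASHPID in subshell: $pid_real"
```

The first two values are identical, while the third differs, which is why echo $$ cannot be used to tell a subshell from its parent.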

If you focus on the word "starts" here you'll have a better time of things. The effect of fork() is to
create another process that begins from exactly where the current process already is. It's cloning an
existing process, with the only difference being the return value of fork. The documentation is
using "starts" to mean entering the program from the beginning.

Your example #3 runs $SHELL -c 'date; printenv; echo $$', starting an entirely new process from
the beginning. It will go through the ordinary startup behaviour. You can illustrate that by, for
example, swapping in another shell: run bash -c ' ... ' instead of zsh -c ' ... '. There's nothing special
about using $SHELL here.

Examples #1 and #2 run subshells. The shell forks itself and executes your commands inside that
child process, then carries on with its own execution when the child is done.
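
Because the subshell is a forked child, whatever it changes stays in the child. A minimal sketch of that, with the illustrative names count and start_dir:

```shell
count=1
start_dir=$PWD

# Both the assignment and the cd happen in the forked child only.
( count=2; cd /; echo "inside: count=$count dir=$PWD" )

# The parent carries on with its own, untouched state.
echo "after:  count=$count dir=$PWD"
```

After the parentheses close, count is still 1 and the working directory is unchanged.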

The answer to your question #1 is the above: example 3 runs an entirely new shell from the start,
while the other two run subshells. The startup behaviour includes loading .zshenv.

The reason they call this behaviour out specifically, which is probably what leads to your
confusion, is that this file (unlike some others) loads in both interactive and non-interactive shells.

To your question #2:


if the shells that get created in #1 and #2 are called "subshells", what are those like the one
generated by #3 called?
If you want a name you could call it a "child shell", but really it's nothing. It's no different than
any other process you start from the shell, be it the same shell, a different shell, or cat.
To your question #3:
Is it possible to rationalize (and maybe generalize) the empirical/anecdotal findings described
above in terms of the "theory" (for lack of a better word) of Unix processes?
fork makes a new process, with a new PID, that starts running in parallel from exactly where this
one left off. exec replaces the currently-executing code with a new program loaded from
somewhere, running from the beginning. When you spawn a new program, you first fork yourself and
then exec that program in the child. That is the fundamental theory of processes that applies
everywhere, inside and outside of shells.
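
The fork-then-exec sequence can be spelled out by hand in the shell. This is only a sketch: the parentheses provide the fork, and the exec builtin replaces the forked child with a new program (sh is assumed to be on the PATH).

```shell
# After exec, the child is a different program, so the second command
# inside the parentheses is never reached.
out=$( (exec sh -c 'echo "new program"'; echo "never printed") )

echo "$out"
echo "parent continues"
```

Only "new program" comes back from the child; the parent shell is unaffected and keeps running.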

Subshells are forks, and every non-builtin command you run leads to both a fork and an exec.
Note that $$ expands to the PID of the parent shell in any POSIX-compatible shell, so you may
not be getting the output you expect regardless. Note also that zsh aggressively optimises subshell
execution anyway, and commonly execs the last command, or doesn't spawn the subshell at all if
all the commands are safe without it.

One useful command for testing your intuitions is: strace -e trace=process -f $SHELL -c ' ... '
That will print to standard error all process-related events (and no others) for the command ... you
run in a new shell. You can see what does and does not run in a new process, and where execs
occur.

Another possibly-useful command is pstree -h, which will print out and highlight the tree of parent
processes of the current process. You can see how many layers deep you are in the output.

When the manual says the commands in .zshenv are "sourced", it means they are executed within
the shell running them. They do not cause a call to fork(), thus they do not spawn a subshell. Your
third example explicitly runs a subshell, invoking a call to fork(), and thus infinitely
recurses. That, I believe, should (at least partially) answer your first question.

There is nothing "created" in commands 1 and 2, so there's nothing to be called anything - those
commands are run within the context of the sourcing shell.

This is not really true (although in the concrete case, zsh may optimise it away). There is at least
notionally a forked subshell there in every case.

The generalization is the difference between "calling" a shell routine or program and "sourcing" a
shell routine or program - with the latter usually only being applicable to shell commands / scripts,
not external programs. "Sourcing" a shell script is usually done via . <scriptname> as opposed
to ./<scriptname> or /full/path/to/script - note the "dot-space" sequence at the start of the sourcing
directive. Sourcing can also be invoked using source <scriptname>, the source command being a
shell internal.
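
The difference between calling and sourcing can be sketched with a throwaway script. The file comes from mktemp, and the names marker, after_call, and after_source are made up for the illustration.

```shell
# A tiny script that does nothing but set a variable.
tmp=$(mktemp)
printf 'marker=set_by_script\n' > "$tmp"

unset marker
sh "$tmp"                        # called: runs in a separate process
after_call=${marker-unset}

. "$tmp"                         # sourced: runs inside the current shell
after_source=${marker-unset}

rm -f "$tmp"
echo "after call:   $after_call"
echo "after source: $after_source"
```

Only the sourced run leaves marker set in the current shell.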

fork, assuming all goes well, returns twice. One return is in the parent process (which has the
original process ID), and the other in the new child process (a different process ID but otherwise
sharing much in common with the parent process). At this point, the child could exec(3)
something, which would cause some "new" binary to be loaded into that process, though the child need
not do that, and could run other code already loaded via the parent process (zsh functions, for
example). Hence, a fork may or may not result in a "completely new" process, if "completely
new" is taken to mean something loaded via an exec(3) system call.

Guessing which commands cause infinite regress in advance is tricky; besides the
fork-calling-fork case (a.k.a. a "forkbomb"), another easy one is via a naive function wrapper around some
command:
function ssh() {
ssh -o UseRoaming=no "$@"
}
which instead probably should be written as

function ssh() {
=ssh -o UseRoaming=no "$@"
}
or command ssh ... to avoid infinite function calls of the ssh function calling the ssh function
calling the ... This in no way involves fork, as the function calls are internal to the ZSH process,
but will merrily happen off to infinity until some limit is bumped into by that single ZSH process.
strace, as always, is handy in revealing exactly what system calls are involved for any command
(in particular here fork and perhaps some exec call); shells may be debugged with -x or similar
that shows what the shell is doing internally (e.g. function calls). For more reading, Stevens in
"Advanced Programming in the Unix Environment" has a few chapters related to the creation and
handling of new processes.
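
The command form of the fix can be tried safely with a harmless stand-in. In this sketch echo replaces ssh, purely so that it runs anywhere; the "[wrapped]" prefix is invented for the example.

```shell
# A wrapper named like the command it wraps. Calling plain "echo" in the
# body would call the function again and recurse without end; "command"
# skips function lookup and breaks the cycle.
echo() {
    command echo "[wrapped]" "$@"
}

out=$(echo hello)
unset -f echo                    # remove the wrapper again
echo "$out"
```

The call goes through the wrapper exactly once and the output is "[wrapped] hello".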

The command, env, nohup, time, and xargs utilities have been specified to use exit code 127 if an
error occurs so that applications can distinguish "failure to find a utility" from "invoked utility
exited with an error indication". The value 127 was chosen because it is not commonly used for
other meanings; most utilities use small values for "normal error conditions" and the values above
128 can be confused with termination due to receipt of a signal. The value 126 was chosen in a
similar manner to indicate that the utility could be found, but not invoked. Some scripts produce
meaningful error messages differentiating the 126 and 127 cases. The distinction between exit
codes 126 and 127 is based on KornShell practice that uses 127 when all attempts to exec the
utility fail with [ENOENT], and uses 126 when any attempt to exec the utility fails for any other
reason.

The following exit values shall be returned:


126 The utility specified by command_name was found but could not be invoked.
127 An error occurred in the command utility or the utility specified by command_name could
not be found.
Otherwise, the exit status of command shall be that of the simple command specified by the
arguments to command.
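
Both exit values can be observed directly. This is a sketch under two assumptions: the path /nonexistent/no-such-utility does not exist on the system, and mktemp creates its file without execute permission (its usual behaviour).

```shell
# 127: the utility cannot be found.
status_missing=0
command /nonexistent/no-such-utility 2>/dev/null || status_missing=$?

# 126: the file exists but cannot be invoked (no execute permission).
tmp=$(mktemp)
status_noexec=0
command "$tmp" 2>/dev/null || status_noexec=$?
rm -f "$tmp"

echo "missing utility:     $status_missing"
echo "not executable file: $status_noexec"
```

A script can branch on these two values to print a more specific error message than a generic failure.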
