38

When a process is started from a shell, why does the shell fork itself before executing the process?

For example, when the user inputs grep blabla foo, why can't the shell just call exec() on grep without a child shell?

Also, when a shell forks itself within a GUI terminal emulator, does it start another terminal emulator? (such as pts/13 starting pts/14)

Braiam
bsky

4 Answers

40

When you call an exec-family function, it doesn't create a new process. Instead, exec replaces the current process's memory image (its code, data, and so on) with the program you want to run.

As an example: you want to run grep using exec

bash is a process (with its own memory and address space). Now, when you call exec(grep), exec replaces the current process's memory, address space, code, etc. with grep's data.

That means the bash process will no longer exist.

As a result, you can't get back to the terminal after the grep command completes.

That's why exec-family functions never return on success: any code after the exec call is unreachable.

Jonta
shantanu
  • Almost ok --- I substituted Terminal with bash. ;-) – Rmano Mar 02 '14 at 18:00
  • 2
    BTW, you can tell bash to execute grep without forking first, by using the command exec grep blabla foo. Of course, in this particular case, it won't be very useful (since your terminal window will just close as soon as the grep finishes), but it can be occasionally handy (e.g. if you're starting another shell, perhaps via ssh / sudo / screen, and don't intend to return to the original one, or if the shell process you're running this on is a sub-shell that's never meant to execute more than one command anyway). – Ilmari Karonen Mar 02 '14 at 19:50
  • 7
    Instruction Set has very specific meaning. And it's not the meaning you are using it in. – Andrew Savinykh Mar 03 '14 at 19:15
  • @IlmariKaronen It would be useful in wrapper scripts, where you want to prepare arguments and environment for a command. And in the case you mentioned, where bash is never meant to run more than one command, that's actually bash -c 'grep foo bar', and calling exec there is a form of optimization bash does for you automatically – Sergiy Kolodyazhnyy Sep 23 '18 at 21:23
3

TL;DR: Because this is the optimal method for creating new processes while keeping control in an interactive shell

fork() is necessary for processes and pipes

To answer the specific part of this question: if grep blabla foo were called via exec() directly in the parent, the parent would cease to exist, and its PID along with all its resources would be taken over by grep blabla foo.

However, let's talk about exec() and fork() in general. The key reason for this behavior is that fork()/exec() is the standard method of creating a new process on Unix/Linux; it isn't a bash-specific thing. This method has been in place since the beginning and was influenced by the same mechanism in operating systems that already existed at the time. To somewhat paraphrase goldilocks's answer on a related question, creating a new process via fork() is easier, since the kernel has less work to do as far as allocating resources goes, and a lot of properties (such as file descriptors, environment, etc.) can simply be inherited from the parent process (in this case, from bash).

Secondly, as far as interactive shells go, you can't run an external command without forking. To launch an executable that lives on disk (for example, /bin/df -h), you have to call one of the exec() family of functions, such as execve(), which replaces the calling process with the new one, taking over its PID, existing file descriptors, and so on. For an interactive shell, you want control to return to the user and the parent shell to carry on. Thus the best way is to create a subprocess via fork() and let that process be taken over via execve(). So an interactive shell with PID 1156 would spawn a child via fork() with PID 1157, then call execve("/bin/df",["df","-h"],&environment), which makes /bin/df -h run with PID 1157. Now the shell only has to wait for the process to exit, after which control returns to the user.

In the case where you have to create a pipe between two or more commands, say df | grep, you need a way to create two file descriptors (the read and write ends of the pipe, which come from the pipe() syscall) and then somehow let two new processes inherit them. That's done by forking a new process and then copying the write end of the pipe onto its stdout, aka fd 1, via the dup2() call (so if the write end is fd 4, we do dup2(4,1)). When the exec() that spawns df happens, the child process thinks nothing of its stdout and writes to it without being aware (unless it actively checks) that its output actually goes to a pipe. The same happens for grep, except that after fork() we take the read end of the pipe, say fd 3, and do dup2(3,0) before spawning grep via exec(). All this time the parent process is still there, waiting to regain control once the pipeline completes.

In the case of built-in commands, the shell generally doesn't fork(), with the exception of the source command. Subshells, however, do require fork().

In short, this is a necessary and useful mechanism.

Disadvantages of forking and optimizations

Now, this is different for non-interactive shells, such as bash -c '<simple command>'. Although fork()/exec() is the optimal method when you have many commands to process, it's a waste of resources when you have only a single command. To quote Stéphane Chazelas from this post:

Forking is expensive, in CPU time, memory, allocated file descriptors... Having a shell process lying about just waiting for another process before exiting is just a waste of resources. Also, it makes it difficult to correctly report the exit status of the separate process that would execute the command (for instance, when the process is killed).

Therefore, many shells (not just bash) use exec() to let that bash -c '' process be taken over by the single simple command. And for exactly the reasons stated above, minimizing pipelines in shell scripts is better. Beginners can often be seen doing something like this:

cat /etc/passwd | cut -d ':' -f 6 | grep '/home'

Of course, this will fork() three processes. It's a simple example, but consider a large file, in the range of gigabytes. It would be far more efficient with a single process:

awk -F':' '$6~"/home"{print $6}' /etc/passwd

A waste of resources can actually be a form of denial-of-service attack; in particular, fork bombs are created via shell functions that call themselves in a pipeline, forking multiple copies of themselves. Nowadays this is mitigated by limiting the maximum number of processes in cgroups on systemd, which Ubuntu also uses since version 15.04.

Of course, that doesn't mean forking is bad. It's still a useful mechanism, as discussed before, but where you can get away with fewer processes, and consequently fewer resources and better performance, you should avoid fork() if possible.

See also

Sergiy Kolodyazhnyy
3

As for the pts part of the question, check it yourself: in a shell, run

echo $$ 

to find out your process ID (PID). For example, I get

echo $$
29296

Then run for example sleep 60 and then, in another terminal

(0)samsung-romano:~% ps -edao pid,ppid,tty,command | grep 29296 | grep -v grep
29296  2343 pts/11   zsh
29499 29296 pts/11   sleep 60

So no: in general, the child has the same tty associated with it as the parent. (You can tell this is your sleep because its parent is your shell.)

Rmano
3

For each command (for example, grep) that you issue at the bash prompt, the intent is to start a new process and then return to the bash prompt after execution.

If the shell process (bash) called exec() to run grep, the shell process would be replaced with grep. grep would work fine, but after execution, control could not return to the shell, because the bash process has already been replaced.

For this reason, bash first calls fork(), which creates a child process without replacing the current one; the exec() then happens in that child.

FlowRaja