Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implication of suspending a process that is not a direct child of the job-controlling shell #299

Open
Tracked by #238
magicant opened this issue Sep 10, 2023 · 0 comments
Labels
bug Something isn't working

Comments

@magicant
Copy link
Owner

magicant commented Sep 10, 2023

The C version of yash generates a process tree like below when executing yes | less with job control enabled:

   PPID     PID    PGID     SID TTY        TPGID STAT   UID   TIME COMMAND
 515770  516152  516152  516152 pts/0     516152 Ss+   1000   0:00 /bin/yash
 516152  592476  592476  516152 pts/0     516152 T     1000   0:00  \_ yes
 516152  592477  592476  516152 pts/0     516152 T     1000   0:00  \_ less

This tree is quite simple.

On the other hand, the current yash-rs constructs a tree like this:

   PPID     PID    PGID     SID TTY        TPGID STAT   UID   TIME COMMAND
 516152  591807  591807  516152 pts/0     591807 S+    1000   0:00  \_ ./yash
 591807  592231  592231  516152 pts/0     591807 T     1000   0:00      \_ ./yash
 592231  592232  592231  516152 pts/0     591807 T     1000   0:00          \_ ./yash
 592232  592234  592231  516152 pts/0     591807 T     1000   0:00          |   \_ yes
 592231  592233  592231  516152 pts/0     591807 T     1000   0:00          \_ ./yash
 592233  592235  592231  516152 pts/0     591807 T     1000   0:00              \_ less

This is more complicated. The main job-controlling shell first spawns a child in a separate process group so that the child can be managed as a job. The child creates another two children to run the pipeline components. Each of the grandchild performs fork and exec to start the external utility.

In the both scenarios, the less utility turns on the alternate screen mode of the terminal so that it can go back to the original screen without losing the scroll-back history when the utility is finished or suspended. When less receives the SIGTSTP signal, it restores the terminal state, uninstalls the signal handler, and resends the signal to itself to actually suspend itself.

This scheme works fine for the simple process tree created by the C yash as the less utility is a direct child of the shell. A problem arises when yash-rs makes a more complicated tree where less is not a direct child of the shell. Remember that, when the user hits the susp key (typically Ctrl-Z), the SIGTSTP signal is sent to all processes in the foreground process group and that signal delivery is asynchronous among the processes. One process may receive the signal and react to it earlier than another. When the direct child of the topmost shell process receives the signal, it immediately suspends and the parent shell will proceed to the next command, but at this moment other processes in the process group may still be running, especially the less utility may still be interacting with the terminal in the alternate screen. This may cause unwanted side effects when the shell starts prompting for the next command line or executing the next job.

How can this issue be resolved?

  • Forked shell processes in the job process group block the SIGTSTP signal to prevent itself from being suspended by the signal. The shell processes monitor the state of child processes. When the last child suspends, the parent unblocks the signal to let itself suspend. This scheme makes sure the parent shell process suspends after the child does.
    • To make it possible to suspend a job-controlled subshell executing a built-in utility, the subshell should block SIGTSTP only while it is executing another child process.
  • Implementing the same optimization as the C yash would make the situation much less likely to happen, if not a perfect resolution.
  • Any other ideas?
@magicant magicant mentioned this issue Sep 10, 2023
25 tasks
@magicant magicant added the bug Something isn't working label Sep 10, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
Status: To do
Development

No branches or pull requests

1 participant