342

The Chrome browser was not responsive and I tried to kill it, but instead of disappearing, the process showed <defunct> to the right of its name and did not get killed.

What does <defunct> mean for a process, and why doesn't it get killed?

digitalextremist
  • 335
  • 5
  • 14
  • 5
    The accepted answer mentions that "kill -9 PID don't work". It's partially true: in reality, NO kill will work. Besides, -9 should be used as a last resort. 99% of the time a default kill of the parent process will kill it AND reap all the children. A "default kill" is a SIGTERM (-15). I encourage fans of the -9 (SIGKILL) to read http://stackoverflow.com/questions/690415/in-what-order-should-i-send-signals-to-gracefully-shutdown-processes/690631#690631 – Mike S Sep 01 '16 at 15:39
  • 1
    https://stackoverflow.com/questions/356722/killing-a-defunct-process-on-unix-system – Aᴍɪʀ Apr 07 '18 at 20:58
  • 1
    names matter a lot, presenting <zombie> instead of <defunct> would explain itself why kill is not an option. You cannot kill a zombie. – Sławomir Lenart Mar 01 '21 at 17:57

7 Answers

317

From your output we see "<defunct>", which means the process has finished (it completed its task, crashed, or was killed), but its parent process is still running and has not yet collected its exit status. kill -9 PID doesn't work on such a process: it is already dead, so however often you send the signal, the entry will keep showing up.

Find the parent of the defunct process and kill that instead. To find it, run:

$ ps -ef | grep defunct
    UID     PID   PPID  C  STIME  TTY  TIME      CMD
    1000    637  27872  0  Oct12  ?    00:00:04  [chrome] <defunct>
    1000   1808   1777  0  Oct04  ?    00:00:00  [zeitgeist-datah] <defunct>

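The third column (PPID) is the parent. Kill the parent, for example kill -9 27872 (sending a signal to the defunct PID 637 itself has no effect), then verify that the defunct process is gone with ps -ef | grep defunct.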
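The lookup above can also be done without grepping for the word "defunct". The following is only a sketch, assuming a procps-style ps that accepts -e and -o; the function name list_zombies is made up for this example. It selects processes whose state starts with Z and prints each one next to its parent's command line.

```shell
#!/bin/sh
# list_zombies (hypothetical helper): print "zombie_pid parent_pid" for every
# process whose state column starts with Z.
list_zombies() {
    ps -e -o pid=,ppid=,stat= | awk '$3 ~ /^Z/ { print $1, $2 }'
}

# For each zombie, show the command line of the parent that has not reaped it.
list_zombies | while read -r zpid ppid; do
    printf 'zombie %s <- parent %s (%s)\n' \
        "$zpid" "$ppid" "$(ps -o args= -p "$ppid")"
done
```

Filtering on the state column avoids matching the grep process itself or a command that merely has "defunct" in its name.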

dessert
  • 39,982
Paddington
  • 3,472
  • 32
    you can't kill "defunct" process. You only can speed up the deletion of its entry in a process table by killing its parent. – jfs Feb 27 '14 at 20:58
  • 100
    What if the ppid is 1 (init)? Suppose I'll just have to wait? – Luc May 06 '14 at 05:42
  • 12
    to automate the kill, you can do this, too (might need to change which bytes you're cutting from the output): ps -ef | grep defunct | grep -v grep | cut -b8-20 | xargs kill -9 – warren Jan 21 '15 at 19:34
  • 5
    @warren Thanks. You can also make that slightly shorter and (imo) simpler by not doing a second grep. Just change the first grep to grep [d]efunct or similar and it won't match itself. – Vala Jul 26 '16 at 11:35
  • 13
    @warren you can't kill a defunct process- even with a SIGKILL. Furthermore, you're using kill -9 pretty indiscriminately. See http://stackoverflow.com/questions/690415/in-what-order-should-i-send-signals-to-gracefully-shutdown-processes/690631#690631 . If you want to kill defunct children, you might try: parents_of_dead_kids=$(ps -ef | grep [d]efunct | awk '{print $3}' | sort | uniq | egrep -v '^1$'); echo "$parents_of_dead_kids" | xargs kill. Rerun the script after 30 seconds or so, with the kill -9 if you desire. (Note that I specifically disallow killing of Init) – Mike S Sep 01 '16 at 14:50
  • @MikeS I didn't say it was the best option. I said it would speed-up what JFSebastian said. As with any code on the intarwebs, I would expect you to try it before you just deploy it in cron, for example (and gave a disclaimer on which bytes are being cut) – warren Sep 01 '16 at 15:00
  • 1
    @Thor84no the | grep -v grep addition has become habit because it's easier to think about . There are ways of not needing to use it, however - and that is a good suggestion. – warren Sep 01 '16 at 15:01
  • what if this did not work for me? – xeruf May 23 '18 at 21:15
  • 1
    @Xerus Then restart the system. That'll kill all processes. – John Strood Jun 27 '18 at 06:11
  • What @Luc said. Exactly what Luc said. – aroth Aug 22 '18 at 05:35
  • 1
    Why -9 is necessary for parent? – anatoly techtonik Dec 25 '18 at 10:40
  • 1
    If ppid is 1 see https://unix.stackexchange.com/questions/5642/what-if-kill-9-does-not-work/5648#5648 – rogerdpack Apr 30 '19 at 15:26
  • And what if the parent process is uninterruptible? Is there really nothing to do other than restarting the computer? – Andyc May 28 '19 at 17:38
  • I have dozens of defunct processes I can't kill, the shutdown command won't work, and even through the GUI, the shutdown menu won't come up – JoeManiaci Dec 22 '20 at 22:53
  • this killed my computer – br4nnigan Oct 02 '22 at 17:56
  • if you prefer GUI's then open htop (apt install htop), type t to switch to tree mode, then / to search for your process in the tree, then select its parent process and press k to kill and (optionally) select signal 9 SIGKILL instead of the default SIGTERM – ccpizza Oct 17 '23 at 16:52
91

Manual page ps(1) says:

Processes marked <defunct> are dead processes (so-called "zombies") that remain because their parent has not destroyed them properly. These processes will be destroyed by init(8) if the parent process exits.

You can't kill it because it is already dead. The only thing left is an entry in the process table:

On Unix and Unix-like computer operating systems, a zombie process or defunct process is a process that has completed execution but still has an entry in the process table. This entry is still needed to allow the parent process to read its child's exit status.

There is no harm in leaving such processes alone unless there are many of them. A zombie is eventually reaped by its parent (by calling wait(2)). If the original parent exits without reaping it, the init process (PID 1) adopts it and reaps it some time later. A zombie process is just:

A process that has terminated and that is deleted when its exit status has been reported to another process which is waiting for that process to terminate.
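To watch this life cycle happen, here is a small sketch that creates a zombie on purpose. It assumes a procps-style ps; the sleep durations are arbitrary, and the trick of letting a shell exec into sleep is just one convenient way to get a parent that never calls wait(2).

```shell
#!/bin/sh
# The inner shell forks a short-lived child and then replaces itself with
# "sleep 4". sleep never calls wait(2), so once the child exits it stays
# behind as a zombie for as long as its parent is alive.
sh -c 'sleep 1 & exec sleep 4' &
parent=$!
sleep 2   # by now the child has exited, but the parent is still running

# Count children of $parent that are in state Z.
during=$(ps -e -o ppid=,stat= |
    awk -v p="$parent" '$1 == p && $2 ~ /^Z/ { n++ } END { print n + 0 }')

wait "$parent"   # the parent exits; the zombie is handed to init (or a subreaper)
sleep 1          # give the adoptive parent a moment to reap it

# Count any processes that still list $parent as their parent.
after=$(ps -e -o ppid=,stat= |
    awk -v p="$parent" '$1 == p { n++ } END { print n + 0 }')

echo "zombie children while parent alive: $during, after parent exit: $after"
```

No signal is sent to the zombie at any point. It goes away only because its parent exited and the process that adopted it collected the exit status.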

jfs
  • 4,008
  • 3
    "There is no harm in letting such processes be unless there are many of them". This isn't true. These defunct processes can still keep file handles open (lock files for instance), and ports open. Sometimes there is no saving these processes without a system reboot, as far as I can tell. – Scott Nov 01 '19 at 20:10
  • 2
    @Scott: why do you think a dead process keeps file handles open? Do you have a link to docs, a script that would demonstrate such behavior? – jfs Nov 05 '19 at 18:46
  • Unfortunately I don't have any evidence beyond the fact that I've seen it happen, and I don't know the next time one of the processes will get stuck again to repro. Most recently I verified using "lsof" on a file that it was being held open by the same pid as my defunct process. Previously I've (using netstat/lsof) seen that my defunct process still held the port it was listening on open. This has been enough of a problem that I've built defense in my init.d scripts to wait for the defunct process to clear when doing restarts so the new process can bind the port. Will screenshot if I repro – Scott Nov 06 '19 at 13:08
  • 2
    @Scott: It doesn't look like the issues you've mentioned are related to the process being a zombie as in I would be surprised that a dead process may keep file handles open. Let's avoid unsubstantiated misleading claims – jfs Nov 07 '19 at 10:28
  • If you don't accept that I've seen defunct processes hold ports and files open, that's fine, but my reasoning is neither unsubstantiated nor misleading. It's substantiated by empirical evidence, and the whole reason I said it in the first place was to ensure that readers of the answer do not get misled. Doing some more research, it seems that at least for ports, sockets can stay open even if the process is technically dead. So the end result for the user is the same: the port is held open. https://superuser.com/questions/1196736/how-can-a-zombie-process-hold-system-resources-like-a-tcp-port – Scott Nov 07 '19 at 18:41
  • 1
    @Scott: is there really a connection between a process being zombie (not just dead) and "port is held open"? (the fact that network resources may outlive a single process is not under question https://idea.popcount.org/2019-09-20-when-tcp-sockets-refuse-to-die/ ) – jfs Nov 10 '19 at 06:38
20

Expanding on Paddington's answer:

From your output we see <defunct>, which means this child process has finished: it completed its task, crashed, or was killed. Its parent process is still running and has not noticed its dead child.

kill -9 PID won't work (already dead).

To determine the parent of this child process, run this command:

ps -ef | grep defunct

 UID  PID  PPID   C STIME TTY TIME     CMD
 1000 637  27872  0 Oct12 ?   00:00:04 [chrome] <defunct>

See who the parent is: ps ax | grep 27872

If you want, you can kill the parent, and the defunct process will go away: kill -9 27872

See jfs's answer for a more technical explanation.
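If you would rather not reach for -9 straight away, one possible approach is to send the parent SIGTERM first and fall back to SIGKILL only if it is still running after a grace period. This is a sketch, not a tested tool: the helper names is_running and terminate_gracefully and the 10-second default are assumptions made for this example, and it assumes a ps that supports -o stat= and -p.

```shell
#!/bin/sh
# is_running PID (hypothetical helper): succeed only if PID exists and is not
# a zombie. kill -0 is not enough, because it also succeeds for zombies.
is_running() {
    state=$(ps -o stat= -p "$1" 2>/dev/null | tr -d ' ')
    case "$state" in
        ''|Z*) return 1 ;;
        *)     return 0 ;;
    esac
}

# terminate_gracefully PID [SECONDS] (hypothetical helper): send SIGTERM, wait
# up to SECONDS (default 10), then send SIGKILL if the process is still there.
terminate_gracefully() {
    pid=$1
    remaining=${2:-10}
    kill -TERM "$pid" 2>/dev/null || return 1   # no such process, or not permitted
    while [ "$remaining" -gt 0 ]; do
        is_running "$pid" || return 0
        sleep 1
        remaining=$((remaining - 1))
    done
    is_running "$pid" || return 0
    kill -KILL "$pid" 2>/dev/null
}
```

With the PIDs from the example output, terminate_gracefully 27872 would signal the parent of the defunct chrome process, not the zombie itself.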

Kevin
  • 992
  • 9
  • 15
  • 1
    How do I differentiate between a process that is completed & a process that is killed or corrupted?? – Ajay A Jun 14 '21 at 11:00
  • @AjayA why do you want to distinguish between them? A process that exited either success (exit 0) or fail (non zero exit code). in either case, the parent is too busy to pay attention to the child process that is not running. – Kevin Jun 15 '21 at 05:21
  • I have use case, where I want kill the child & parent process, if the child process is killed or corrupted & do not want do this in process complete case – Ajay A Jun 15 '21 at 09:24
  • 1
    @AjayA I would poke around in Proc, cat /proc/pid/status, or other entries in /proc/pid/xxx – Kevin Jun 16 '21 at 13:12
5

I accidentally create <defunct> processes by

  • starting them from the terminal,
  • then suspending them by accident (Ctrl+Z), and
  • somehow terminating the program.

The solution is to run the command fg in every open terminal window. The defunct processes then disappear.

Phaiax
  • 51
3

Adding to @Paddington's answer, I added this function to my bashrc for quick checking:

defunct(){
    # List zombie children; [d]efunct keeps grep from matching itself.
    echo "Children:"
    ps -ef | head -n1
    ps -ef | grep '[d]efunct'
    echo "------------------------------"
    echo "Parents:"
    # Column 3 of ps -ef is the PPID; sort -u drops duplicate parents.
    ppids="$(ps -ef | grep '[d]efunct' | awk '{ print $3 }' | sort -u)"
    echo "$ppids" | while read -r ppid; do
        # Match the PID column exactly, not any line containing the digits.
        [ -n "$ppid" ] && ps -A | awk -v pid="$ppid" '$1 == pid'
    done
}

It outputs something like:

Children:
UID        PID  PPID  C STIME TTY          TIME CMD
user     25707 25697  0 Feb26 pts/0    00:00:00 [sh] <defunct>
------------------------------
Parents:
25697 pts/0    00:00:00 npm
1

Thank you Mike S. We took your line and wrote a script that will kill defunct processes whose parent is in.telnetd. We didn't want it to kill any parent process, just telnetd that we know is causing a problem and we'll run it multiple times to kill multiple ones if needed.

# awk '{print $3}' = Print the parent PID (third column of ps -ef).
# grep -v '^1$'    = Make sure the parent is not the init process.

first_parent_of_first_dead_kid=$(ps -ef | grep '[d]efunct' | awk '{print $3}' | grep -v '^1$' | head -n1)
echo "$first_parent_of_first_dead_kid"

# If the first parent of the first dead kid is in.telnetd, then kill it.
# The -n test skips the check when no defunct process was found.
if [ -n "$first_parent_of_first_dead_kid" ] &&
   ps -o args= -p "$first_parent_of_first_dead_kid" | grep 'in\.telnetd'; then
        echo "We have a defunct process whose parent process is in.telnetd" | logger -t KILL-DEFUNCT-TELNET
        echo "killing $first_parent_of_first_dead_kid" | logger -t KILL-DEFUNCT-TELNET
        kill "$first_parent_of_first_dead_kid" 2>&1 | logger -t KILL-DEFUNCT-TELNET
fi
0

I had a LibreOffice application zombie hanging in state 'Z', and its PPID was 1 (because I had killed the calling application, oosplash, after LibreOffice stopped responding). No files were in use by the zombie, yet the office icons were still on my desktop/taskbar.

I managed to get the application (or process ID) cleaned up and was able to start LibreOffice again without a reboot:

First, I removed the .lock file in the ~/.config/libreoffice/ directory (I don't know whether this is relevant, but I mention it for completeness). Next, I started LibreOffice on another host (under the same login). It started normally on this new host, and at the same time all the 'hanging' LibreOffice icons were gone. Once 'Document recovery' was complete, I closed all open documents and exited LibreOffice on this new host. (Please note that the document files are on a file server that is accessible to both hosts.)

I restarted LibreOffice on the original host, and that went without problems and without a reboot.

Maybe the recipe doesn't work every time; maybe it was a coincidence, with my host cleaning up the orphan process at precisely the same time, but I was lucky to avoid the reboot. Maybe it helps others as well. Just give it a try.

Good luck!

Niels
  • 1