-------------------------------------------------------------------------------- Removing Zombies, Nico Schottelius 2005-06-15 (Last Modified: 2005-06-15) -------------------------------------------------------------------------------- First of all, the definition of a zombie: ''Defunct ("zombie") process, terminated but not reaped by its parent.'' [Excerpt from ps(1)] ''In the case of a terminated child, performing a wait allows the system to release the resources associated with the child; if a wait is not performed, then terminated the child remains in a "zombie" state (see NOTES below). [...] NOTES A child that terminates, but has not been waited for becomes a "zom- bie". The kernel maintains a minimal set of information about the zombie process (PID, termination status, resource usage information) in order to allow the parent to later perform a wait to obtain information about the child. As long as a zombie is not removed from the system via a wait, it will consume a slot in the kernel process table, and if this table fills, it will not be possible to create further processes. If a parent process terminates, then its "zombie" children (if any) are adopted by init(8), which automati- cally performs a wait to remove the zombies.'' [Excerpt from waitpid(2)] So you can see, that if some process misbehaves and forgets about its children, we, cinit, will adopt it. In the first versions of cinit (cinit-0.0.1 <-> cinit-0.0.7) we ignored SIGCHLD. This way the zombies stayed in the system. Since cinit-0.0.8 we have sig_chld(), which removes the zombies. So far so good. Now let's remove this feature. Yes, you heard right. If cinit does not catch the zombies they will stay in the system. Yes, that's ugly. And that's how it should be. You should see, which software is broken and should contact the authors to fix it. To restore the 'old' behaviour of cinit, change generic/set_signals.c (ignore SIGCHLD) Makefile: remove serv/sig_child.c from modules list cinit in general will keep the behaviour of reaping the vestiges, which broken software left behind (as it is cleaner for the system to deallocate unused ressources).