OOM killer not working?

Question

From what I understand, when the system is close to running out of free memory, the kernel should start killing processes to reclaim some. But on my system this does not happen at all.

Take a simple script that allocates much more memory than the system has available (an array with millions of strings, for example). If I run such a script as a normal user, it consumes all the memory until the system freezes completely (only SysRq REISUB works).

The weird part is that when the computer freezes, the hard drive LED turns on and stays on until the computer is rebooted, whether or not I have a swap partition mounted!

So my questions are:

Is this behavior normal? It's odd that an application run by a normal user can crash the system this way...
Is there any way to make Ubuntu automatically kill applications when they use too much (or most) of the memory?

Additional information

Ubuntu 12.04.3
Kernel 3.5.0-44

RAM: ~3.7GB of 4GB (shared with graphics card).

$ tail -n+1 /proc/sys/vm/overcommit_*
==> /proc/sys/vm/overcommit_memory <==
0

==> /proc/sys/vm/overcommit_ratio <==
50

$ cat /proc/swaps
Filename                Type        Size    Used    Priority
/dev/dm-1                               partition   4194300 344696  -1

I'm not sure why it's not working. Try tail -n+1 /proc/sys/vm/overcommit_* and add the output. See here also: How Do I configure oom-killer — kiri, Jan 02 '14 at 22:48
So what is happening with your swap space? Can you post some vmstat output like #vmstat 1 100 or something like that? and also show us cat /etc/fstab
What should happen is at a certain amount of memory usage, you should start writing to swap. Killing processes shouldn't happen until memory and swap space are "full". — j0h, Jan 04 '14 at 21:44
@j0h With swap it seems to work well (after some time the process crashed with something like Allocation failed). But without swap it just freezes the computer. Is it supposed to work this way (only kill when using swap)? — Salem, Jan 04 '14 at 22:24
PS: I'm trying to go without swap space because my disk is very very slow, so when something starts swapping my PC gets frozen — Salem, Jan 04 '14 at 22:24
Have a look here; it is about configuring the OOM killer: http://lwn.net/Articles/317814/ — j0h, Jan 05 '14 at 01:00
Alternatively, perhaps you can put a "kill the process if memory is low" option into the script that tanks your memory. As an example: http://pastebin.ubuntu.com/6694682/ — j0h, Jan 05 '14 at 02:46
@Salem please try echo 1 | sudo tee /proc/sys/vm/oom_kill_allocating_task without a mounted swap. It will be undone after a reboot. — admirabilis, Jan 05 '14 at 15:32
@TeresaeJunior Can you write that in an answer (and also how to make it permanent)? It does not completely fix it, but I guess it's the best I can get with this hardware... — Salem, Jan 09 '14 at 09:27

score 50 · Accepted Answer · answered Jan 09 '14 at 15:42

From the official /proc/sys/vm/* documentation:

oom_kill_allocating_task

This enables or disables killing the OOM-triggering task in out-of-memory situations.

If this is set to zero, the OOM killer will scan through the entire tasklist and select a task based on heuristics to kill. This normally selects a rogue memory-hogging task that frees up a large amount of memory when killed.

If this is set to non-zero, the OOM killer simply kills the task that triggered the out-of-memory condition. This avoids the expensive tasklist scan.

If panic_on_oom is selected, it takes precedence over whatever value is used in oom_kill_allocating_task.

The default value is 0.

To summarize: with oom_kill_allocating_task set to 1, the kernel does not scan the system for a process to kill, which is expensive and slow; it simply kills the process that triggered the out-of-memory condition.
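
To get a feel for which task the default heuristic (value 0) would lean towards, the kernel exposes each process's current badness score in /proc/<pid>/oom_score. Below is a minimal sketch, assuming a POSIX shell. The function name oom_top and its optional directory argument are made up for this illustration and are not part of any tool.

```shell
#!/bin/sh
# Print "score pid name" for the five processes with the highest OOM score.
# The optional argument (hypothetical, for experimenting) replaces /proc.
oom_top() {
    root="${1:-/proc}"
    for d in "$root"/[0-9]*; do
        # Skip entries that vanished or cannot be read.
        [ -r "$d/oom_score" ] || continue
        score=$(cat "$d/oom_score" 2>/dev/null) || continue
        name=$(cat "$d/comm" 2>/dev/null) || name="?"
        printf '%s %s %s\n' "$score" "${d##*/}" "$name"
    done | sort -rn | head -n 5
}

oom_top
```

Higher scores are more likely to be selected, and the ranking can change from one moment to the next as memory usage shifts.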

In my experience, when an OOM condition is triggered, the kernel no longer has enough "strength" left to perform such a scan, which leaves the system totally unusable.

Also, it seems more natural to simply kill the task that caused the problem, so I fail to understand why it is set to 0 by default.

For testing, you can just write to the proper pseudo-file in /proc/sys/vm/, which will be undone on the next reboot:

echo 1 | sudo tee /proc/sys/vm/oom_kill_allocating_task

For a permanent fix, write the following to /etc/sysctl.conf or to a new file under /etc/sysctl.d/, with a .conf extension (/etc/sysctl.d/local.conf for example):

vm.oom_kill_allocating_task = 1
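
After editing the file, it can help to confirm that the configured value and the running value agree. The following is only a sketch: the helper name sysctl_conf_value is invented here, and it simply reports the last assignment found in the files it is given, which does not reproduce the exact order in which the system loads them.

```shell
#!/bin/sh
# Print the last value assigned to a sysctl key in the given config files.
sysctl_conf_value() {
    key="$1"; shift
    for f in "$@"; do
        # Ignore missing files and unmatched glob patterns.
        if [ -f "$f" ] && [ -r "$f" ]; then cat "$f"; fi
    done | awk -F= -v k="$key" '
        { name = $1; gsub(/[ \t]/, "", name) }
        name == k { val = $2; gsub(/^[ \t]+|[ \t]+$/, "", val); last = val }
        END { if (last != "") print last }'
}

key=vm.oom_kill_allocating_task
echo "configured: $(sysctl_conf_value "$key" /etc/sysctl.conf /etc/sysctl.d/*.conf)"
# The running value lives in /proc. It keeps the old value until the file is
# loaded (for example: sudo sysctl -p /etc/sysctl.d/local.conf) or the
# machine is rebooted.
if [ -r /proc/sys/vm/oom_kill_allocating_task ]; then
    echo "running:    $(cat /proc/sys/vm/oom_kill_allocating_task)"
fi
```

If the two lines disagree, the file has most likely not been loaded yet.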

Was it always set to 0 in Ubuntu? Because I remember it used to kill automatically, but since a few versions it stopped doing so. — Jelle De Loecker, Jun 12 '14 at 20:18
@skerit This I don't really know, but it was set to 0 in the kernels I used back in 2010 (Debian, Liquorix and GRML). — admirabilis, Jun 13 '14 at 03:37
"Also, it would be more obvious just killing the task that caused the problem, so I fail to understand why it is set to 0 by default." - because the process that requested memory isn't necessarily the one that "caused the problem". If process A hogs 99% of the system's memory, but process B, which is using 0.9%, happens to be the one that triggers the OOM killer by bad luck, B didn't "cause the problem" and it makes no sense to kill B. Having that as the policy risks totally unproblematic low-memory processes being killed by chance because of a different process's runaway memory usage. — Mark Amery, Sep 11 '18 at 09:53
@MarkAmery The real problem is that Linux, instead of just killing the needed process, starts thrashing badly, even if vm.admin_reserve_kbytes is increased to, say, 128 MB. Setting vm.oom_kill_allocating_task = 1 seems to alleviate the problem but doesn't really solve it (and Ubuntu already deals with fork bombs by default). — admirabilis, Sep 11 '18 at 19:39
Maybe more elegant: sudo sysctl -w vm.oom_kill_allocating_task=1 — Pablo Bianchi, Feb 27 '19 at 16:03

score 12 · Answer 2 · edited Feb 27 '19 at 16:01

Update: The bug is fixed.

Teresa's answer is good and is enough to work around the problem.

Additionally, I've filed a bug report, because that is definitely broken behavior.

answered Aug 21 '14 at 14:08 by int_ua · edited Feb 27 '19 at 16:01 by Pablo Bianchi

I don't know why you got downvoted, but that also sounds like a kernel bug to me. I've crashed a big university server today with it and killed some processes that were running for weeks... Thanks for filing that bug report though! – shapecatcher Dec 16 '14 at 14:26
Might have been fixed in 2014, in 2018 (and 18.04) the OOM killer is yet again doing nothing. – Jelle De Loecker May 22 '18 at 16:00
yeah also 21.04 broken. I hate ubuntu now – france1 Aug 23 '21 at 18:38
@france1 "This doesn't hang indefenitely though, sometimes the system is able to recover after a couple of minutes." This is what I observed on Ubuntu 20.04.4 LTS desktop, kernel version 5.15.0-46-generic. But on a VPS (Linux VM-32-17-ubuntu 5.4.0-121-generic, Ubuntu 20.04 LTS), the same test did not recover. – Rick Sep 10 '22 at 04:34

score 6 · Answer 3 · answered Jun 19 '19 at 23:30

You can try earlyoom, an OOM killer that operates in user space and tries to kill the largest process in an OOM situation.
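
For anyone wondering what "operates in user space" amounts to: the general idea is to poll memory figures such as those in /proc/meminfo and act before the kernel itself stalls. The following is only a rough sketch of such a check, not earlyoom's actual code. The function name mem_available_pct and the 10 percent threshold are assumptions chosen for illustration, and the script only reports; it kills nothing.

```shell
#!/bin/sh
# Percentage of RAM the kernel estimates is available, read from a
# meminfo-style file (MemAvailable needs a reasonably recent kernel).
mem_available_pct() {
    file="${1:-/proc/meminfo}"
    [ -r "$file" ] || return 0
    awk '/^MemTotal:/     { total = $2 }
         /^MemAvailable:/ { avail = $2; seen = 1 }
         END { if (seen && total > 0) printf "%d\n", avail * 100 / total }' "$file"
}

threshold=10   # assumed value, for illustration only
pct=$(mem_available_pct)
if [ -z "$pct" ]; then
    echo "MemAvailable is not reported on this system"
elif [ "$pct" -lt "$threshold" ]; then
    echo "low memory: ${pct}% available"
else
    echo "memory ok: ${pct}% available"
fi
```

A real daemon would run a check like this in a loop and pick a process to terminate once the figure drops below its threshold.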

answered Jun 19 '19 at 23:30 by qwr

earlyoom works like a charm – Rick Sep 11 '22 at 05:01

score -2 · Answer 4 · answered Jan 09 '14 at 14:02

First of all, I recommend updating to 13.10 (clean install; save your data first).

If you don't want to update, change vm.swappiness to 10, and if you still have problems with your RAM, install zRAM.

answered Jan 09 '14 at 14:02 by Brask

I wasn't the one who downvoted you, but generally, lowering vm.swappiness does more harm than good, even more on systems suffering from low memory issues. – admirabilis Jan 09 '14 at 15:46
Not when you compress the ram first and you then avoid disk use that is much slower and can be making your computer freeze. – Brask Jan 09 '14 at 16:02
In theory, zRAM is a nice thing, but it is CPU hungry, and generally not worth the cost. Memory is generally way cheaper than electricity. And, on a laptop, where upgrading the RAM is more expensive, CPU usage is mostly undesirable. – admirabilis Jan 09 '14 at 16:18
What he is asking for is a more stable system. Yes, zRAM and changing swappiness will make his system use more CPU, but what he is limited by at the moment, and getting errors from, is memory. He wants to fix the problem, not get a theory lesson on what happens when you install zRAM. – Brask Jan 09 '14 at 16:22
It's clear from his question that he may write an improper script that eats more than it should (and I have already done this myself). In a situation like this, you can watch the script grabbing gigabytes of RAM in a few seconds, and zRAM won't come to the rescue, since the script will never be satisfied enough. – admirabilis Jan 09 '14 at 16:28
But reducing the disk use and compressing the ram as much as possible will help out greatly :) – Brask Jan 09 '14 at 16:30
Reducing disk use and compressing ram will only prolong the problem. You can still have an OOM situation and the kernel should handle that gracefully. It doesn't and it turns out this was caused by a kernel bug (for which setting vm.oom_kill_allocating_task = 1 is a workaround) – shapecatcher Dec 16 '14 at 14:30
