How to deal with out of memory issues on multi user systems?

lumbric asked:

When a process tries to allocate more memory than is available, it is usually very hard to handle the situation well. In some cases programs might be able to free non-essential memory (e.g. caches), but in most cases it will lead to fatal exceptions (see also [1]), to swapping and a massive system slowdown, or even to the OOM killer killing (pretty arbitrary) processes.

On a single user system all options are bad but at least you can only hurt yourself (after all you are responsible for all code you run on your machine).

On a multi user system however, I’d expect that it might be possible to make the OOM killer kill certain processes possibly owned by other users. There are certain restrictions, because the OOM killer tries to find a “process that is using a large amount of memory but is not that long lived” (see here).

Nevertheless, imagine a situation where a process owned by user X requires x MB of memory, has not lived for very long (low CPU time), and x is significantly more than the free memory on the system. User Y might then be able to allocate more memory than is available and cause the system to kill X’s process (because it is using more memory).

This situation sounds way more scary than similar implications on a single user machine. One could set memory limits per user or even use containers to separate processes even more. But as far as I understand, only setting the limit to total_memory / number_of_users would solve the problem. And when setting such a limit one loses all the benefits of a multi user system. The machine is then effectively multiple single user machines in one box. One typically wants to allow one user to use more memory at peak time, because most of the time users will need less memory than average.

I’m mostly interested in solving this issue in situations with large calculations on huge amounts of data. I guess in the case of web servers one might be able to estimate better how much memory is needed, as there are many small operations and not a few large ones. But even in this case, I’ve heard that in normal situations only 50% of your memory should be filled to avoid out of memory issues during peaks. Isn’t this a waste of 50% of your memory?

In particular, I am thinking of hosting a Jupyter hub or something similar. But I don’t want the users to kill each other’s processes.

Is there any other solution to mitigate this problem? How do huge cloud providers like Travis CI deal with this kind of problem?

My answer:

You can set the sysctl vm.oom_kill_allocating_task. When this is set to a non-zero value and a process’s memory request would lead to running out of memory, Linux will kill that process instead of picking a victim among all tasks.

From the documentation:

This enables or disables killing the OOM-triggering task in
out-of-memory situations.

If this is set to zero, the OOM killer will scan through the entire
tasklist and select a task based on heuristics to kill. This normally
selects a rogue memory-hogging task that frees up a large amount of
memory when killed.

If this is set to non-zero, the OOM killer simply kills the task that
triggered the out-of-memory condition. This avoids the expensive
tasklist scan.

If panic_on_oom is selected, it takes precedence over whatever value
is used in oom_kill_allocating_task.

The default value is 0.
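
As a rough illustration of how one might inspect or change this setting from a script, here is a minimal Python sketch. It assumes a Linux system where the sysctl is exposed at /proc/sys/vm/oom_kill_allocating_task; the helper names (read_flag, write_flag, describe) are made up for this example, and writing to the real file requires root.

```python
from pathlib import Path

# Standard procfs location of the sysctl on Linux (assumed to exist).
SYSCTL_PATH = Path("/proc/sys/vm/oom_kill_allocating_task")


def read_flag(path: Path = SYSCTL_PATH) -> int:
    """Return the current value of the sysctl as an integer."""
    return int(path.read_text().strip())


def write_flag(value: int, path: Path = SYSCTL_PATH) -> None:
    """Write a new value; writing to the real procfs file requires root."""
    path.write_text(f"{value}\n")


def describe(value: int) -> str:
    """Summarise the behaviour described in the kernel documentation."""
    if value == 0:
        return "OOM killer scans the task list and picks a victim by heuristics"
    return "OOM killer kills the task that triggered the out-of-memory condition"


if __name__ == "__main__":
    current = read_flag()
    print(f"vm.oom_kill_allocating_task = {current}: {describe(current)}")
    # To enable the behaviour (as root), one could call: write_flag(1)
```

The same change can be made from a shell with `sysctl -w vm.oom_kill_allocating_task=1`. Either way the value is lost on reboot unless it is also added to /etc/sysctl.conf or a file under /etc/sysctl.d/.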

View the full question and any other answers on Server Fault.

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.