====== What are OOM and the oom-killer: out of memory, killer, oom-killer ======
{{tag>oom out of memory killer oom-killer}}
===== Theory =====
* https://lwn.net/Articles/666548/
* [[http://catap.ru/blog/2009/05/05/about-memory-overcommit-memory/|About memory: overcommit memory]]
* [[http://avz.org.ua/wp/2011/04/24/overcommit-memory/|A tale of memory overcommit]]
* http://unix.stackexchange.com/questions/87732/linux-reboot-out-of-memory
===== Practice =====
* https://www.debuntu.org/how-to-reboot-on-oom/
* https://kura.io/2011/10/25/rebooting-on-oom/
* [[http://www.oracle.com/technetwork/articles/servers-storage-dev/oom-killer-1911807.html|How to Configure the Linux Out-of-Memory Killer]]
* [[http://www.hskupin.info/2010/06/17/how-to-fix-the-oom-killer-crashe-under-linux/|How to fix the OOM killer crashes under Linux]]
* https://www.kernel.org/doc/Documentation/sysctl/kernel.txt (search for "panic")
* [[http://unix.stackexchange.com/questions/128642/debug-out-of-memory-with-var-log-messages|Debug out-of-memory with /var/log/messages]]
* [[http://askubuntu.com/questions/566745/allocate-swap-after-ubuntu-14-04-lts-installation|Allocate swap after Ubuntu 14.04 LTS installation]]
* [[http://serverfault.com/questions/449296/why-is-linux-reporting-free-memory-strangely|Why is Linux reporting "free" memory strangely?]]
* [[http://unixforum.org/index.php?showtopic=82536|Memory allocation by the Linux kernel]]
* [[http://markelov.blogspot.ru/2009/01/linux-procmeminfo.html|Once more about memory in Linux: /proc/meminfo]]
* http://unix.stackexchange.com/questions/14102/real-memory-usage
* [[http://www.linuxhowtos.org/System/Linux%20Memory%20Management.htm|Overview of memory management]]
* [[http://unix.stackexchange.com/questions/97261/how-much-ram-does-the-kernel-use|How much RAM does the kernel use?]]
==== Rebooting on OOM ====
The system can be configured so that an OOM condition triggers a kernel panic and the machine then reboots automatically.
Here is the relevant excerpt from the documentation on [[https://www.kernel.org/doc/Documentation/sysctl/vm.txt|kernel.org]]:
  panic_on_oom

  This enables or disables panic on out-of-memory feature.

  If this is set to 0, the kernel will kill some rogue process,
  called oom_killer. Usually, oom_killer can kill rogue processes and
  system will survive.

  If this is set to 1, the kernel panics when out-of-memory happens.
  However, if a process limits using nodes by mempolicy/cpusets,
  and those nodes become memory exhaustion status, one process
  may be killed by oom-killer. No panic occurs in this case.
  Because other nodes' memory may be free. This means system total status
  may be not fatal yet.

  If this is set to 2, the kernel panics compulsorily even on the
  above-mentioned. Even oom happens under memory cgroup, the whole
  system panics.

  The default value is 0.
  1 and 2 are for failover of clustering. Please select either
  according to your policy of failover.
  panic_on_oom=2+kdump gives you very strong tool to investigate
  why oom happens. You can get snapshot.
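To check which of the three modes from the excerpt is currently active, the value can be read from ''/proc/sys/vm/panic_on_oom'', the procfs view of this sysctl. Below is a minimal sketch; the function name ''describe_panic_on_oom'' is made up for this page and the descriptions are my own paraphrase of the excerpt, not kernel output.

```shell
#!/bin/sh
# Hypothetical helper: map a vm.panic_on_oom value to a short description.
# The wording paraphrases the kernel documentation excerpt; it is not
# produced by the kernel itself.
describe_panic_on_oom() {
    case "$1" in
        0) echo "0: oom-killer kills a process, no panic" ;;
        1) echo "1: panic on OOM, except mempolicy/cpuset-limited cases" ;;
        2) echo "2: always panic on OOM, even inside a memory cgroup" ;;
        *) echo "unknown value: $1" ;;
    esac
}

# Report the live setting when the procfs file is readable (Linux only).
if [ -r /proc/sys/vm/panic_on_oom ]; then
    describe_panic_on_oom "$(cat /proc/sys/vm/panic_on_oom)"
fi
```

On a machine with default settings this should report mode 0.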
Set the kernel parameters with sysctl:
  # sysctl -w vm.panic_on_oom=1
  # sysctl -w kernel.panic=30    # seconds to wait after a panic before rebooting
Or, to make the settings persist across reboots, add them to the sysctl configuration file:
  # echo "vm.panic_on_oom = 1" >> /etc/sysctl.conf
  # echo "kernel.panic = 30" >> /etc/sysctl.conf
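Appending with ''>>'' adds a duplicate line every time the command is repeated. As a possible alternative, here is a small sketch that replaces an existing assignment instead of appending a second one. The function ''set_sysctl_conf'' is hypothetical (it is not part of procps or any other package), and it assumes simple ''key = value'' lines.

```shell
#!/bin/sh
# Hypothetical helper: set "key = value" in a sysctl-style config file,
# replacing an existing assignment of the key instead of duplicating it.
# Arguments: key, value, path to the config file.
set_sysctl_conf() {
    key=$1
    value=$2
    file=$3
    tmp=$(mktemp) || return 1
    # Copy every line except an uncommented assignment of this key.
    if [ -f "$file" ]; then
        awk -v k="$key" '{
            line = $0
            sub(/^[ \t]+/, "", line)        # ignore leading whitespace
            split(line, parts, /[ \t]*=/)   # text before "=" is the key
            if (parts[1] != k) print $0
        }' "$file" > "$tmp"
    fi
    echo "$key = $value" >> "$tmp"
    # Write back through cat so the file keeps its owner and permissions.
    cat "$tmp" > "$file" && rm -f "$tmp"
}

# Usage (as root), with the file used in the text:
#   set_sysctl_conf vm.panic_on_oom 1 /etc/sysctl.conf
#   set_sysctl_conf kernel.panic 30 /etc/sysctl.conf
#   sysctl -p    # load /etc/sysctl.conf without rebooting
```

Commented-out lines and other keys are left untouched; the new assignment always ends up on the last line of the file.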
==== Testing panic_on_oom ====
All that remains is to summon the OOM, like calling up the Queen of Spades.
For this, [[https://kura.github.io/2011/10/25/rebooting-on-oom/|this blog post]] came to the rescue.
We need to build a simple C program:
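  #include <stdio.h>
  #include <stdlib.h>
  #include <string.h>
  /* Size of one allocation: 10 MB. */
  #define CHUNK_SIZE (10 * 1024 * 1024)
  int main(void) {
      int c = 0;
      /* Keep allocating without freeing until malloc fails or the process is killed. */
      while (1) {
          void *b = malloc(CHUNK_SIZE);
          if (!b) {
              break;
          }
          /* Write to the block so the pages are really backed by memory. */
          memset(b, 1, CHUNK_SIZE);
          printf("Allocating %d MB\n", (++c * 10));
      }
      return 0;
  }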
Install the compiler, build, and run:
  # yum install gcc
  # gcc -O2 oom.c -o oom
  # ./oom
  Allocating 10 MB
  Allocating 20 MB
  ...
  Allocating 7500 MB
  Allocating 7510 MB
  Killed
The system went into a reboot.
The entries in syslog:
  Mar 12 16:51:25 shisp1 kernel: in:imjournal invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0
  Mar 12 16:51:25 shisp1 kernel: [] oom_kill_process+0x254/0x3d0
  Mar 12 16:51:25 shisp1 kernel: [] ? oom_unkillable_task+0xcd/0x120
  Mar 12 16:51:25 shisp1 kernel: Out of memory: Kill process 7112 (oom) score 904 or sacrifice child
  Mar 12 16:51:25 shisp1 kernel: Killed process 7112 (oom) total-vm:7707656kB, anon-rss:7694760kB, file-rss:20kB, shmem-rss:0kB
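To see at a glance which processes the oom-killer has taken out, the log can be filtered for the "Killed process" lines shown above. This is only a sketch: the function name ''oom_victims'' is made up, and it assumes the message format from this particular log, which may differ between kernel versions.

```shell
#!/bin/sh
# Hypothetical helper: read kernel log lines on stdin and print
# "PID NAME" for every line of the form "Killed process PID (NAME) ...".
# Assumes the message format shown in the syslog excerpt on this page.
oom_victims() {
    sed -n 's/.*Killed process \([0-9][0-9]*\) (\([^)]*\)).*/\1 \2/p'
}

# Usage, assuming the kernel log goes to /var/log/messages
# (the path differs between distributions):
#   oom_victims < /var/log/messages
#   dmesg | oom_victims
```

For the log above this prints the PID and name of the test program; the "Out of memory: Kill process" line is deliberately not matched, so each victim is listed once.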