====== What OOM and the oom-killer are: Out of memory, killer, oom-killer ======

{{tag>oom out of memory killer oom-killer}}

===== Theory =====

  * https://lwn.net/Articles/666548/
  * [[http://catap.ru/blog/2009/05/05/about-memory-overcommit-memory/|On memory: overcommit memory (in Russian)]]
  * [[http://avz.org.ua/wp/2011/04/24/overcommit-memory/|A tale of memory overcommit (in Russian)]]
  * http://unix.stackexchange.com/questions/87732/linux-reboot-out-of-memory

===== Practice =====

  * https://www.debuntu.org/how-to-reboot-on-oom/
  * https://kura.io/2011/10/25/rebooting-on-oom/
  * [[http://www.oracle.com/technetwork/articles/servers-storage-dev/oom-killer-1911807.html|How to Configure the Linux Out-of-Memory Killer]]
  * [[http://www.hskupin.info/2010/06/17/how-to-fix-the-oom-killer-crashe-under-linux/|How to fix the OOM killer crashes under Linux]]
  * https://www.kernel.org/doc/Documentation/sysctl/kernel.txt (search for "panic")

OOM, Out of memory, killer, oom-killer

  * [[http://unix.stackexchange.com/questions/128642/debug-out-of-memory-with-var-log-messages|Debug out-of-memory with /var/log/messages]]
  * [[http://askubuntu.com/questions/566745/allocate-swap-after-ubuntu-14-04-lts-installation|Allocate swap after Ubuntu 14.04 LTS installation]]
  * [[http://serverfault.com/questions/449296/why-is-linux-reporting-free-memory-strangely|Why is Linux reporting "free" memory strangely?]]
  * [[http://unixforum.org/index.php?showtopic=82536|Memory allocation by the Linux kernel (in Russian)]]
  * [[http://markelov.blogspot.ru/2009/01/linux-procmeminfo.html|Once again on memory in Linux: /proc/meminfo (in Russian)]]
  * http://unix.stackexchange.com/questions/14102/real-memory-usage
  * [[http://www.linuxhowtos.org/System/Linux%20Memory%20Management.htm|Overview of memory management]]
  * [[http://unix.stackexchange.com/questions/97261/how-much-ram-does-the-kernel-use|How much RAM does the kernel use?]]

==== Rebooting on OOM ====

The system can be configured so that an OOM condition triggers a kernel panic and the machine then reboots automatically.
Here is the relevant excerpt from the documentation on [[https://www.kernel.org/doc/Documentation/sysctl/vm.txt|kernel.org]]:

<code>
panic_on_oom

This enables or disables panic on out-of-memory feature.

If this is set to 0, the kernel will kill some rogue process,
called oom_killer.  Usually, oom_killer can kill rogue processes and
system will survive.

If this is set to 1, the kernel panics when out-of-memory happens.
However, if a process limits using nodes by mempolicy/cpusets,
and those nodes become memory exhaustion status, one process
may be killed by oom-killer. No panic occurs in this case.
Because other nodes' memory may be free. This means system total status
may be not fatal yet.

If this is set to 2, the kernel panics compulsorily even on the
above-mentioned. Even oom happens under memory cgroup, the whole
system panics.

The default value is 0.
1 and 2 are for failover of clustering. Please select either
according to your policy of failover.
panic_on_oom=2+kdump gives you very strong tool to investigate
why oom happens. You can get snapshot.
</code>

Set the kernel parameters with sysctl (the leading ''#'' is the root prompt):

<code>
# sysctl -w vm.panic_on_oom=1
# sysctl -w kernel.panic=30   # seconds to wait after a panic before rebooting
</code>

Or, to make the settings persist across reboots, append them to the sysctl configuration file:

<code>
echo "vm.panic_on_oom = 1" >> /etc/sysctl.conf
echo "kernel.panic = 30" >> /etc/sysctl.conf
</code>

==== Testing panic_on_oom ====

Now all that remains is to summon the OOM. [[https://kura.github.io/2011/10/25/rebooting-on-oom/|This blog post]] helped me out here. We need to build a small C program:

<code c>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Size of one allocation step: 10 MB (10 * 1024 * 1024 bytes). */
#define MB 10485760

int main(int argc, char *argv[]) {
    void *b = NULL;
    int c = 0;

    while (1) {
        b = malloc(MB);
        if (!b) {
            break;
        }
        /* Write to every byte so the pages are actually committed. */
        memset(b, 1, MB);
        printf("Allocating %d MB\n", (++c * 10));
    }
    exit(0);
}
</code>

Install the compiler, build, and run:

<code>
# yum install gcc
# gcc -O2 oom.c -o oom
# ./oom
Allocating 10 MB
Allocating 20 MB
...
Allocating 7500 MB
Allocating 7510 MB
Killed
</code>

The system went into a reboot.
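Once the machine is back up, it may be worth confirming that the settings are actually in effect. The following is only a minimal sketch, assuming ''/proc/sys'' is mounted in the usual place. The helper name ''describe_panic_on_oom'' is hypothetical, and its descriptions are a condensed paraphrase of the kernel documentation quoted above.

```shell
#!/bin/sh
# Hypothetical helper: translate a vm.panic_on_oom value into the behaviour
# described in the kernel documentation quoted above.
describe_panic_on_oom() {
    case "$1" in
        0) echo "0: oom-killer kills a process, no panic" ;;
        1) echo "1: panic on system-wide OOM (not for mempolicy/cpuset-limited OOM)" ;;
        2) echo "2: always panic, even for mempolicy/cpuset/memory cgroup OOM" ;;
        *) echo "$1: unknown value" ;;
    esac
}

# Print the live values, if the files are readable on this system.
for f in /proc/sys/vm/panic_on_oom /proc/sys/kernel/panic; do
    if [ -r "$f" ]; then
        echo "$f = $(cat "$f")"
    fi
done

# Explain the current panic_on_oom setting in words.
if [ -r /proc/sys/vm/panic_on_oom ]; then
    describe_panic_on_oom "$(cat /proc/sys/vm/panic_on_oom)"
fi
```

Reading the files under ''/proc/sys'' directly avoids depending on the ''sysctl'' binary and does not require root.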
The corresponding syslog entries:

<code>
Mar 12 16:51:25 shisp1 kernel: in:imjournal invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0
Mar 12 16:51:25 shisp1 kernel: [] oom_kill_process+0x254/0x3d0
Mar 12 16:51:25 shisp1 kernel: [] ? oom_unkillable_task+0xcd/0x120
Mar 12 16:51:25 shisp1 kernel: Out of memory: Kill process 7112 (oom) score 904 or sacrifice child
Mar 12 16:51:25 shisp1 kernel: Killed process 7112 (oom) total-vm:7707656kB, anon-rss:7694760kB, file-rss:20kB, shmem-rss:0kB
</code>
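To pick the victims out of a longer log, the "Killed process" records can be filtered with ''sed''. This is a rough sketch that assumes the message format matches the records above, which may differ between kernel versions. The function name ''oom_victims'' is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read kernel log lines on stdin and print
# "<pid> <name>" for every "Killed process" record.
oom_victims() {
    sed -n 's/.*Killed process \([0-9][0-9]*\) (\([^)]*\)).*/\1 \2/p'
}

# Example using the record shown above; on a real system the input could be
# /var/log/messages or the output of `journalctl -k`.
echo 'Mar 12 16:51:25 shisp1 kernel: Killed process 7112 (oom) total-vm:7707656kB, anon-rss:7694760kB, file-rss:20kB, shmem-rss:0kB' | oom_victims
```

For the sample record this prints ''7112 oom''. The "Out of memory: Kill process ..." line is deliberately not matched, so each kill is reported once.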