When Linux Runs Out of Memory
Subject:   kernel memory
Date:   2007-07-13 04:52:51
From:   mulyadi_santosa
Response to: kernel memory

OK, to answer your first question: "the kernel trusts itself" means the kernel doesn't do any complicated checks when it asks for memory. For example, say you ask for a 256 MB block using kmalloc(), the kernel-space version of malloc(). (In practice kmalloc() caps a single request well below that size, but the principle is the same.) The allocator hands the block over right away if enough free pages are available; the allocation is not deferred. Another example: you can allocate a big chunk and forget to free it later. There is no garbage collector in kernel land, so this chunk stays marked as used for the rest of the kernel's life.

Now the second question: could the kernel over-allocate? In practice, no. What you see as overcommit exists only in user space. Recall that for a user process the actual page allocation happens only at page-fault time (whether the fault is soft or hard; "hard" means the data must be read from backing storage). In kernel space, when you ask for RAM pages, you either get them all at once or get nothing (when free pages are low or memory is heavily fragmented).
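
If you want to watch the user-space side of this yourself, here is a minimal sketch. It assumes Linux with glibc, and the helper names (reserve_pages, touch_pages, resident_pages) are mine, purely for illustration. It maps anonymous memory, then uses mincore() to count how many of those pages actually sit in RAM.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* exposes mincore() and MAP_ANONYMOUS on glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map npages of private anonymous memory. This only hands out address
 * space; no physical page is attached yet. Returns NULL on failure. */
void *reserve_pages(size_t npages)
{
    size_t len = npages * (size_t)sysconf(_SC_PAGESIZE);
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Write one byte into every page, which forces a page fault per page. */
void touch_pages(void *base, size_t npages)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    volatile unsigned char *p = base;

    assert(base != NULL);
    for (size_t i = 0; i < npages; i++)
        p[i * page] = 1;
}

/* Count how many of the npages starting at base are resident in RAM.
 * Returns -1 if the lookup fails. */
long resident_pages(void *base, size_t npages)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    unsigned char *vec = malloc(npages);
    long count = 0;

    if (vec == NULL)
        return -1;
    if (mincore(base, npages * page, vec) != 0) {
        free(vec);
        return -1;
    }
    for (size_t i = 0; i < npages; i++)
        count += vec[i] & 1;   /* bit 0 set means the page is resident */
    free(vec);
    return count;
}
```

Call reserve_pages(64), then resident_pages() before and after touch_pages(). On an ordinary Linux box I would expect the count to go up only after the pages are written, though a sandboxed environment may report residency differently.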

About the policy, I can't recall anything specific here. I just remember that in each zone (DMA, normal and highmem) some percentage of the free pages is reserved. No user-mode allocation is allowed to drain these reserved pages unless the process's effective user ID is root. Another policy I can recall is the way the allocator prioritizes the zones. IIRC, it first tries to grab pages from the highmem zone, then from normal. As a last resort, it tries the DMA zone.
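
To make the two rules concrete, here is a toy user-space model of what I described. It is not the real allocator (which works with per-zone watermarks and is a lot more involved); the struct, the enum and toy_alloc() are all made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Zones ordered from the most precious (DMA) to the least (highmem). */
enum toy_zone_id { TOY_DMA, TOY_NORMAL, TOY_HIGHMEM, TOY_NR_ZONES };

struct toy_zone {
    unsigned long free_pages;
    unsigned long reserved;   /* pages held back from ordinary requests */
};

/* Try to take npages, starting at the highest zone the caller may use
 * and falling back toward DMA. An unprivileged request must leave at
 * least `reserved` pages behind in the zone; a privileged one may dip
 * into the reserve. Returns the zone that satisfied the request, or -1
 * if none could. The request is all or nothing. */
int toy_alloc(struct toy_zone zones[], int highest,
              unsigned long npages, bool privileged)
{
    assert(highest >= TOY_DMA && highest < TOY_NR_ZONES);

    for (int z = highest; z >= TOY_DMA; z--) {
        unsigned long floor = privileged ? 0 : zones[z].reserved;

        if (zones[z].free_pages >= npages &&
            zones[z].free_pages - npages >= floor) {
            zones[z].free_pages -= npages;
            return z;
        }
    }
    return -1;
}
```

With highmem at 30 free pages and 10 reserved, an unprivileged request for 20 pages still fits in highmem. A following request for 5 pages falls back to normal, because highmem would drop below its reserve.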

About the importance of swap, this is a somewhat subjective answer. Theoretically, you don't need swap if you have a lot of RAM, let's say 64 GB (a 32-bit machine can address that much in PAE mode). But that's rare. Nowadays most PCs have 256 MB to 2 GB of RAM. Sure, that's big, but applications keep growing too and consume more RAM, so 2 GB is likely to be eaten fast under certain workloads. If you have no swap, once that 2 GB is used you're out of luck: no more allocation is possible. Swap acts as a lifesaver here, letting you allocate a bit more without being rejected. It also lets the kernel swap out inactive pages, so RAM pages are freed up for more important jobs.
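
If you want to check how much RAM and swap your own box has, /proc/meminfo reports both. Below is a small sketch of a reader for it. The function names are mine, and it assumes the usual "Key:   value kB" layout of that file.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Find a "Key:   12345 kB" line in /proc/meminfo-style text and return
 * the number (in kB), or -1 if the key is missing or malformed. */
long meminfo_kb(const char *text, const char *key)
{
    size_t klen;
    const char *line = text;

    assert(text != NULL && key != NULL);
    klen = strlen(key);

    while (line != NULL && *line != '\0') {
        /* The key must match the whole field name, up to the colon. */
        if (strncmp(line, key, klen) == 0 && line[klen] == ':') {
            long val;
            if (sscanf(line + klen + 1, "%ld", &val) == 1)
                return val;
            return -1;
        }
        line = strchr(line, '\n');   /* move on to the next line */
        if (line != NULL)
            line++;
    }
    return -1;
}

/* Read /proc/meminfo into buf as a NUL-terminated string.
 * Returns 0 on success, -1 if the file cannot be opened. */
int read_meminfo(char *buf, size_t size)
{
    FILE *f;
    size_t n;

    assert(buf != NULL && size > 0);
    f = fopen("/proc/meminfo", "r");
    if (f == NULL)
        return -1;
    n = fread(buf, 1, size - 1, f);
    buf[n] = '\0';
    fclose(f);
    return 0;
}
```

Fill a buffer of a few kilobytes with read_meminfo(), then ask meminfo_kb() for "MemTotal", "SwapTotal" and "SwapFree". If SwapTotal comes back as 0, the box has no swap and is in the "out of luck" situation once RAM is used up.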

Does this clear up your doubts?