The Linux memory management subsystem is responsible, as the name implies, for managing the memory in the system. This includes the implementation of virtual memory and demand paging, memory allocation both for kernel internal structures and user space programs, mapping of files into processes' address spaces, and many other cool things.[1]
The basic unit of memory for the Linux kernel. Useful info:[4]
If data is written, it first lands in the Page Cache and is managed as one of its _dirty pages_. _Dirty_ means the data is stored in the Page Cache but still needs to be written back to the underlying storage device. Writeback throttling is the act of delaying writes to the storage device so as not to overload it; it is enabled by default.[5]
A page fault is an exception raised by the memory management unit when a process accesses data within its address space that is not currently loaded into physical memory.
Minor page fault:
1. The requested page is already in physical memory (e.g. shared memory that another process has already loaded); the kernel only has to map it into the process's page table
Major page fault:
1. The requested page must first be fetched from the backing store (e.g. disk), which is slow
Invalid page fault:
1. Attempt to read/write an address that you don't have the permission to read/write, or a HW error that prevents reading (like a bad HDD sector)[6]
A system that allocates huge pages automatically; the only other way to get huge pages is to ask the kernel for them explicitly.
Standard page size: 4K
Possible sizes for Huge Pages: 2M, 4M, 1G (depends on the CPU)
Made to optimize the use of the TLB.
Added in Linux 2.6.38 [7]
mTHP (multi-size THP) gives the ability to declare huge pages that are:
1. Bigger than a standard page (4K)
2. Smaller than a standard huge page
3. A power of 2
With this in place every memory access requires a lookup (performed directly in the _Memory Management Unit_ (MMU) in the CPU) from virtual to physical address. The page table can become large and is usually a multi-layered data structure. Performing a lookup in this large table for every single memory access would be prohibitively slow. Therefore, the MMU keeps a _Translation Lookaside Buffer_ (TLB), which is essentially a cache of recently used entries from the page tables. Modern CPUs usually have a multi-level TLB (similar to data caches), so one can't simply state a size of "the TLB". As an example: the top-level data TLB in a Skylake CPU has 64 entries. Thus, memory from the 64 last-accessed pages is readily available; all other memory accesses will either fall back to a lower-level TLB cache or, in the worst case, have the MMU traverse the large page table structure.[10] [11]
On an address-space switch, as occurs when #Context switch between processes (but not between threads), some TLB entries can become invalid, since the virtual-to-physical mapping is different. The simplest strategy to deal with this is to completely flush the TLB. This means that after a switch, the TLB is empty, and any memory reference will be a miss, so it will be some time before things are running back at full speed.
The process of storing the state of a process or thread so that it can be restored and resume execution at a later point, and then restoring a different, previously saved state.[21]
The page table is how the kernel knows where in memory (and in which memory) a virtual address lives. Here is how it works:
Let's say we request a page that should already contain data:
1. Our CPU will first check the #Translation Lookaside Buffer (TLB)
2. If that lookup misses, it will walk the page table, where:
The memory may be in RAM, on disk, or even somewhere else; the job of a given entry is to point to where that data is and "translate" the virtual address into a physical address.
An entry inside the #Page Table.
Memory associated with program data and not backed/mapped by a file on disk.[12]
Folios are a newer type in memory management, used _only_ for anonymous pages and FS (file-backed) memory. They are:
1. Bigger than pages
2. A power of 2
3. A construct that guarantees you are never working with a tail page (a folio always refers to the head page of a compound page)[13][14]
Added in Linux 5.16 [15]
Most benchmarks of folios put the performance benefit in the 0~10% region.[16]
SLAB allocators are a type of memory allocator, used for objects smaller than a page, that aim to:
- reduce fragmentation
- optimize through caching of common objects
- align objects with CPU cache lines[2]
Once upon a time there were three SLAB allocators in Linux:
- SLOB, dropped in 6.4
- SLAB, dropped in 6.8
- SLUB, the last one remaining[3]
The system in charge of the slab allocator. [17]
The kernel uses virtual memory areas to keep track of the process's memory mappings; for example, a process has one VMA for its code, one VMA for each type of data, one VMA for each distinct memory mapping (if any), and so on. VMAs are processor-independent structures, with permissions and access control flags. Each VMA has a start address and a length, and its size is always a multiple of the page size (PAGE_SIZE). A VMA consists of a number of pages, each of which has an entry in the page table.[19]
Like all other architectures, x86_64 has a kernel stack for every active thread. These thread stacks are THREAD_SIZE (4*PAGE_SIZE)=16KB big. These stacks contain useful data as long as a thread is alive or a zombie. While the thread is in user space the kernel stack is empty except for the thread_info structure at the bottom.[18]
a file descriptor (FD, less frequently fildes) is a process-unique identifier (handle) for a file or other input/output resource, such as a pipe or network socket.[20]