In the (very) old days, computers ran using only physical memory. By that I mean that all processes on a computer shared the same address space, and that address space mapped directly onto very real memory modules.
This meant that processes needed to be mindful of the other processes running on the system, or they could write over each other's memory.
This also meant that there were no permissions on memory; nothing was marked as read-only or non-executable.
Memory also couldn't be swapped...
With these limitations, big brains set out to fix this issue. The solution? Virtual memory.
Virtual memory is a construct created at runtime on top of physical memory. With virtual memory, each process gets its own personal virtual address space.
This is done by splitting physical memory into (in the case of most x86_64 systems) 4KB blocks of memory called pages.
Each 4KB chunk of memory has its own page (struct page) and can then be referenced from the virtual address space of a process.
In that system, a page points to its page frame.
To make that translation from virtual to physical, the CPU has an MMU (Memory Management Unit) that takes in virtual addresses and spits out the physical addresses to get the underlying data.
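To make that split concrete, here is a tiny userspace sketch (a toy model, not kernel code) of what the MMU conceptually does with a 4KB page size: chop the virtual address into a page number and an offset, look the page number up in a (toy) page table, and glue the resulting frame back together with the offset.

```c
#include <stdint.h>
#include <stdio.h>

#define PAGE_SHIFT 12                    /* 4KB pages: 2^12 = 4096 */
#define PAGE_SIZE  (1UL << PAGE_SHIFT)
#define PAGE_MASK  (PAGE_SIZE - 1)

/* Toy "page table": virtual page number -> physical frame number. */
static uint64_t toy_page_table[16] = { [3] = 42 }; /* vpn 3 maps to pfn 42 */

static uint64_t translate(uint64_t vaddr)
{
    uint64_t vpn    = vaddr >> PAGE_SHIFT;   /* which page */
    uint64_t offset = vaddr & PAGE_MASK;     /* where inside the page */
    uint64_t pfn    = toy_page_table[vpn];   /* the MMU walk, vastly simplified */

    return (pfn << PAGE_SHIFT) | offset;     /* physical address */
}

int main(void)
{
    uint64_t vaddr = (3UL << PAGE_SHIFT) + 0x123; /* offset 0x123 inside page 3 */
    printf("virt 0x%llx -> phys 0x%llx\n",
           (unsigned long long)vaddr, (unsigned long long)translate(vaddr));
    return 0;
}
```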
Now let's take a look at how this works from the CPU's point of view!
Let's take a process running on a CPU that requests a specific memory address.
The first thing the CPU will do is ask the Translation Lookaside Buffer (TLB) if it already has a translation for the page backing that address.
If it does, it can get the underlying data.
If it doesn't, it will need to ask the kernel to bring the mapping in through a page fault handler.
Keep in mind, this happens often: a modern CPU's TLB only has about 64 entries per core.
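Here is that flow as a compilable toy sketch; every helper in it is made up purely for illustration (real x86 hardware actually walks the page tables itself on a TLB miss, and the kernel's fault handler only steps in when the page isn't present):

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define TLB_ENTRIES 64                      /* roughly one core's worth */

struct tlb_entry { uint64_t vpn, pfn; bool valid; };
static struct tlb_entry tlb[TLB_ENTRIES];

static bool tlb_lookup(uint64_t vpn, uint64_t *pfn)
{
    struct tlb_entry *e = &tlb[vpn % TLB_ENTRIES];
    if (e->valid && e->vpn == vpn) { *pfn = e->pfn; return true; }
    return false;
}

static void tlb_insert(uint64_t vpn, uint64_t pfn)
{
    tlb[vpn % TLB_ENTRIES] = (struct tlb_entry){ vpn, pfn, true };
}

/* Stand-in for "the kernel resolves the fault": here we just pretend every
 * virtual page N lives in physical frame N + 100. */
static uint64_t page_fault_handler(uint64_t vpn) { return vpn + 100; }

static uint64_t translate(uint64_t vpn)
{
    uint64_t pfn;
    if (tlb_lookup(vpn, &pfn))              /* fast path: TLB hit */
        return pfn;
    pfn = page_fault_handler(vpn);          /* miss: get the mapping resolved */
    tlb_insert(vpn, pfn);                   /* cache it for next time */
    return pfn;
}

int main(void)
{
    printf("first access:  page 7 -> frame %llu (miss)\n", (unsigned long long)translate(7));
    printf("second access: page 7 -> frame %llu (hit)\n",  (unsigned long long)translate(7));
    return 0;
}
```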
So there are 2 ways the kernel will be able to add an entry to the TLB
A PTE can be swapped out by setting its present bit to 0 while keeping the other bits around.[2]
The address contained within the PTE is replaced by an index into kernel constructs (swap_info and swap_map), which then point to the location of the swapped page's data.[1]
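As a rough illustration, decoding a PTE could look like the sketch below. The present bit really is bit 0 on x86 (see [10]), but the swap type/offset encoding used here is a made-up toy layout, not the actual x86_64 one:

```c
#include <stdint.h>
#include <stdio.h>

#define PTE_PRESENT (1ULL << 0)   /* bit 0 on x86: page is resident in RAM */

/* Toy swap-entry layout for a non-present PTE; the real x86_64 bit layout
 * is more involved (see pgtable_types.h [10]). */
#define SWAP_TYPE_SHIFT   1
#define SWAP_TYPE_BITS    5
#define SWAP_OFFSET_SHIFT (SWAP_TYPE_SHIFT + SWAP_TYPE_BITS)

static void decode_pte(uint64_t pte)
{
    if (pte & PTE_PRESENT) {
        /* Present: the PTE holds a physical frame number (plus flag bits). */
        printf("resident, pfn = %llu\n", (unsigned long long)(pte >> 12));
    } else if (pte) {
        /* Not present but not empty: interpret it as a swap entry, i.e. an
         * index into the kernel's swap_info/swap_map bookkeeping. */
        uint64_t type   = (pte >> SWAP_TYPE_SHIFT) & ((1ULL << SWAP_TYPE_BITS) - 1);
        uint64_t offset =  pte >> SWAP_OFFSET_SHIFT;
        printf("swapped out, swap device %llu, slot %llu\n",
               (unsigned long long)type, (unsigned long long)offset);
    } else {
        printf("not mapped at all\n");
    }
}

int main(void)
{
    decode_pte((42ULL << 12) | PTE_PRESENT);                            /* resident, frame 42 */
    decode_pte((7ULL << SWAP_OFFSET_SHIFT) | (1ULL << SWAP_TYPE_SHIFT)); /* swapped out */
    return 0;
}
```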
VMAs allow the kernel to link chunks of contiguous virtual memory to a file, or to nothing at all (anonymous memory).
When requesting memory, the kernel will create a VMA instead of creating all the individual pages.
A VMA is 3 things:
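A heavily trimmed sketch of struct vm_area_struct (see mm_types.h [9]) gives an idea of the pieces involved; the real struct has many more fields:

```c
struct file;                      /* opaque here; defined elsewhere in the kernel */

/* Heavily trimmed sketch of struct vm_area_struct (see mm_types.h [9]). */
struct vm_area_struct {
    unsigned long vm_start;       /* first virtual address covered by this VMA */
    unsigned long vm_end;         /* first virtual address *after* the VMA */
    unsigned long vm_flags;       /* permissions: read/write/execute, shared, ... */
    struct file  *vm_file;        /* backing file, or NULL for anonymous memory */
};
```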
The struct page is 3 things:
Note that the address of the physical chunk of memory is not stored inside the struct page, but it can be calculated from the address/index of the page itself.[3] [4]
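Same exercise for struct page: a heavily trimmed sketch based on mm_types.h [9] (the real definition is a dense pile of unions), and indeed there is no physical address field in it:

```c
struct address_space;                  /* opaque: the file/mapping this page caches */

/* Heavily trimmed sketch of struct page (see mm_types.h [9] and [11]). */
struct page {
    unsigned long         flags;       /* PG_head, PG_dirty, PG_locked, ... */
    struct address_space *mapping;     /* what this page belongs to, if anything */
    unsigned long         index;       /* offset of the page within that mapping */
    int                   _refcount;   /* reference count (atomic_t in the real struct) */
};
```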
At boot time, there are 2 possible behaviors (physical memory models) depending on the architecture:
PS: on a 4KB-page system you can find the physical memory location by taking your index in mem_map and shifting it left by 12 bits (PAGE_SHIFT); see the sketch after this list.
or (for NUMA-compatible systems, including x86_64)
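For the flat model, the calculation from the PS above is basically pointer arithmetic on the mem_map array. Here is a toy userspace version (the real kernel helpers are page_to_pfn()/page_to_phys()):

```c
#include <stdint.h>
#include <stdio.h>

#define PAGE_SHIFT 12   /* 4KB pages */

/* Toy stand-ins for the kernel's flat page array; in a flat-memory kernel,
 * mem_map[] holds one struct page per physical page frame, in order. */
struct page { unsigned long flags; };
static struct page mem_map[1024];

/* Roughly what page_to_phys() boils down to in the flat model:
 * the index of the page in mem_map *is* the page frame number. */
static uint64_t toy_page_to_phys(struct page *page)
{
    uint64_t pfn = page - mem_map;          /* index in the array */
    return pfn << PAGE_SHIFT;               /* frame number * 4096 */
}

int main(void)
{
    printf("page #5 describes physical address 0x%llx\n",
           (unsigned long long)toy_page_to_phys(&mem_map[5]));  /* 0x5000 */
    return 0;
}
```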
The struct page is also a fairly large struct, around 64 bytes. This means that at any time, about 1.5% of your memory is used just for struct page entries (64 bytes of metadata for every 4096-byte page).
Sometimes you don't want to individually manage every page requested, especially when you will be using a big amount of them and treating them as one single contiguous chunk of memory.
Here is how it works:
When using compound pages, the first page will have the page flag PG_head (the head) and the others will be tails. All the tails point to the head page, meaning that the attributes of the first page apply to the whole group of compound pages.
This is used for huge pages, as they are one contiguous unified chunk of memory.[5]
It is also used for:
Folios are the solution for managing compound pages. If you were to edit a tail page without checking whether it is a tail, you could end up corrupting some data. To make the use of compound pages safe, folios make sure that you don't read/write a tail page by mistake. TLDR: folios are your gateway to using compound pages in a safe way.[6]
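Here is the head/tail idea as a userspace toy model (the kernel's real interfaces are PageHead()/PageTail()/compound_head() and the struct folio API; this only shows the concept):

```c
#include <stdbool.h>
#include <stdio.h>

/* Toy model of compound pages: tail pages carry a pointer back to the head,
 * so every question about the group is answered by the head page. */
struct toy_page {
    bool             head;       /* PG_head in the kernel */
    struct toy_page *compound;   /* tails point at their head */
    unsigned long    flags;      /* only meaningful on the head */
};

/* A "folio" in this model is just a pointer that is guaranteed to be a head
 * page, so code holding a folio can never scribble on a tail by mistake. */
struct toy_folio { struct toy_page *page; };

static struct toy_folio page_to_folio(struct toy_page *p)
{
    struct toy_folio f = { p->head ? p : p->compound };
    return f;
}

int main(void)
{
    struct toy_page group[4];
    group[0] = (struct toy_page){ .head = true, .compound = &group[0], .flags = 0x1 };
    for (int i = 1; i < 4; i++)
        group[i] = (struct toy_page){ .head = false, .compound = &group[0] };

    /* Even if we start from a tail page, the folio resolves to the head. */
    struct toy_folio f = page_to_folio(&group[2]);
    printf("folio flags = 0x%lx (read from the head page)\n", f.page->flags);
    return 0;
}
```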
Virtual memory also allows for one last useful trick. Since the address space is so big (we could technically address up to 16EB with 64-bit addressing; normal x86_64 is limited to 48-bit addressing, so we get 256TB), we can put some extra features above the process' address space. In fact, the process' address space is _only_ 128TB. The remaining address space is for the kernel to use, and part of it holds up to 64TB of direct mapping of physical memory.[13]
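If you want to double-check those numbers, the arithmetic is just powers of two (the exact start/end addresses of each region live in the mm.txt document [13]):

```c
#include <stdio.h>

int main(void)
{
    unsigned long long tb = 1ULL << 40;

    /* 48 implemented address bits -> 2^48 bytes of virtual address space. */
    printf("2^48 bytes = %llu TB\n", (1ULL << 48) / tb);   /* 256 TB total */
    /* The lower (user) half of that space is 2^47 bytes. */
    printf("2^47 bytes = %llu TB\n", (1ULL << 47) / tb);   /* 128 TB for the process */
    /* The direct map mentioned above covers 2^46 bytes. */
    printf("2^46 bytes = %llu TB\n", (1ULL << 46) / tb);   /* 64 TB direct mapping */
    return 0;
}
```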
This is based on [12].
Sources for:
all the structs [9]
the bits on a PTE [10]
more detail on struct page [11]
For the truly insane who like things to be complicated, look around page 113 of [14]
1: https://www.kernel.org/doc/gorman/html/understand/understand014.html
2: https://wxdublin.gitbooks.io/deep-into-linux-and-beyond/content/vm_subsystem.html
3: https://blogs.oracle.com/linux/post/struct-page-the-linux-physical-page-frame-data-structure
4: https://github.com/lorenzo-stoakes/linux-vm-notes/blob/master/sections/page-tables.md#translating-between-page-table-entries-and-physical-page-descriptors
5: https://lwn.net/Articles/619514/
6: https://kernelnewbies.org/MatthewWilcox/Folios
9: https://elixir.bootlin.com/linux/v6.15/source/include/linux/mm_types.h
10: https://elixir.bootlin.com/linux/v6.15/source/arch/x86/include/asm/pgtable_types.h
11: https://litux.nl/mirror/kerneldevelopment/0672327201/ch11lev1sec1.html
12: https://grimoire.carcano.ch/blog/memory-management-the-buddy-allocator/
13: https://www.kernel.org/doc/Documentation/x86/x86_64/mm.txt
14: https://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-vol-3a-part-1-manual.pdf