
Memory vm_area_struct usage

ccotter edited this page Sep 16, 2012 · 4 revisions

Copying via COW

Copying memory via the copy-on-write technique is simple - just copy the page table entries and mark the pages read-only. The page fault handler then copies a page the first time a user process tries to write to it.
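The effect of COW can be observed from user space. The following is a minimal sketch, assuming a POSIX system; the function name is hypothetical, and it only shows the semantics a process sees (the child's write does not reach the parent), not the kernel mechanism itself.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Returns the value the parent reads after the child has overwritten
 * its own copy of the same variable, or -1 on error.
 */
int parent_value_after_child_write(void)
{
    int *value = malloc(sizeof *value);   /* anonymous, private memory */
    if (value == NULL)
        return -1;
    *value = 1;

    pid_t pid = fork();                   /* pages are now shared read-only */
    if (pid < 0) {
        free(value);
        return -1;
    }
    if (pid == 0) {
        *value = 2;                       /* write fault: child gets a private copy */
        _exit(*value == 2 ? 0 : 1);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        free(value);
        return -1;
    }
    int seen = *value;                    /* parent's page was never modified */
    free(value);
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;
    return seen;
}
```

The parent still reads 1 after the child has stored 2, because the child's store triggered a private copy of the page.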

However, Linux's reverse map (rmap) code restricts how anonymous memory pages can be mapped. An anonymous page's struct page.index field is set to the page's index offset within the enclosing vm_area_struct. When fork() copies pages via COW, this is fine since the child process's vm_area_structs are identical to its parent's.
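To make the constraint concrete, here is a simplified user-space model of the index arithmetic, loosely based on how the kernel converts between an address in a VMA and a page index. The struct and function names are hypothetical stand-ins (not the kernel's definitions), and 4 KiB pages are assumed.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12   /* assumption: 4 KiB pages */

/* Hypothetical, stripped-down stand-in for vm_area_struct. */
struct index_vma {
    uintptr_t vm_start;   /* first byte of the region, page aligned */
    uintptr_t vm_end;     /* one past the last byte, page aligned */
    uintptr_t vm_pgoff;   /* page index assigned to vm_start */
};

/* Index recorded in a page that is mapped at addr inside vma. */
uintptr_t sketch_page_index(const struct index_vma *vma, uintptr_t addr)
{
    assert(addr >= vma->vm_start && addr < vma->vm_end);
    return vma->vm_pgoff + ((addr - vma->vm_start) >> SKETCH_PAGE_SHIFT);
}

/* Inverse lookup: where a page with this index is expected inside vma. */
uintptr_t sketch_page_address(const struct index_vma *vma, uintptr_t index)
{
    return vma->vm_start + ((index - vma->vm_pgoff) << SKETCH_PAGE_SHIFT);
}
```

With a VMA of [0x10000, 0x14000) and vm_pgoff 0x10, the page at 0x12345 gets index 0x12, and looking that index up in an identical VMA yields 0x12000. Looking it up in a VMA with the same addresses but vm_pgoff 0 yields 0x22000, which is outside the region. This is the kind of mismatch that arises when the source and destination VMAs do not line up.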

The dput() copy operation must therefore be careful to COW only entire vm_area_struct regions. The start and end addresses must coincide with vm_area_struct boundaries - otherwise the rmap code will no longer work.

Implications

In order to support copying arbitrary memory regions, dput's COPY treats the range as three regions: an unaligned head, a page-aligned middle, and an unaligned tail. Suppose we want to copy [addr, end). Put lowest=LOWER_PAGE(addr), start_page=PAGE_ALIGN(addr), end_page=LOWER_PAGE(end), highest=PAGE_ALIGN(end), where LOWER_PAGE(x) returns x rounded down to the nearest page-aligned address, and PAGE_ALIGN(x) returns x rounded up to the nearest page-aligned address. For example, with 4 KiB pages, LOWER_PAGE(0x1001)=LOWER_PAGE(0x1000)=0x1000, while PAGE_ALIGN(0x1001)=0x2000 and PAGE_ALIGN(0x1000)=0x1000.
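The four boundary addresses can be computed as follows. This is a minimal user-space sketch, assuming 4 KiB pages; LOWER_PAGE and PAGE_ALIGN follow the definitions in the text, while SKETCH_PAGE_SIZE, struct copy_bounds, and compute_bounds() are hypothetical names introduced for illustration.

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE ((uintptr_t)4096)   /* assumption: 4 KiB pages */

/* Round down / up to the nearest page-aligned address. */
#define LOWER_PAGE(x) ((uintptr_t)(x) & ~(SKETCH_PAGE_SIZE - 1))
#define PAGE_ALIGN(x) LOWER_PAGE((uintptr_t)(x) + SKETCH_PAGE_SIZE - 1)

struct copy_bounds {
    uintptr_t lowest;      /* LOWER_PAGE(addr) */
    uintptr_t start_page;  /* PAGE_ALIGN(addr) */
    uintptr_t end_page;    /* LOWER_PAGE(end)  */
    uintptr_t highest;     /* PAGE_ALIGN(end)  */
};

/* Compute the four page boundaries for a copy of [addr, end). */
struct copy_bounds compute_bounds(uintptr_t addr, uintptr_t end)
{
    struct copy_bounds b;
    b.lowest     = LOWER_PAGE(addr);
    b.start_page = PAGE_ALIGN(addr);
    b.end_page   = LOWER_PAGE(end);
    b.highest    = PAGE_ALIGN(end);
    return b;
}
```

For [0x1234, 0x5678) this gives lowest=0x1000, start_page=0x2000, end_page=0x5000, highest=0x6000. Note that when addr and end fall inside the same page and neither is aligned, start_page ends up greater than end_page, so a caller has to treat that case separately.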

A COPY operation might split the source's VMA list up to four times. We consider the following partition of [lowest, highest):

[lowest, addr) U [addr, start_page) U [start_page, end_page) U [end_page, end) U [end, highest)

The first and last regions must NOT be altered in the destination. If the destination has no previous mapping covering these regions, they will be zeroed out as a side effect of the copy, because PAGE_SIZE is the finest granularity of mapping in Linux. We don't COW the pages containing these regions, since doing so would risk copying too many bytes.

In order to maintain the correctness of the rmap code, we might split at lowest, start_page, end_page, and highest. Before we can COW, the source must have VMAs that start and end at these addresses.
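Whether a split is needed at a given address comes down to whether that address lies strictly inside an existing VMA. A rough user-space sketch of that check follows; struct range_vma, needs_split(), and count_splits() are hypothetical names, and the VMA list is modeled as a plain array rather than the kernel's data structures.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a VMA covering [vm_start, vm_end). */
struct range_vma {
    uintptr_t vm_start;
    uintptr_t vm_end;
};

/* True when addr is strictly inside vma, so no boundary exists there yet. */
bool needs_split(const struct range_vma *vma, uintptr_t addr)
{
    return vma->vm_start < addr && addr < vma->vm_end;
}

/*
 * Count how many of the four candidate addresses (lowest, start_page,
 * end_page, highest) would require a split. Repeated addresses are
 * counted once, since one split creates the boundary for all of them.
 */
size_t count_splits(const struct range_vma *vmas, size_t n,
                    const uintptr_t points[4])
{
    size_t count = 0;
    for (size_t i = 0; i < 4; i++) {
        bool seen = false;
        for (size_t j = 0; j < i; j++)
            if (points[j] == points[i])
                seen = true;
        if (seen)
            continue;
        for (size_t k = 0; k < n; k++) {
            if (needs_split(&vmas[k], points[i])) {
                count++;
                break;
            }
        }
    }
    return count;
}
```

For a single source VMA [0x1000, 0x8000) and the boundaries 0x1000, 0x2000, 0x5000, 0x6000, three splits are needed: 0x1000 already coincides with the start of the VMA.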

Once any necessary splitting has been done, the second, third, and fourth regions are examined. The second and fourth are copied manually using memcpy(), but the third region (which is page aligned) is copied using copy-on-write. The code is identical to the code that fork() uses to copy VMAs.
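The choice between memcpy() and COW for each region can be sketched as a small planning function. This is an illustration under assumptions, not dput's actual code: the names are hypothetical, 4 KiB pages are assumed, and the handling of a range that contains no whole page (a single memcpy step) is a guess at a reasonable fallback.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE ((uintptr_t)4096)   /* assumption: 4 KiB pages */

enum copy_method { COPY_MEMCPY, COPY_COW };

struct copy_step {
    uintptr_t start;            /* inclusive */
    uintptr_t end;              /* exclusive */
    enum copy_method method;
};

static uintptr_t page_floor(uintptr_t x)
{
    return x & ~(SKETCH_PAGE_SIZE - 1);
}

static uintptr_t page_ceil(uintptr_t x)
{
    return page_floor(x + SKETCH_PAGE_SIZE - 1);
}

/* Fill out[] with the steps needed to copy [addr, end); returns the count. */
size_t plan_copy(uintptr_t addr, uintptr_t end, struct copy_step out[3])
{
    size_t n = 0;
    if (addr >= end)
        return 0;

    uintptr_t start_page = page_ceil(addr);
    uintptr_t end_page = page_floor(end);

    if (start_page >= end_page) {
        /* Assumed fallback: no whole page in range, copy bytes manually. */
        out[n++] = (struct copy_step){ addr, end, COPY_MEMCPY };
        return n;
    }
    if (addr < start_page)      /* unaligned head: second region */
        out[n++] = (struct copy_step){ addr, start_page, COPY_MEMCPY };
    /* page-aligned middle: third region */
    out[n++] = (struct copy_step){ start_page, end_page, COPY_COW };
    if (end_page < end)         /* unaligned tail: fourth region */
        out[n++] = (struct copy_step){ end_page, end, COPY_MEMCPY };
    return n;
}
```

For [0x1234, 0x5678) the plan has three steps: memcpy of [0x1234, 0x2000), COW of [0x2000, 0x5000), and memcpy of [0x5000, 0x5678). A fully page-aligned range produces a single COW step.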

Links

The case of the overly anonymous anon_vma
Virtual Memory II: the return of objrmap
