Procedure calls in migration point

bhatnarf edited this page Jan 29, 2019 · 9 revisions

Cross-machine process migration happens when a process on the origin node issues the popcorn_migrate(int nid, void __user *uregs) system call. To express the relationship between machines clearly, we use the term origin (master) for the node that initiates a migration, and the term remote (slave) for the destination nodes of a migration.

(Although slave can be considered an offensive term, remote alone is sometimes ambiguous when describing something 'remote of a remote'.)

On the origin (master) side:

Okay, let's start at the very beginning: the migrate library call. In the popcorn-compile project, the library functions migrate and check_migrate are defined in lib/migration/src/migrate.c. The migrate library call invokes __migrate_shim_internal, which captures the register state and issues the system call SYS_sched_migrate (330). This traps into the kernel (described below); when the system call returns (a back migration), the library restores the program state and resumes process execution.

The popcorn kernel defines the syscall handler for vector 330 with SYSCALL_DEFINE2(popcorn_migrate, int, nid, void __user *, uregs). It checks the node id and invokes the process server routine process_server_do_migration, which in turn calls the central routine __do_migration(task, dst_nid, uregs).

Let's take a look at the __do_migration function. It allocates a struct remote_context and sends a message asking the remote node to clone the process execution there (via __request_clone_remote()). It relies on the following two important functions:

  1. The __request_clone_remote() function actually sends the message to the remote node. Specifically, it defines a clone_request_t *req request, assigns the process's memory information (e.g., code and data memory addresses) to this request, and sends the request with pcn_kmsg_post(PCN_KMSG_TYPE_TASK_MIGRATE, dst_nid, req, sizeof(*req)). (We will have another wiki page to describe the popcorn message layer.)

  2. __process_remote_works(): This is a big while loop waiting on a completion. For example, a back migration with message type PCN_KMSG_TYPE_TASK_MIGRATE_BACK is captured by the origin message handler handle_back_migration(), defined by DEFINE_KMSG_RW_HANDLER(back_migration, ...). The handler calls request_remote_work(pid, req) to complete the waiting completion.

In short, the origin node has the call graph as follows:

/* kernel/sched/core.c */
SYSCALL_DEFINE2(popcorn_migrate, int, nid, void __user *, uregs)
    call process_server_do_migration(current, nid, uregs)
        |
        V
        process_server_do_migration(struct task_struct *tsk, unsigned int dest_nid, void __user *uregs)
            /* verify if this is a forward migration */
            call __do_migration(tsk, dst_nid, uregs);
                |
                V
                __request_clone_remote(dst_nid, tsk, uregs);
                    prepare pcn msg request (exe_path, task info)  /* This involves the popcorn msg layer */
                    /* Sends a PCN_KMSG_TYPE_TASK_MIGRATE type message to remote's process_server */
                    pcn_kmsg_post(PCN_KMSG_TYPE_TASK_MIGRATE, dst_nid, req, sizeof(*req));

                __process_remote_works();
                    while (run) {
                        wait_for_completion_interruptible_timeout(...)
                        switch (req->header.type)
                            case ...
                    }

On the remote (slave) side:

The process server registers a popcorn message handler with REGISTER_KMSG_HANDLER(PCN_KMSG_TYPE_TASK_MIGRATE, clone_request). So when a PCN_KMSG_TYPE_TASK_MIGRATE message arrives at the remote node, the corresponding handler initiates the remote clone. It places a clone_remote_thread request onto popcorn's (globally-defined) workqueue (popcorn_wq, which is created via a call to create_singlethread_workqueue). This work item is serviced by clone_remote_thread, which scans the global remote_contexts list to see if there is already a struct remote_context that matches the thread's nid (node id) and tgid. If one exists, it adds this work request to the retrieved remote_context. If one does not exist, it allocates and initializes a struct remote_context, and then creates a remote_worker thread for this process.
This newly-created thread begins with a call to remote_worker_main, which initializes all process-related information, such as setting up the process's task_struct and its mm (via __construct_mm). It then calls __run_remote_worker, which manages the process's own workqueue, looping until something requests that it stop (rc->stop_remote_worker, which is set when the node receives a PCN_KMSG_TYPE_TASK_EXIT_ORIGIN message). Finally, the remote_worker sends PCN_KMSG_TYPE_TASK_EXIT_ORIGIN requests to the origin node to ask it to terminate any threads associated with the current task other than the remote_worker thread itself.

Regardless of whether popcorn needed to initialize a remote_context and task_struct for the process, it adds this clone request to the process-specific workqueue serviced by the process's remote_worker thread. The remote_worker services it with a call to __fork_remote_thread, which creates a new thread via kernel_thread; the new thread begins execution in remote_thread_main. remote_thread_main initializes the thread's task_struct (including register state) via the architecture-specific restore_thread_info. It then sends a PCN_KMSG_TYPE_TASK_PAIRING message to the origin node and returns, which causes the thread to jump into user space.