This is chapter 6 of a multi-part series on writing a RISC-V OS in Rust.
Processes are the whole point of the operating system. We want to start doing "stuff", which we'll fit into a process and get it going. We will update the process structure in the future as we add features to a process. For now, we need a program counter (which instruction is executing), and a stack for local memory.
We will not create our standard library for processes yet. In this chapter, we're just going to write kernel functions and wrap them into a process. When we start creating user processes, we will need to read from the block device and start executing instructions. That's quite a ways away, since we will need system calls and so forth.
Each process must contain some information so that we can run it: we need to know all of the registers, the MMU table, and the process' state.
The process structure isn't finished. I debated finishing it, but I think taking it step-by-step will help us understand what is needed and why each piece needs to be stored in a process structure.
The process contains a stack, which we are going to give to the sp (stack pointer) register. However, the stack field holds the bottom (lowest address) of the allocation, so we need to add the size to find where sp starts. Remember that a stack grows from high memory to low memory (you subtract to allocate and you add to deallocate).
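As a tiny illustration (the function and names here are hypothetical, not the kernel's actual code):

```rust
/// The stack grows downward, so the initial stack pointer sits one byte
/// past the end of the allocation: lowest address + size.
fn initial_sp(stack_bottom: *const u8, num_pages: usize) -> usize {
    const PAGE_SIZE: usize = 4096;
    stack_bottom as usize + num_pages * PAGE_SIZE
}
```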
The program counter will contain the address of the instruction to be executed next. We can get this through the mepc register (machine exception program counter). When we interrupt our process, whether with a context-switch timer, an illegal instruction, or as a response to a UART input, the CPU will put the address of the instruction about to be executed into mepc. We can resume our process by going back there and executing instructions again.
We are currently on a single-processor system. We will upgrade this to a multi-hart system later, but for now, it makes much more sense to keep things simple. The trap frame allows us to freeze-frame the process while we're handling it via the kernel. This is a MUST when switching back and forth. Could you imagine if a register you just set was changed by the kernel?
A process is defined by its context. The TrapFrame shows what it looks like on the CPU: (1) the general purpose registers (32 of these), (2) the floating point registers (32 of these too), (3) the MMU, (4) a stack for handling this process' interrupt context. I also store the hartid so that we don't have the same process running on two harts at the same time.
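Here is a sketch of how that trap frame might look in Rust. The field names are assumptions; what matters is that #[repr(C)] pins down the byte offsets the assembly code will rely on later:

```rust
// A sketch of the trap frame; #[repr(C)] fixes the layout so assembly
// can index fields by byte offset (8 bytes per element on RV64).
#[repr(C)]
#[derive(Clone, Copy)]
pub struct TrapFrame {
    pub regs:       [usize; 32], // byte   0: general purpose registers
    pub fregs:      [usize; 32], // byte 256: floating point registers
    pub satp:       usize,       // byte 512: MMU table for this process
    pub trap_stack: *mut u8,     // byte 520: stack for trap handling
    pub hartid:     usize,       // byte 528: hart currently running this
}
```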
We need to allocate memory for the process' stack and the process structure itself. We will be using the MMU, so we can give it a known starting point for all processes. I chose 0x2000_0000. When we make external processes and compile them, we need to know the starting memory address. Since this is virtual memory, it essentially can be anything we want.
The Process structure's implementation is fairly simple right now. Rust requires us to have some value for all fields, so we create a few helper functions to do just that.
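A minimal sketch of that structure and its state, with assumed field names (the Table type is the MMU page table from the earlier paging chapters; the helper that fills these fields in is sketched a bit further down):

```rust
pub enum ProcessState {
    Running,
    Sleeping,
    Waiting,
    Dead,
}

pub struct Process {
    frame:           TrapFrame,
    stack:           *mut u8,     // lowest address of the stack pages
    program_counter: usize,
    pid:             u16,
    root:            *mut Table,  // this process' MMU page table
    state:           ProcessState,
}
```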
When we create a process, we're not really "creating" it. Instead, we're allocating memory to hold the meta-data of a process. For now, all of the process code is stored as a Rust function. Later, we will load the binary instructions from the block device and execute that way. Notice that all we need to do is create a stack. I allocated 2 pages for a stack, giving the stack 4096 * 2 = 8192 bytes. This is small, but it should be more than enough for what we're doing.
Afterward, we allocate the page table and map all relevant portions for when the MMU is turned on. Keep in mind that we'll be executing in user mode, so all kernel functions will be off limits and will cause a page fault should we try to access them. In order to fully implement the process, we need system calls. These allow a user space process to request services, such as printing to the console.
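Putting the pieces together, a sketch of process creation might look like this. It assumes the alloc and zalloc page allocators and a TrapFrame::zero() constructor from earlier chapters, and it elides the actual page-table mapping calls:

```rust
const STACK_PAGES: usize = 2;
const PROCESS_STARTING_ADDR: usize = 0x2000_0000;

impl Process {
    pub fn new_default(func: fn()) -> Self {
        let mut ret_proc = Process {
            frame:           TrapFrame::zero(),
            stack:           alloc(STACK_PAGES),
            program_counter: PROCESS_STARTING_ADDR,
            pid:             1, // a real kernel hands out unique pids
            root:            zalloc(1) as *mut Table,
            state:           ProcessState::Waiting,
        };
        // The stack grows downward, so sp (x2) starts just past the last
        // byte of the two pages we allocated.
        ret_proc.frame.regs[2] =
            ret_proc.stack as usize + STACK_PAGES * 4096;
        // The process' code is the Rust function itself; its address and
        // the stack pages would be mapped into ret_proc.root here with
        // the map() routine from the MMU chapter, so user mode can reach
        // them once the MMU is on.
        let _code_paddr = func as usize;
        ret_proc
    }
}
```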
The process at the CPU level is handled whenever we hit a trap. We're going to use the CLINT timer as our context switch timer. For now, I've allocated a full 1 second to each process. This is way too slow for normal operation, but it makes it easier to debug as we see a process step through its execution.
Every time the CLINT timer hits, we are going to store the current context, jump into Rust (via m_trap in trap.rs), and then handle what needs to be done. I've chosen to use machine mode to handle interrupts since we don't have to worry about switching the MMU (it's off in machine mode) or running into recursive faults.
The CLINT timer is controlled through MMIO. A minimal sketch of arming it, assuming QEMU's virt machine (which maps the CLINT's mtimecmp for hart 0 at 0x0200_4000 and mtime at 0x0200_bff8), looks as follows:
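```rust
unsafe {
    let mtimecmp = 0x0200_4000 as *mut u64;
    let mtime    = 0x0200_bff8 as *const u64;
    // QEMU ticks mtime at 10 MHz, so adding 10_000_000 schedules the
    // next machine timer interrupt roughly one second from now.
    mtimecmp.write_volatile(mtime.read_volatile() + 10_000_000);
}
```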
The mtimecmp register stores a time in the future. When mtime hits this value, it interrupts the CPU with a "machine timer interrupt" cause. We can use this periodic timer to interrupt our process and switch to another. This is a technique called "time slicing", where we allocate a slice of the CPU's time to each process. The faster the timer interrupts, the more processes we can shovel through in a given time; however, each process can only execute a fraction of its instructions per slice. Furthermore, the context switch code (the m_trap_handler) gets executed for each and every interrupt.
The timer frequency is somewhat of an art. Linux went through this debate when it offered the 1,000Hz option (one thousand interrupts per second), meaning that each process was able to run for 1/1000th of a second. Granted, with the speed of processors today, that's a ton of instructions, but back in the day, this was controversial.
Our trap handler must be able to handle different trap frames since any process can be running (or even the kernel) when we need to hit the trap handler.
We have to manually manipulate the TrapFrame fields, so it is important to know the offsets. Anytime we change the TrapFrame structure, we need to make sure our assembly code reflects this change.
In the assembly code, we can see the first thing we do is freeze the currently running process. We're using the mscratch register to hold the TrapFrame of the currently executing process (or the kernel, which uses KERNEL_TRAP_FRAME). The csrrw instruction atomically stores the value of the t6 register into mscratch and returns mscratch's old value (the trap frame's memory address) into t6. Remember, we have to keep ALL registers pristine, and this is an excellent way to do so!
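A tiny sketch of that freeze step (the thirty-odd register saves that follow are elided):

```asm
# mscratch holds the address of the current trap frame (or the kernel's).
# csrrw writes t6 into mscratch and returns the old mscratch into t6 in
# one atomic step, so neither value is ever lost.
csrrw   t6, mscratch, t6
# t6 now points at the trap frame: save x1..x31 through it, then restore
# mscratch before handing control to Rust.
```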
We use the t6 register out of convenience. It is register number 32 (index 31), so it is the very last register when we loop through to save or restore the registers.
I use GNU's .altmacro to allow for a macro loop, sketched below, that saves and restores the registers. Many times you'll see all 32 registers being saved or restored one line at a time, but the macro loop cleans up the code significantly.
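Here is a sketch of those macros, assuming the general purpose registers sit at byte offset 0 of the trap frame, 8 bytes apiece:

```asm
.altmacro
.macro save_gp i, basereg=t6
    sd   x\i, ((\i)*8)(\basereg)
.endm
.macro load_gp i, basereg=t6
    ld   x\i, ((\i)*8)(\basereg)
.endm

# One loop instead of 30 hand-written stores (t6 itself is handled last).
.set i, 1
.rept 30
    save_gp %i
    .set  i, i+1
.endr
```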
This portion of the trap handler is used to forward the parameters to the Rust function m_trap. A sketch of its shape (the signature follows the calling convention described next; the body here is a placeholder) looks like the following:
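```rust
#[no_mangle]
extern "C" fn m_trap(epc: usize,
                     tval: usize,
                     cause: usize,
                     hart: usize,
                     status: usize,
                     frame: *mut TrapFrame) -> usize
{
    // Inspect the cause (timer? fault? UART?) and decide where execution
    // should resume. For now we simply go back to where we trapped.
    let _ = (tval, cause, hart, status, frame); // unused in this sketch
    epc
}
```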
By ABI convention, a0 is the epc, a1 is the tval, a2 is the cause, a3 is the hart, a4 is the status, and a5 is a pointer to the trap frame. You'll also notice that we return a usize, which is used to return the memory address of the next instruction to execute. Notice that the next instruction after calling m_trap loads the a0 register into mepc. Also by ABI convention, the return value from Rust is stored in a0.
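In sketch form, the tail of the handler might read:

```asm
call    m_trap          # Rust returns the next program counter in a0
csrw    mepc, a0        # mret will jump to this address
# ... restore the saved registers from the trap frame ...
mret
```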
Our scheduling algorithm (later) will require at least one process at all times. Just like Linux, we're going to call this process init.
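A placeholder init can be as simple as a function that never returns (a hypothetical sketch; ours will grow real duties later):

```rust
// init must never return: the scheduler assumes there is always at least
// one process it can pick.
fn init_process() {
    loop {
        // Nothing useful yet; just spin until the timer preempts us.
    }
}
```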
The point of this chapter is to show how a process will look in the abstract view. I'm going to improve upon this chapter by adding a scheduler and showing a process actually in motion. This will be the basis for our operating system. One additional thing we need for a process to actually do anything is to implement system calls, so you can look forward to that!