The Adventures of OS

RISC-V OS using Rust

This is chapter 1 of a multi-part series on writing a RISC-V OS in Rust.

I Need Your Support

Writing these posts is a pastime right now, as my full-time job is educating (mostly) undergraduate college students. I will always deliver content, but I could really use your help. If you're willing, please support me at Patreon (sgmarz).

I've just started and there is much to do! So, please join me!

Taking control of RISC-V

27 September 2019

Overview

Booting into RISC-V is fairly simple. There are many different ways, but I'm going to choose my own way. That is--start at the physical memory address 0x8000_0000. Luckily, QEMU reads ELF files, so it will know to put our code at that address. Throughout this process, we will glean a lot of information by looking at the qemu/hw/riscv/virt.c source code contained in QEMU. First, the memory map is listed right in the beginning:


static const struct MemmapEntry {
	hwaddr base;
	hwaddr size;
} virt_memmap[] = {
	[VIRT_DEBUG] =       {        0x0,         0x100 },
	[VIRT_MROM] =        {     0x1000,       0x11000 },
	[VIRT_TEST] =        {   0x100000,        0x1000 },
	[VIRT_CLINT] =       {  0x2000000,       0x10000 },
	[VIRT_PLIC] =        {  0xc000000,     0x4000000 },
	[VIRT_UART0] =       { 0x10000000,         0x100 },
	[VIRT_VIRTIO] =      { 0x10001000,        0x1000 },
	[VIRT_DRAM] =        { 0x80000000,           0x0 },
	[VIRT_PCIE_MMIO] =   { 0x40000000,    0x40000000 },
	[VIRT_PCIE_PIO] =    { 0x03000000,    0x00010000 },
	[VIRT_PCIE_ECAM] =   { 0x30000000,    0x10000000 },
};

So, our machine starts at byte 0 of the DRAM (VIRT_DRAM), which starts at address 0x8000_0000. When we get a little bit further along, we will be programming the CLINT (0x200_0000), PLIC (0xc00_0000), UART (0x1000_0000), and VIRTIO (0x1000_1000). Don't worry about what these mean now, but it's handy to see what will be coming up!
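To make the table a little more concrete, here is a minimal Rust sketch that runs on an ordinary host (it is not kernel code). It copies the entries we care about into a table and answers "which device owns this address?". The names Region, MEMMAP, and region_of are hypothetical, and the 128 MiB DRAM size is an assumption: the table above lists the DRAM size as 0x0, because QEMU takes the real size from the RAM configured for the machine.

```rust
/// One entry of the memory map: a name, a base address, and a size in bytes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Region {
    pub name: &'static str,
    pub base: u64,
    pub size: u64,
}

/// ASSUMPTION: 128 MiB of RAM. The table lists 0x0 for VIRT_DRAM's size.
pub const ASSUMED_DRAM_SIZE: u64 = 128 * 1024 * 1024;

/// The regions named in the text, with base and size copied from virt_memmap.
pub static MEMMAP: [Region; 5] = [
    Region { name: "CLINT", base: 0x0200_0000, size: 0x1_0000 },
    Region { name: "PLIC", base: 0x0c00_0000, size: 0x400_0000 },
    Region { name: "UART0", base: 0x1000_0000, size: 0x100 },
    Region { name: "VIRTIO", base: 0x1000_1000, size: 0x1000 },
    Region { name: "DRAM", base: 0x8000_0000, size: ASSUMED_DRAM_SIZE },
];

/// Return the region that contains `addr`, or None if nothing is mapped there.
pub fn region_of(addr: u64) -> Option<&'static Region> {
    MEMMAP
        .iter()
        .find(|r| addr >= r.base && addr - r.base < r.size)
}
```

For example, region_of(0x8000_0000) lands in DRAM, which is exactly where our _start will be placed, while region_of(0x1000_0100) is just past the end of the UART and returns None.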

After we've done that, we need to accomplish the following in RISC-V assembly:

Pick a bootloader CPU (typically id #0)
Clear the BSS section to 0
Go to Rust!

RISC-V assembly resembles MIPS assembly, except we don't have to prefix our registers. All of the instructions come from the RISC-V specification, and you can get your very own copy at: https://github.com/riscv/riscv-isa-manual. We are writing for RV64GC (RISC-V 64 bits, general extensions, and compressed instructions extension).

Pick a bootloader

At this point in time, we don't want to worry about parallelism, race conditions, or anything else that comes with that territory. Instead, we let one of our CPU cores (called HARTs [hardware threads] by RISC-V) do all the work. We are now going to dive into the privileged specification to figure out which registers we're talking about here. So, grab a copy at: https://github.com/riscv/riscv-isa-manual.
We will start in the depths of chapter 3.1.5 (Hart ID Register mhartid). This register will tell us our hart number. According to the spec, we must always have a hart id #0. So, let's use that one to boot.
Create a file boot.S in your src/asm/ directory. In here, we will boot into Rust.


# boot.S
# bootloader for SoS
# Stephen Marz
# 8 February 2019
.option norvc
.section .data

.section .text.init
.global _start
_start:
	# Any hardware threads (hart) that are not bootstrapping
	# need to wait for an IPI
	csrr	t0, mhartid
	bnez	t0, 3f
	# SATP should be zero, but let's make sure
	csrw	satp, zero
.option push
.option norelax
	la		gp, _global_pointer
.option pop

3:
	wfi
	j	3b

In here, csrr means "control and status register read", so we read our hart identifier into the register t0 and see if it is zero. If it isn't, we send it to be parked (the wfi loop at label 3). Afterward, we set the supervisor address translation and protection (satp) register to 0. This is how we will eventually control the MMU. Since we haven't the need for virtual memory yet, we disable it by writing zero into it with csrw (control and status register write). The reset vector of some boards will load mhartid into that hart's a0 register. However, some boards may choose not to do this, so I've opted to get the hart ID straight from the horse's mouth.

Clear the BSS

Global, uninitialized variables get the value 0 since these are allocated in the BSS section. However, since we're the OS, we are responsible for making sure that memory is 0. Luckily, with our linker script, there are two fields defined for us called _bss_start and _bss_end which tell us where the BSS section starts and ends, respectively. So, we add the following just below .option pop and right before 3:.


	# The BSS section is expected to be zero
	la 		a0, _bss_start
	la		a1, _bss_end
	bgeu	a0, a1, 2f
1:
	sd		zero, (a0)
	addi	a0, a0, 8
	bltu	a0, a1, 1b
2:

In here, we use sd (store doubleword [64-bits]) to store zero into the memory address a0 which progressively moves towards _bss_end.
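If the assembly is hard to follow, the same loop can be written as a short Rust sketch that runs on a host. The function name clear_words is hypothetical, and it works on any pair of pointers rather than on _bss_start and _bss_end. Like the assembly, it writes 8 bytes at a time, so it assumes the region starts on an 8-byte boundary and is a whole number of doublewords long. Whether that holds for the real BSS depends on the linker script.

```rust
/// Zero every 64-bit word in [start, end), mirroring the assembly loop.
///
/// # Safety
/// `start` and `end` must be 8-byte aligned, must bound a single writable
/// region, and `end` must not be below `start`.
pub unsafe fn clear_words(start: *mut u64, end: *mut u64) {
    let mut cur = start;
    // `bgeu a0, a1, 2f` skips an empty region; `bltu a0, a1, 1b` repeats.
    // A single `while` covers both checks.
    while cur < end {
        unsafe {
            // sd zero, (a0)
            cur.write_volatile(0);
            // addi a0, a0, 8  (one u64 is 8 bytes)
            cur = cur.add(1);
        }
    }
}
```

On a host you can try it on an ordinary array: take the array's start and end pointers, call clear_words, and every element reads back as 0 afterward.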

Going into Rust

Since most people or normal human beings don't like to stay in assembly for too long, we jump into Rust as soon as possible. Although, some might say that programming in Rust is the hard part. We won't be fighting with the borrow checker too much.

To get into Rust and put the CPU in a predictable mode, we will use the mret instruction, which is the trap return instruction. This allows us to set the mstatus register to our privilege mode. So, we add the following to boot.S:


# Control registers, set the stack, mstatus, mepc,
# and mtvec to return to the main function.
# li		t5, 0xffff;
# csrw	medeleg, t5
# csrw	mideleg, t5
la		sp, _stack
# We use mret here so that the mstatus register
# is properly updated.
li		t0, (0b11 << 11) | (1 << 7) | (1 << 3)
csrw	mstatus, t0
la		t1, kmain
csrw	mepc, t1
la		t2, asm_trap_vector
csrw	mtvec, t2
li		t3, (1 << 3) | (1 << 7) | (1 << 11)
csrw	mie, t3
la		ra, 4f
mret
4:
	wfi
	j	4b

There is a lot here, and some is commented out. However, what we're doing is setting bits [12:11] of mstatus (the MPP, or "previous privilege", field) to 0b11, which is "machine mode". This is the mode mret will put us in, and it will give us access to all of the instructions and registers. Granted, we're probably already in that mode, but let's do it again.

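Then bit [7] (MPIE) and bit [3] (MIE) will enable interrupts at a coarse level. MIE is the global machine-mode interrupt enable, and MPIE is the value that mret copies back into MIE. However, we will still need to enable particular interrupts through the mie (machine interrupt enable) register, which we do at the end.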
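The two li constants in the listing are easier to read with names attached. Here is a small host-side Rust sketch that rebuilds them from named bits. The bit positions and field names (MIE, MPIE, MPP, MSIE, MTIE, MEIE) come from the privileged specification. The constant and function names themselves are hypothetical.

```rust
// Bits in the mstatus register.
pub const MSTATUS_MIE: u64 = 1 << 3; // global machine-mode interrupt enable
pub const MSTATUS_MPIE: u64 = 1 << 7; // previous MIE, copied into MIE by mret
pub const MSTATUS_MPP_MACHINE: u64 = 0b11 << 11; // previous privilege = machine

// Bits in the mie register.
pub const MIE_MSIE: u64 = 1 << 3; // machine software interrupt enable
pub const MIE_MTIE: u64 = 1 << 7; // machine timer interrupt enable
pub const MIE_MEIE: u64 = 1 << 11; // machine external interrupt enable

/// The value boot.S loads into t0 before `csrw mstatus, t0`.
pub const fn boot_mstatus() -> u64 {
    MSTATUS_MPP_MACHINE | MSTATUS_MPIE | MSTATUS_MIE
}

/// The value boot.S loads into t3 before `csrw mie, t3`.
pub const fn boot_mie() -> u64 {
    MIE_MSIE | MIE_MTIE | MIE_MEIE
}
```

Written out in hex, boot_mstatus() is 0x1888 and boot_mie() is 0x888. The shifts (3, 7, 11) appear in both constants but mean different things in each register.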

The mepc register is the "machine exception program counter", which is the memory address we are going to return to. The symbol kmain is defined in Rust and is our escape ticket out of assembly.

The mtvec (machine trap vector) register holds the address of a kernel function that will be called whenever there is a trap, such as a system call, illegal instruction, or even a timer interrupt.

We set ra (return address) to park after we're done with Rust's main function. Then, the mret instruction takes everything we just did and jumps back through the mepc register, which is where we finally enter Rust!
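To see why this sequence lands us in kmain in machine mode with interrupts on, it can help to model mret as plain data manipulation. The sketch below is a simplified host-side model, not real hardware behavior in full: it covers only the MIE, MPIE, and MPP fields and the program counter, it assumes user mode is implemented, and it does not model the reserved MPP value 0b10. The Hart struct, the Mode enum, and the addresses in the usage note are hypothetical.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Mode {
    User,
    Supervisor,
    Machine,
}

/// The small slice of hart state that mret touches in this model.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Hart {
    pub pc: u64,
    pub mode: Mode,
    pub mstatus: u64,
    pub mepc: u64,
}

const MIE: u64 = 1 << 3;
const MPIE: u64 = 1 << 7;
const MPP_SHIFT: u32 = 11;
const MPP_MASK: u64 = 0b11 << MPP_SHIFT;

/// Simplified model of the mret instruction.
pub fn mret(h: &mut Hart) {
    // The new privilege mode comes from the MPP field.
    h.mode = match (h.mstatus & MPP_MASK) >> MPP_SHIFT {
        0b00 => Mode::User,
        0b01 => Mode::Supervisor,
        _ => Mode::Machine, // 0b11; the reserved value 0b10 is not modeled
    };
    // MIE takes the value that was saved in MPIE.
    if h.mstatus & MPIE != 0 {
        h.mstatus |= MIE;
    } else {
        h.mstatus &= !MIE;
    }
    // MPIE is set to 1, and MPP drops to the least-privileged mode
    // (user mode, assuming it is implemented).
    h.mstatus |= MPIE;
    h.mstatus &= !MPP_MASK;
    // Finally, execution continues at mepc.
    h.pc = h.mepc;
}
```

Starting from mstatus = 0x1888 with mepc pointing at some made-up kmain address, one call to mret leaves the model in machine mode, with pc equal to mepc and MIE set. That is the state boot.S is aiming for.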

(added 29-Sep-2019) We've referenced asm_trap_vector, but we haven't written it. We will soon. For now, create a file called trap.S under src/asm/ and add the following to it:


# trap.S
# Assembly-level trap handler.
.section .text
.global asm_trap_vector
asm_trap_vector:
    # We get here when the CPU is interrupted
	# for any reason.
    mret

The world of bare-metal Rust!

Now that we've come to Rust, we need to edit lib.rs, which was created for us by the cargo command. Do not change the name of lib.rs, otherwise cargo will never know what we're talking about. Instead, lib.rs will be our entry point, utilities, and what we use to import other Rust modules. Don't think of kmain as code that runs from start to finish. Instead, it will initialize everything we need and then cause the "big bang", that is, get everything in motion. Operating systems are mainly asynchronous. We will be using a timer interrupt to cattle-prod our kernel into action, so we can't use the single-threaded programming approach we might be used to.

When you open lib.rs, delete everything in it. There is nothing we need in there or anything we can use for our kernel. Instead, per Rust, we need to define a few things. Since we aren't using the standard library (it isn't built for our kernel anyway), we have to define abort and panic_handler before we can continue. So, here goes nothing.


// Steve Operating System
// Stephen Marz
// 21 Sep 2019
#![no_std]
#![feature(panic_info_message,asm)]

// ///////////////////////////////////
// / RUST MACROS
// ///////////////////////////////////
#[macro_export]
macro_rules! print
{
	($($args:tt)+) => ({

	});
}
#[macro_export]
macro_rules! println
{
	() => ({
		print!("\r\n")
	});
	($fmt:expr) => ({
		print!(concat!($fmt, "\r\n"))
	});
	($fmt:expr, $($args:tt)+) => ({
		print!(concat!($fmt, "\r\n"), $($args)+)
	});
}

// ///////////////////////////////////
// / LANGUAGE STRUCTURES / FUNCTIONS
// ///////////////////////////////////
#[no_mangle]
extern "C" fn eh_personality() {}
#[panic_handler]
fn panic(info: &core::panic::PanicInfo) -> ! {
	print!("Aborting: ");
	if let Some(p) = info.location() {
		println!(
					"line {}, file {}: {}",
					p.line(),
					p.file(),
					info.message().unwrap()
		);
	}
	else {
		println!("no information available.");
	}
	abort();
}
#[no_mangle]
extern "C"
fn abort() -> ! {
	loop {
		unsafe {
            // The asm! syntax has changed in Rust.
            // For the old, you can use llvm_asm!, but the
            // new syntax kicks ass--when we actually get to use it.
			asm!("wfi");
		}
	}
}

We use the #![no_std] to tell Rust that we won't be using the standard library. Then we ask Rust to allow the panic info message and inline-assembly features for our code. The first thing we do is create an empty eh_personality function. The #[no_mangle] turns off Rust's name mangling so the symbol is exactly eh_personality. Then, the extern "C" tells Rust to use C-style ABI.

Then, the #[panic_handler] tells Rust that the very next function we define will be our panic handler. Rust calls panic for several reasons, and we will be implicitly calling it with our assertions. What I have this function do is print out the source file and line number of what caused the panic. However, we haven't written print! or println!, yet, but we know the format of print and println from Rust. As a side note, the -> ! means that this function won't return. If Rust detects that it can return, the compiler will give you an error.
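Since print! is still empty, it may not be obvious what the three println! arms buy us. The host-side sketch below uses the same concat! forwarding, but sends the text into an in-memory buffer so the result can be inspected. This is only an illustration under assumptions: it uses std, which the kernel cannot, and the names kprint!, kprintln!, and OUTPUT are hypothetical stand-ins chosen to avoid clashing with the standard macros.

```rust
use std::sync::Mutex;

/// Stand-in for an output device: everything "printed" lands in this buffer.
pub static OUTPUT: Mutex<String> = Mutex::new(String::new());

/// Format the arguments and append them to OUTPUT.
macro_rules! kprint {
    ($($args:tt)+) => ({
        use std::fmt::Write as _;
        // Writing to a String cannot fail, so the result is ignored.
        let _ = write!(OUTPUT.lock().unwrap(), $($args)+);
    });
}

/// Same three arms as the chapter's println!: each one appends "\r\n"
/// to the format string with concat! and forwards to kprint!.
macro_rules! kprintln {
    () => ({
        kprint!("\r\n")
    });
    ($fmt:expr) => ({
        kprint!(concat!($fmt, "\r\n"))
    });
    ($fmt:expr, $($args:tt)+) => ({
        kprint!(concat!($fmt, "\r\n"), $($args)+)
    });
}
```

With these in place, kprint!("Aborting: ") followed by kprintln!("line {}, file {}", 42, "src/lib.rs") leaves "Aborting: line 42, file src/lib.rs\r\n" in OUTPUT. The concat! call glues the line ending onto the format string at compile time, which is why the format string has to be a literal.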

Finally, we write the abort function. All this does is keep looping with the wfi (wait for interrupt) instruction. This powers down the hart it's running on until another interrupt.

WE ARE IN RUST!

We are officially in Rust, so we need to write the entry point that we specified in boot.S, which was kmain. So, adding to our lib.rs code:


#[no_mangle]
extern "C"
fn kmain() {
	// Main should initialize all sub-systems and get
	// ready to start scheduling. The last thing this
	// should do is start the timer.
}

When kmain returns, it hits that wfi loop and hangs. This is what we expect since we really haven't told the kernel to do anything, yet.

So, there you have it. We're in Rust. Unfortunately, until we write what print! actually does, we won't see anything printed to the screen. But, everything should compile! Typically, good writers end with some quote or closing statement, but I'm not a good writer.

When you type make run, your operating system will try to boot and enter Rust, which it will. However, since we haven't written the driver to communicate with the OS, nothing will appear. Type CTRL-A and then hit 'x' to quit the emulator. You can also see where you are by typing CTRL-A and then hitting 'c', which puts you in the QEMU console. From there, type 'info registers' to see where in your OS the emulator is.
