BetterOS.org : an attempt to make computer machines run better

introduction

This tutorial is meant to be a practical guide to assembly programming for the RISCV architecture running Linux. The platform is relatively young, so future implementations, or the development tools for them, may change. RISCV is an open source instruction set, and there already exist several soft-core implementations, emulators, and companies promising future silicon implementations. Where these implementations differ in a way that is relevant to this tutorial, you should assume that the tutorial was written for the SiFive HiFive Unleashed development board, which is (at the time of writing) the only available development board featuring a RISCV 64 processor.
Because the RISCV ISA is so new, it has understandably been written about much less than older platforms, and good assembly language tutorials are few and far between (if any exist at all). This tutorial intends to remedy the situation.

hardware and software

At the time of this writing, there exists a single development board for RISCV 64 with a real silicon chip that runs Linux. If you have that board, you are set for hardware. I sincerely hope that new chips and platforms emerge in the near future; if you happen to have one of those, you should be able to follow along just as easily. If you have neither of those options available, then I recommend using a RISCV 64 emulator. I personally like qemu, since I am already familiar with it and I like how it works, but there are many others (like SPIKE, which is supposedly the "gold standard"). You should be able to follow along on an emulator just as easily. In fact, if you choose qemu, you can use qemu's user mode emulation for most of this tutorial and do all of the work on another Linux-based platform. There are also implementations of RISCV that can be loaded onto an FPGA. Personally, I do not think that this approach is ideal, since I have had some stability issues running RISCV cores on FPGAs, but you might have more luck than I did.
For software, you will need a copy of GNU binutils that targets RISCV 64. If you are developing on a non-RISCV platform (such as x86), then you will need a cross binutils, which is much easier to compile than the full toolchain. However, SiFive provides a set of scripts (on github) which will take care of building the whole cross toolchain for you, and that is probably the easiest option for most people. You will of course need a text editor, a terminal emulator, and the normal unix-like tools. Beyond that, there isn't much else required.

assembly introduction

Programming in assembly language is different from programming in other languages. In other programming languages, you learn the basic constructs and concepts once, and then other languages come easily because they are all almost the same. However, assembly is not like other languages, and knowing other languages will not really help you when learning assembly. In fact, even prior knowledge of assembly language for one platform doesn't really translate to assembly for another platform. There are some general concepts that carry over, and some operating system knowledge is generally applicable regardless of the processor, but the bulk of assembly is dependent on the ISA you are targeting. For this reason, I never say that one "learns assembly"; instead, one learns the target platform.
RISCV is a little bit different from other architectures. It has been designed from scratch with well thought out design goals, and it doesn't carry over the poor decisions from legacy systems. It even includes some future-proofing in its design, as it can be extended to a 128 bit architecture.
The core ISA itself is quite small, and literally fits on one page. Likewise, the RISCV extension instructions make up about one page of text. However, that doesn't mean programming in RISCV assembly will be a walk in the park. If you are familiar with x86, then you might be used to having an instruction available for pretty much everything you want to do, but in RISCV, it might take several instructions to accomplish the same thing. In addition, RISCV is "load/store" based, which means that you won't be able to do things like add a value from memory to a register like you can in x86. So RISCV assembly can require a little more thought than the CISC approach.

hello world

Below is a RISCV64 implementation of the classic "hello world" program. If you have ever written assembly for another platform, this may look a bit odd: notice that it has no "mov" instructions, and that many instructions take 3 operands. Like I said, RISCV is different... Don't panic though, we will go through this code and explain what it all means.
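.section .data
msg:
    .string "hello world\n"    # 12 bytes of text, then a zero byte we never print

.section .text
.globl _start
_start:
    addi a7,x0,64              # system call number 64: write
    addi a0,x0,1               # first argument: file descriptor 1 (stdout)
    lui  a1,%hi(msg)           # second argument: upper 20 bits of msg's address
    addi a1,a1,%lo(msg)        #   ...plus the lower 12 bits
    addi a2,x0,12              # third argument: number of bytes to write
    ecall                      # trigger the system call

    addi a7,x0,93              # system call number 93: exit
    xor  a0,a0,a0              # first argument: exit status 0
    ecall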

If you are completely unfamiliar with assembly language, this probably looks very complicated and confusing, but don't worry. It really isn't as complicated as it looks, and we will go over the whole thing thoroughly.

Before we explain the code, however, we need to talk about how programs work. In modern operating systems, executables are divided up into parts called sections (or segments). Each section contains a different type of information, and when the operating system loads the executable into memory, it gives each section its own permissions (like read, write, and execute). If a program violates these section permissions, the operating system flags this as an error and stops the program. This is known as a "segmentation fault" (which may already be familiar to you).
Programming in assembly is all about designing the executable; very little of it is done automatically. So, our assembly code needs to be divided into sections, just like the executable.
The .data section gets loaded as readable and writable. This makes it very well suited for storing data, but not well suited for storing code, since it is not executable. The .text section, however, is marked as readable and executable, which makes it suitable for storing executable code, as well as read-only data. In assembly, we can use the assembler directive ".section" to tell the assembler that everything after that point (and before the next .section directive) should be placed into the specified section.

In case you were confused, the term I used, "directive", does have a very specific meaning. A "directive" is code that only conveys information to the assembler (the program which processes the code). It directs the assembler to do something, but it doesn't result in anything actually being added to the final executable. This contrasts with an "instruction", which we will learn about soon.

Look at the first line in the data section. You might guess that this is a variable, but you would be wrong. Assembly doesn't really have variables. Instead, it has a simpler mechanism called labels. A label is a way for you to reference a specific memory location elsewhere in your code. "msg" here is a label, and since it is the first thing in the .data section, msg will point to the first thing in the .data section.
Following the label is the text ".string". If you are familiar with another programming language, you might guess that this is a type. However, assembly doesn't have types; instead, it uses a much simpler mechanism. .string is actually what is known as a pseudo-operation.
Pseudo-operations are very similar to actual instructions. Both take operands, and both are encoded directly into the final executable. The difference, however, is that what a pseudo-operation emits is not meant to be decoded and executed by the processor, even if it sits in an executable area of memory. The .string pseudo-operation takes a quoted list of characters and encodes that into the executable, followed by a terminating zero byte. In this case, the operand is our "hello world" text, so .string results in that text getting added to the .data section of the executable. Also, since this text is located at the beginning of the .data section, and we already created a label for the beginning of the data section, the label "msg" effectively points to the data we are adding.
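To make the relationship between labels and emitted data a bit more concrete, here is a small Python model of a data section. It is only a sketch of the idea, not how the assembler is actually implemented. The DataSection class and the "after_msg" label are made up for illustration, and the trailing zero byte assumes the GNU assembler's behaviour for .string.

```python
# Toy model of how an assembler lays out a data section.
# Assumption: .string appends a terminating zero byte, as GNU as does.
# DataSection is a hypothetical helper, not part of any real tool.

class DataSection:
    def __init__(self):
        self.data = bytearray()   # bytes emitted so far
        self.labels = {}          # label name -> offset into the section

    def label(self, name):
        # A label only records the current position; it emits no bytes.
        self.labels[name] = len(self.data)

    def string(self, text):
        # .string emits the characters followed by a zero byte.
        self.data += text.encode("ascii") + b"\x00"

    def byte(self, value):
        # .byte emits a single byte with the given value.
        self.data.append(value)


section = DataSection()
section.label("msg")            # msg: refers to offset 0
section.string("hello world")   # 11 characters plus the zero byte
section.label("after_msg")      # hypothetical second label, for illustration

print(section.labels)
print(bytes(section.data))
```

Notice that defining a label adds nothing to the data; it only remembers where the next emitted byte will land.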

Next, we move on to the .text section. As I already mentioned, the .text section is executable, so this is where we will put our actual program code. However, before we start writing code, we need to define where the start of our program is. The actual start of the program is not decided by the assembler at all; it's recorded in one of the executable file's headers, which get put there by the linker. By default, the linker will look for a symbol called "_start" and use that as the program entry point. We have a label named _start, but this in itself isn't good enough. Labels normally aren't exported to the symbol table (which is the list of named functions and data within the file, along with the addresses where those parts of the executable should be loaded); they are only known by the assembler at assembly time. For the linker to find the _start symbol, it needs to get exported to the symbol table. To accomplish this, we use the .globl assembler directive, which tells the assembler to include the symbol in the symbol table.

Now that we have created the _start symbol and told the assembler that we want it to be exported, we can start to write the actual executable code, the fun part.
Each line of executable code consists of an instruction and usually a few operands. Which operands are needed depends on the instruction being used. On RISCV, each instruction, along with all its operands, gets encoded into 32 bits (except for the special "compressed" instructions, which take 16).

What we need to do is make a system call to tell Linux to print "hello world" in the terminal. In order to do that, we will need to put the right values into the right registers, and then execute the right instruction to trigger a system call. The first of these values we need to set is the system call number. The system call we want is the "write" system call, which is number 64 on RISCV. System call numbers vary from platform to platform, as do the methods of making syscalls, so what we do here is specific to RISCV Linux. To make a system call on RISCV, we put the arguments for the system call into registers a0 through a5, and the system call number goes into the a7 register. This gives us the ability to make syscalls with up to 6 arguments, which is fine because there are no system calls defined with more than 6.
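The register setup described above can be written down as a small lookup table. The Python below is only a sketch of the convention, not real system call code: the syscall_registers helper and the MSG_ADDRESS value are invented for illustration, while the numbers 64 (write) and 93 (exit) are the ones used in the hello world program.

```python
# Sketch of which registers hold what just before ecall on RISCV Linux.
SYS_WRITE = 64   # "write" system call number, as used in the program
SYS_EXIT = 93    # "exit" system call number, as used in the program

def syscall_registers(number, *args):
    """Return a dict of register name -> value for one system call."""
    if len(args) > 6:
        raise ValueError("no system call takes more than 6 arguments")
    # Arguments go into a0, a1, a2, ... in order.
    regs = {f"a{i}": value for i, value in enumerate(args)}
    # The system call number always goes into a7.
    regs["a7"] = number
    return regs

MSG_ADDRESS = 0x11000  # hypothetical address of msg, made up for this example

# write(1, msg, 12): file descriptor 1 is standard output.
write_regs = syscall_registers(SYS_WRITE, 1, MSG_ADDRESS, 12)
# exit(0)
exit_regs = syscall_registers(SYS_EXIT, 0)

print(write_regs)
print(exit_regs)
```

Each entry in these dictionaries corresponds to one register that the hello world program fills in before executing ecall.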

What exactly is a register though? I'm glad you asked. Most people are familiar with RAM (memory), and if you are a C programmer, you probably have experience working directly with it. However, computers (especially RISCV) can't actually perform most types of operations directly on data stored in memory. Instead, most instructions operate on registers, which are a type of storage located on the processor chip. This storage is very fast; it's the fastest data storage on your computer. In fact, in assembly language, we consider memory access to be slow. However, there is only a very limited number of registers available. RISCV has 32 general purpose registers. Of those, 31 are 64 bits wide on RISCV64, and one (x0) is hardwired to zero, so it is effectively 0 bits wide. Each register has its own name. On RISCV they are simply named x0 through x31, but most have an alias that helps to signify how they should be used. For instance, the registers used for storing arguments, a0-a7, are actually aliases for registers x10-x17. Now that we are armed with that knowledge, let's actually start setting up the needed registers.

If you are familiar at all with x86 assembly, you might think that we will want to use some kind of move instruction to load a value into a register. However, the RISCV base instruction set doesn't actually have a move instruction, so we need another method to load a value into a register. RISCV has a really convenient way to accomplish this: the addi instruction, which stands for "add immediate". Remember I told you that there were 31 64-bit registers, and one 0 bit register? That x0 register might seem worthless, but it actually has some very useful properties. When we write to the x0 register, nothing happens, but if we read from it, we will always get 0. It still might seem pretty useless, but there is one little detail which changes things: the addi instruction takes three operands, a destination register, a source register, and an immediate value to add to it. So, the three operand add immediate instruction can be used in exactly the same way as a two operand move immediate instruction, by adding an immediate to the x0 register and storing the result in the destination. This is what we are doing here: adding the system call number to x0, and storing the result in a7, which is where the system call number should be when the system call is triggered.
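The behaviour of x0 and addi described above can be modelled in a few lines of Python. This is a simplified sketch, not a real emulator: the RegisterFile class and the addi function are invented for illustration, and the only facts it relies on are that x0 always reads as 0, that writes to x0 are discarded, and that a7 is an alias for x17.

```python
# Minimal model of the RISCV64 integer registers and the addi instruction.
# RegisterFile and addi are hypothetical helpers for illustration only.

MASK64 = 0xFFFFFFFFFFFFFFFF  # registers are 64 bits wide on RISCV64

class RegisterFile:
    def __init__(self):
        self.regs = [0] * 32

    def read(self, index):
        # x0 always reads as zero, no matter what was written to it.
        return 0 if index == 0 else self.regs[index]

    def write(self, index, value):
        # Writes to x0 are silently discarded.
        if index != 0:
            self.regs[index] = value & MASK64

def addi(rf, rd, rs1, imm):
    # addi takes a 12 bit signed immediate.
    if not -2048 <= imm <= 2047:
        raise ValueError("immediate does not fit in 12 signed bits")
    rf.write(rd, rf.read(rs1) + imm)

X0 = 0    # the zero register
A7 = 17   # a7 is an alias for x17

rf = RegisterFile()
addi(rf, A7, X0, 64)   # addi a7,x0,64 : behaves like "move 64 into a7"
addi(rf, X0, X0, 5)    # addi x0,x0,5  : the result is thrown away

print(rf.read(A7))
print(rf.read(X0))
```

Because reading x0 always gives 0, adding an immediate to it simply produces the immediate itself, which is why addi can stand in for a move.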

We mentioned "immediates" in the last paragraph, so let's define exactly what an immediate is. An immediate is a number specified directly as an operand to an instruction. This includes any literal number, but it also includes labels, since a label is really just a human readable placeholder for a memory address; when the program is assembled and linked, each use of a label is replaced with the correct memory address for that label. Immediates are special because they actually get encoded into the instruction, which is great because that avoids an additional memory access after the processor decodes the instruction. In RISCV, each instruction gets encoded as a 32 bit value. This 32 bit value contains the instruction's 'opcode' and all its operands in binary format. Naturally, this puts some limits on how long an immediate can be, and some instructions have more or less space reserved for immediates than others do, depending on how long the opcode is and how many operands there are. So RISCV needs to employ a few "tricks" to load larger immediate values into a register.
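To see how little room an immediate really has, it helps to pack an instruction by hand. The Python sketch below encodes addi using the standard RISCV I-type layout (12 bits of immediate, 5 bits of source register, 3 bits of funct3, 5 bits of destination register, 7 bits of opcode). The encode_addi function is a made-up helper for illustration, not part of any assembler.

```python
# Pack an addi instruction into its 32 bit encoding (I-type format):
#   bits 31..20  immediate (12 bits, signed)
#   bits 19..15  rs1, the source register
#   bits 14..12  funct3 (0 for addi)
#   bits 11..7   rd, the destination register
#   bits  6..0   opcode (0x13 for register-immediate arithmetic)

OPCODE_OP_IMM = 0x13
FUNCT3_ADDI = 0x0

def encode_addi(rd, rs1, imm):
    # Only 12 bits are available, so the immediate range is -2048..2047.
    if not -2048 <= imm <= 2047:
        raise ValueError("addi immediate must fit in 12 signed bits")
    return (((imm & 0xFFF) << 20)
            | (rs1 << 15)
            | (FUNCT3_ADDI << 12)
            | (rd << 7)
            | OPCODE_OP_IMM)

# addi a7,x0,64 : a7 is register x17, x0 is register 0
word = encode_addi(rd=17, rs1=0, imm=64)
print(f"{word:08x}")
```

Anything outside the 12 bit range cannot be expressed in a single addi, which is why larger values need more than one instruction.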
The first "trick" is the lui instruction, which stands for "load upper immediate". Along with the %hi() assembler function, the lui instruction loads the upper 20 bits of a 32 bit immediate into bits 12 through 31 of a register, and clears the lower 12 bits. Once the upper 20 bits are loaded into a register, we can then use the addi instruction, along with the %lo() function, to add in the lower 12 bits, making up a 32 bit value. On RISCV64, this value gets sign extended to 64 bits.
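The arithmetic behind %hi() and %lo() has one subtlety worth spelling out: addi treats its 12 bit immediate as signed, so when the low 12 bits are 0x800 or more, the upper part has to be rounded up by one to compensate. The Python below is a sketch of that split for 32 bit values. The function names split_hi_lo and lui_addi are made up for illustration, and the example addresses are arbitrary.

```python
# Split a 32 bit value the way %hi() and %lo() do, then put it back
# together the way a lui followed by an addi would.

def split_hi_lo(value):
    """Return (hi, lo): a 20 bit lui operand and a signed 12 bit addi operand."""
    # Adding 0x800 rounds hi up whenever lo will come out negative.
    hi = ((value + 0x800) >> 12) & 0xFFFFF
    lo = value & 0xFFF
    if lo >= 0x800:
        lo -= 0x1000          # reinterpret as a signed 12 bit number
    return hi, lo

def lui_addi(hi, lo):
    """Recombine: lui places hi in bits 12..31, addi then adds lo."""
    return ((hi << 12) + lo) & 0xFFFFFFFF

# Arbitrary example addresses, chosen to show both cases.
for address in (0x00011000, 0x00012FFC):
    hi, lo = split_hi_lo(address)
    print(hex(address), hex(hi), lo, hex(lui_addi(hi, lo)))
```

For the second address the low part comes out as -4 and the high part is one larger than a plain shift would give, yet the two still add up to the original value.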



Please note that this tutorial is not yet complete. It was last updated March 24th, 2018, and I am working on it frequently. If you have any comments or corrections, please email me at prushik@betteros.org.