Friedrich-Alexander-Universität Erlangen-Nürnberg  /   Technische Fakultät  /   Department Informatik
Assignment 2: System Calls

Our OS lacks a synchronous way back from user mode (ring 3) to kernel mode (ring 0). To overcome this issue, you have to introduce a system call interface.

1. Interrupt-based System Calls

A very basic approach is triggering a software interrupt (vector 0x80 like Linux, for example) to switch to the kernel.

Preparation

Triggering a trap via the int instruction is subject to a privilege check: when used in ring 3 without special preparation, it just causes a General Protection Fault (exception number 13) instead of invoking the desired system call vector. However, it is possible to modify the Interrupt Descriptor Table (IDT) and allow triggering a specific system call vector from ring 3. Additionally, since our interrupt_handler is designed for device interrupts, you are advised to create a custom system call entry (in assembly) with a corresponding high-level handler function (in C++). Register the new entry function using IDT::handle() for your system call vector (the parameter dpl defines the allowed privilege level).
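To illustrate what the dpl parameter ends up controlling, here is a minimal sketch of how a 64-bit IDT gate encodes the descriptor privilege level. The struct and the helper make_gate are hypothetical and not part of the course framework (IDT::handle() presumably does something comparable internally), and the choice of an interrupt gate (type 0xE) rather than a trap gate is an assumption.

```cpp
#include <cstdint>

// 64-bit IDT gate descriptor (16 bytes). Hypothetical type for illustration;
// the course framework has its own representation behind IDT::handle().
struct GateDescriptor {
    uint16_t offset_low;   // handler address bits 0..15
    uint16_t selector;     // kernel code segment selector
    uint8_t  ist;          // interrupt stack table index (0 = unused)
    uint8_t  type_attr;    // present bit | DPL | gate type
    uint16_t offset_mid;   // handler address bits 16..31
    uint32_t offset_high;  // handler address bits 32..63
    uint32_t reserved;     // must be zero
};

static_assert(sizeof(GateDescriptor) == 16, "an x64 gate occupies 16 bytes");

constexpr uint8_t GATE_PRESENT   = 0x80;  // bit 7: descriptor is valid
constexpr uint8_t GATE_INTERRUPT = 0x0E;  // bits 0..3: 64-bit interrupt gate

// Hypothetical helper: builds a gate for the given handler address.
// dpl is the lowest privilege (highest ring number) allowed to use `int`
// on this vector; dpl = 3 makes the vector reachable from ring 3.
constexpr GateDescriptor make_gate(uint64_t handler, uint16_t selector, uint8_t dpl) {
    return GateDescriptor{
        static_cast<uint16_t>(handler & 0xFFFF),
        selector,
        0,
        static_cast<uint8_t>(GATE_PRESENT | ((dpl & 0x3) << 5) | GATE_INTERRUPT),
        static_cast<uint16_t>((handler >> 16) & 0xFFFF),
        static_cast<uint32_t>(handler >> 32),
        0
    };
}
```

With this encoding, a device interrupt gate (dpl = 0) gets the attribute byte 0x8E, while a gate that may be triggered from ring 3 (dpl = 3) gets 0xEE.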

Passing Arguments

Since the stack is switched from user to kernel, passing arguments over the stack would be highly cumbersome. You are better off passing the arguments via registers only. Conveniently, this is also the default for (the first six) function parameters according to the x64 System V ABI (Section 3.2.3).

For the upcoming assignments, you should prepare for system calls with up to five parameters. Since all system calls use the same vector, you'll also need an identifier to distinguish them – you can handle them in one big switch statement.

It is strongly recommended to extensively test the passing of all five arguments and the return value using a custom test system call.

Note
If done properly, you are able to avoid copying or saving/restoring registers – it's not just less assembly code but also faster.
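One possible shape for the high-level handler is sketched below. It assumes (this is a design choice, not something the assignment prescribes) that the system call identifier travels as the sixth parameter, so that the five arguments already sit in the registers the C++ handler expects. The names Syscall, sys_test and syscall_handler are hypothetical.

```cpp
#include <cstddef>

// Hypothetical system call identifiers; real numbering is up to you.
enum class Syscall : size_t {
    TEST = 0,
    // WRITE, READ, SLEEP, ... would follow here
};

// Hypothetical test skeleton: a weighted sum, so that swapped or dropped
// arguments change the result and are easy to spot.
static size_t sys_test(size_t p1, size_t p2, size_t p3, size_t p4, size_t p5) {
    return 1 * p1 + 2 * p2 + 3 * p3 + 4 * p4 + 5 * p5;
}

// High-level handler acting as dispatcher. Taking the identifier as the
// sixth parameter is an assumption of this sketch: the assembly entry can
// then call this function without shuffling the argument registers.
extern "C" size_t syscall_handler(size_t p1, size_t p2, size_t p3,
                                  size_t p4, size_t p5, size_t number) {
    switch (static_cast<Syscall>(number)) {
        case Syscall::TEST:
            return sys_test(p1, p2, p3, p4, p5);
        default:
            // Unknown identifier: report an error value instead of crashing.
            return static_cast<size_t>(-1);
    }
}
```

Calling the test system call with the arguments 1, 2, 3, 4, 5 should yield 55; any other result hints at a register mix-up in the stub or the entry code.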

Functionality provided by the System Calls

Implement the following system calls (with reasonable semantics):

size_t write(int fd, const void *buf, size_t len);
size_t read(int fd, void *buf, size_t len);
void sleep(int ms);
int sem_init(int semid, int value);
void sem_destroy(int semid);
void sem_wait(int semid);
void sem_signal(int semid);

Separating each function into a stub (for ring 3) and a skeleton (for ring 0), with the system call handler acting as dispatcher, might improve the structure and readability of your code. It may also be advisable to wrap the write system call in an OutputStream-compatible class to retain the simple and familiar output functionality.
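The wrapper idea could look roughly like the following sketch: characters are collected in a small buffer and handed to write in one go, which keeps the number of system calls low. BufferedWriter, WriteFn and fake_write are hypothetical names; the real OutputStream interface of the course framework is not shown here, and fake_write merely stands in for the ring 3 write stub so that the sketch also runs on a host system.

```cpp
#include <cstddef>

// Signature of the write stub as listed in the assignment.
using WriteFn = size_t (*)(int fd, const void *buf, size_t len);

// Host-side stand-in for the ring 3 write stub (assumption of this sketch):
// it only counts calls and bytes instead of entering the kernel.
static size_t write_calls = 0;
static size_t bytes_written = 0;
static size_t fake_write(int, const void *, size_t len) {
    ++write_calls;
    bytes_written += len;
    return len;
}

// Hypothetical buffering wrapper around a write-like function.
class BufferedWriter {
    char buffer[64];
    size_t used = 0;
    int fd;
    WriteFn write_fn;

public:
    BufferedWriter(int fd, WriteFn write_fn) : fd(fd), write_fn(write_fn) {}

    // Append one character; flush on newline or when the buffer is full.
    void put(char c) {
        buffer[used++] = c;
        if (used == sizeof(buffer) || c == '\n') {
            flush();
        }
    }

    // Stream-style output of a zero-terminated string.
    BufferedWriter &operator<<(const char *s) {
        while (*s != '\0') {
            put(*s++);
        }
        return *this;
    }

    // Hand the buffered bytes to the write function in a single call.
    void flush() {
        if (used > 0) {
            write_fn(fd, buffer, used);
            used = 0;
        }
    }
};
```

A usage such as `BufferedWriter out(1, fake_write); out << "hi" << " there\n";` results in a single write call for the whole line.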

2. Fast System Calls

While interrupt-based system calls are fairly easy to implement, they cause significant overhead, leading to a notable performance degradation. To overcome this issue, lightweight mechanisms like sysenter/sysexit (Intel) and syscall/sysret (AMD) have been introduced. For compatibility reasons, x64 systems use the latter.

Preparation

To prepare your system for those fast system calls, you should start by adjusting the GDT according to the layout required by the Model Specific Register MSR_STAR, as described in Intel's manual in section 5.8.8 "Fast System Calls in 64-Bit Mode". The pointer to a new assembly entry function (calling the system call handler already implemented for the interrupt-based variant) needs to be assigned to MSR_LSTAR. In MSR_SFMASK you can define all bits which should automatically be cleared in the flags register upon executing syscall (bit 9, "interrupt enable", might be an excellent idea). And above all: don't forget to enable the syscall instruction by setting the MSR_EFER_SCE bit in the "Extended Feature Enable Register".
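The values involved can be sketched as follows. The MSR addresses and bit positions are taken from the architecture manuals; the GDT selectors and the helper make_star are assumptions for illustration, and your framework may already define constants with these names in a different form. Writing the registers requires wrmsr in ring 0 and is therefore left out here.

```cpp
#include <cstdint>

// MSR addresses as documented in the architecture manuals.
constexpr uint32_t MSR_EFER   = 0xC0000080;  // Extended Feature Enable Register
constexpr uint32_t MSR_STAR   = 0xC0000081;  // segment selector bases
constexpr uint32_t MSR_LSTAR  = 0xC0000082;  // 64-bit syscall entry address
constexpr uint32_t MSR_SFMASK = 0xC0000084;  // flags cleared on syscall

constexpr uint64_t MSR_EFER_SCE = 1ULL << 0;  // "system call extensions" enable
constexpr uint64_t RFLAGS_IF    = 1ULL << 9;  // interrupt enable flag

// Assumed GDT layout (hypothetical selectors, adapt to your own GDT):
constexpr uint16_t KERNEL_CODE = 0x08;
constexpr uint16_t KERNEL_DATA = 0x10;
constexpr uint16_t USER_DATA   = 0x20;
constexpr uint16_t USER_CODE   = 0x28;  // 64-bit user code segment

// syscall loads CS from STAR[47:32] and SS from STAR[47:32] + 8.
// A 64-bit sysret loads SS from STAR[63:48] + 8 and CS from STAR[63:48] + 16
// (the CPU sets the requested privilege level of both to 3).
constexpr uint64_t make_star(uint16_t syscall_base, uint16_t sysret_base) {
    return (static_cast<uint64_t>(sysret_base) << 48) |
           (static_cast<uint64_t>(syscall_base) << 32);
}

constexpr uint16_t SYSRET_BASE = USER_DATA - 8;

// The fixed offsets are the reason the GDT order matters.
static_assert(KERNEL_DATA == KERNEL_CODE + 8,  "kernel data must follow kernel code");
static_assert(USER_CODE   == SYSRET_BASE + 16, "64-bit user code must follow user data");

constexpr uint64_t STAR_VALUE   = make_star(KERNEL_CODE, SYSRET_BASE);
constexpr uint64_t SFMASK_VALUE = RFLAGS_IF;  // enter the kernel with interrupts off
```

For the assumed layout, STAR_VALUE evaluates to 0x0018000800000000. If the static assertions fail for your GDT, the segment order has to change before syscall/sysret can work.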

Attention
To return from the kernel during a fast system call, you have to use the instruction o64 sysret in NASM (without the o64 only a 32-bit sysret will be performed)!

Passing Arguments

Arguments can be passed by register, similar to the interrupt-based approach. However, the rcx register (4th parameter) is also used by the syscall instruction, which overwrites it with the return address (and r11 with the flags register) – maybe you can work around this issue by switching the parameter to another (scratch) register?

Functionality provided by the System Calls

Each fast system call should provide the same functionality as the corresponding interrupt-based one. To distinguish between them you can prefix them with something like fast_. You probably want to use a custom test function to validate the passing of the maximum number of parameters.

For the extended exercise, you have to introduce an additional system call:

void nop(void);

for both variants – which does exactly what you would expect from its name: no-operation, nothing. Its only purpose is performance analysis (see below).

Benchmark (7.5 ECTS)

Use your benchmark experience gathered in the previous assignment to measure the performance of both variants using the nop system call. Bear in mind that some of the proposed instructions might not work in ring 3.
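A measurement loop for this comparison might be structured like the sketch below. It is deliberately generic: the timestamp source and the operation under test are passed in, since which counter is usable from ring 3 depends on your setup. The name average_ticks is hypothetical.

```cpp
#include <cstdint>

// Hypothetical helper: runs `op` repeatedly and returns the average number
// of ticks per call, as reported by the timestamp source `now`.
// `now` must return a monotonically increasing uint64_t value.
template <typename Clock, typename Op>
uint64_t average_ticks(Clock now, Op op, uint64_t iterations) {
    if (iterations == 0) {
        return 0;  // nothing measured, avoid dividing by zero
    }
    const uint64_t start = now();
    for (uint64_t i = 0; i < iterations; ++i) {
        op();
    }
    const uint64_t end = now();
    return (end - start) / iterations;
}
```

In the OS, op would be the nop stub of the respective variant (interrupt-based or fast), and now a counter that can be read in ring 3. Running the loop once with an empty op gives the loop overhead, which can be subtracted from both results.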