Friedrich-Alexander-Universität Erlangen-Nürnberg  /   Technische Fakultät  /   Department Informatik
Assignment 1: Protection

Preparation

In this course we will extend our StuBS OS (developed during the operating systems lecture's exercises) with common isolation features. A basic StuBS implementation to build upon is provided via GitLab – you only have to choose between the two flavors:

  • OOStuBS (single core)
  • MPStuBS (multi core, required for 7.5 ECTS)

Please note: We strongly recommend using the provided source. Your own StuBS code from last semester quite likely still contains undiscovered bugs/issues and probably lacks some useful features (like a dynamic allocator), so you should avoid extending it and stick to our skeleton 😉.

In contrast to the operating system exercises, you will not receive any updated sources from us (unless we have urgent bug fixes), so it is entirely up to you how you structure your code for the upcoming assignments. However, this also means you will have to stick with that structure until the end of the semester, so try to organize it as well as possible!

User Applications in Ring 3

In this course, you will successively extend your operating system to use modern isolation techniques, fully separating each user process from the kernel and other user processes.

As a first step, you have to modify StuBS so that application code always runs in protection ring 3 and only the handling of interrupts (especially the time slice scheduling interrupts) is performed in ring 0.

Global Descriptor Table

Currently, the Global Descriptor Table (GDT) that StuBS uses in long mode contains entries for the kernel (ring 0) only. To be able to execute code in user mode, you have to extend the GDT with entries for code and data in ring 3. Additionally, you have to introduce a descriptor entry for each Task State Segment (TSS, see below).

Further information and a detailed description of the data structures can be found in the Intel Software Developer’s Manual Volume 3 in section 3.4.5 Segment Descriptors.
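The descriptor layout from the manual can be sketched in a few lines of C++. This is only an illustration under assumptions: the helper names (make_segment, make_selector) and the GDT slot numbers used in the comments are hypothetical and not part of the StuBS skeleton. The sketch builds flat descriptors with the maximum limit and sets the L flag for code segments.

```cpp
#include <cstdint>

// Access byte (descriptor bits 40..47).
constexpr uint64_t ACCESS_PRESENT   = 1u << 7;  // P: segment is present
constexpr uint64_t ACCESS_CODE_DATA = 1u << 4;  // S = 1: code or data segment
constexpr uint64_t TYPE_CODE_READ   = 0xA;      // executable + readable
constexpr uint64_t TYPE_DATA_WRITE  = 0x2;      // writable data

// Flag nibble (descriptor bits 52..55).
constexpr uint64_t FLAG_GRANULARITY = 1u << 3;  // G: limit in 4 KiB units
constexpr uint64_t FLAG_DEFAULT_32  = 1u << 2;  // D/B
constexpr uint64_t FLAG_LONG_MODE   = 1u << 1;  // L: 64-bit code segment

// Hypothetical helper: flat segment (base 0, limit 0xFFFFF) for the given ring.
constexpr uint64_t make_segment(bool code, unsigned dpl) {
    return 0xFFFFull                                        // limit bits 0..15
         | ((ACCESS_PRESENT | (uint64_t(dpl & 3) << 5)      // DPL in bits 45..46
             | ACCESS_CODE_DATA
             | (code ? TYPE_CODE_READ : TYPE_DATA_WRITE)) << 40)
         | (0xFull << 48)                                   // limit bits 16..19
         | ((FLAG_GRANULARITY
             | (code ? FLAG_LONG_MODE : FLAG_DEFAULT_32)) << 52);
}

// A selector is the GDT index shifted by 3, with the requested
// privilege level (RPL) in the lowest two bits.
constexpr uint16_t make_selector(unsigned index, unsigned rpl) {
    return static_cast<uint16_t>((index << 3) | (rpl & 3));
}

// Assumed slot numbers: 3 = user code, 4 = user data.
static_assert(make_segment(true, 3) == 0x00AFFA000000FFFFull, "ring 3 code");
static_assert(make_selector(3, 3) == 0x1B, "user code selector");
```

The only difference between a ring 0 and a ring 3 entry in this sketch is the DPL field; the matching selectors differ in their RPL bits.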

Introduce User/Kernel Stacks

Code running in ring 0 should use its own kernel stack, separate from the user stack. For this, we have to exploit the ancient x86 mechanism for hardware tasks. The TSS structure is, like the GDT, a silent reminder of this originally intended purpose, and nowadays most of its content is there for legacy reasons only. Modern operating systems (like StuBS, of course) use software tasks instead, and the TSS is solely used to keep track of the kernel stack pointer: during a switch from ring 3 to ring 0, the CPU loads the stack pointer from the corresponding value stored in the TSS.

Consequently, each core requires its own TSS (and hence its own TSS descriptor in the GDT, see above): for a single-core system like OOStuBS, one TSS is sufficient, while a multi-core system requires one for each possible core (Core::MAX in MPStuBS). Each core finds its TSS through the task register, which is loaded with the instruction ltr and refers to the TSS via the GDT. The Intel manual describes the TSS structure and the load procedure in Section 7.2 Task Management Data Structures.
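As a rough sketch of the two data structures involved, the following C++ shows the 64-bit TSS layout and the 16-byte descriptor that points to it. The type and function names (TSS64, TSSDescriptor, make_tss_descriptor) are assumptions made for illustration; check the field layout against the manual before relying on it.

```cpp
#include <cstdint>

// 64-bit TSS: only rsp[0] matters for the ring 3 -> ring 0 stack switch.
struct TSS64 {
    uint32_t reserved0;
    uint64_t rsp[3];      // rsp[0]: stack pointer loaded when entering ring 0
    uint64_t reserved1;
    uint64_t ist[7];      // interrupt stack table (unused in this sketch)
    uint64_t reserved2;
    uint16_t reserved3;
    uint16_t iomap_base;  // offset of the I/O permission bitmap
} __attribute__((packed));

static_assert(sizeof(TSS64) == 104, "unexpected TSS size");

// In long mode a TSS descriptor occupies two consecutive GDT slots.
struct TSSDescriptor {
    uint64_t low;
    uint64_t high;
};

// Hypothetical helper: encode base and limit of a TSS into its descriptor.
inline TSSDescriptor make_tss_descriptor(uint64_t base, uint32_t limit) {
    TSSDescriptor d;
    d.low = (limit & 0xFFFFull)                 // limit bits 0..15
          | ((base & 0xFFFFFFull) << 16)        // base bits 0..23
          | (0x89ull << 40)                     // present, type 9: available TSS
          | (((limit >> 16) & 0xFull) << 48)    // limit bits 16..19
          | (((base >> 24) & 0xFFull) << 56);   // base bits 24..31
    d.high = base >> 32;                        // base bits 32..63
    return d;
}
```

With one such descriptor per core, each core would load the selector of its own descriptor with ltr during startup.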

On each task switch in Dispatcher, you have to update the TSS so that it contains the kernel stack pointer of the thread that is about to run; extend Thread / StackPointer accordingly.

Attention
To avoid stack overflows with all their strange behavior, make sure to always set the top of the (kernel) stack in the TSS instead of the current (kernel) stack pointer.
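The reasoning behind this advice: whenever a thread enters the kernel from ring 3, its kernel stack is empty, so the CPU can always start at the top. A minimal sketch of the update, assuming a hypothetical KernelStack type and a TSS reduced to the one field that matters here (neither name comes from the StuBS skeleton):

```cpp
#include <cstddef>
#include <cstdint>

// Reduced stand-in for the TSS: only the ring 0 stack pointer.
struct TssRsp0Only {
    uint64_t rsp0;
};

// Hypothetical per-thread kernel stack.
struct KernelStack {
    static constexpr size_t SIZE = 4096;
    alignas(16) uint8_t memory[SIZE];

    // The stack grows downwards, so its top is one past the last byte.
    uintptr_t top() const {
        return reinterpret_cast<uintptr_t>(memory) + SIZE;
    }
};

// Called on every dispatch, before the next thread continues.
inline void update_tss_for(TssRsp0Only& tss, const KernelStack& next) {
    tss.rsp0 = next.top();  // always the top, never the saved stack pointer
}
```

Storing the saved stack pointer instead would make each kernel entry start below the frames left by the last context switch, so the usable stack would shrink over time.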

Initial Switch to Ring 3

User threads are started by a kickoff function similar to Thread::kickoff. However, the new kickoff has to switch to user mode (ring 3) before it can call the target function (Application::action()). To perform the switch, you exploit the fact that iret returns an interrupted thread to the ring it was running in before: you have to fake an interrupt stack in which the lowest two bits of the segment selector entries (the requested privilege level) specify the desired protection level (ring 3 in your case).

In case you struggle with the structure of the stack layout for interrupts, Section 6.12 Exception and Interrupt Handling of the Intel manual might enlighten you.
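For orientation, the frame that iretq pops in 64-bit mode can be written down as a struct. The following is a sketch under assumptions: the names InterruptFrame and make_user_frame are made up, and the selector values passed in depend on your own GDT layout. The actual switch (pushing these five values in reverse order and executing iretq) needs a few lines of assembly that are not shown.

```cpp
#include <cstdint>

// Order in memory from the lowest address upwards, as popped by iretq.
struct InterruptFrame {
    uint64_t rip;     // where execution continues
    uint64_t cs;      // code segment selector, RPL selects the ring
    uint64_t rflags;  // flags to restore
    uint64_t rsp;     // stack pointer to restore
    uint64_t ss;      // stack segment selector
};

static_assert(sizeof(InterruptFrame) == 40, "five 64-bit slots");

constexpr uint64_t RFLAGS_ALWAYS_ONE = 1u << 1;  // reserved bit, always set
constexpr uint64_t RFLAGS_INTERRUPT  = 1u << 9;  // IF: interrupts enabled

// Hypothetical helper: build the fake frame for the first entry to ring 3.
inline InterruptFrame make_user_frame(uint64_t entry, uint64_t user_stack_top,
                                      uint16_t user_cs, uint16_t user_ss) {
    InterruptFrame f;
    f.rip    = entry;
    f.cs     = user_cs | 3u;  // force RPL = 3
    f.rflags = RFLAGS_ALWAYS_ONE | RFLAGS_INTERRUPT;
    f.rsp    = user_stack_top;
    f.ss     = user_ss | 3u;  // force RPL = 3
    return f;
}
```

Setting IF in the restored flags matters here: without it the user thread would run with interrupts disabled and could never be preempted by the timer.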

Attention
Please be aware that we still need kernel threads running on ring 0 (e.g., IdleThread)!

Also note that privileged instructions are not permitted in ring 3. While this may sound obvious, it is not always obvious where our code uses such instructions: kout uses the hardware cursor, which is accessed through IOPort (inb and outb), and these instructions are allowed in ring 0 only (in ring 3 they will leave you with a general protection fault (GPF)). To avoid these obstacles, it is sufficient for your Application to just print to a custom TextStream (without hardware cursor) in an endless loop (without calling any additional functions like GuardedBell::sleep()).

At some point (after you have fixed all the GPFs), you might find yourself asking "Am I already/really in user mode now...?". There is an easy way to answer this question: read the code segment register (cs, using inline assembly) and check its lowest two bits. If both are set, you are (finally) in ring 3!
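A possible version of this check, assuming GCC or Clang inline assembly on x86; the function names are hypothetical:

```cpp
#include <cstdint>

// The lowest two bits of a selector hold the privilege level (0..3).
constexpr unsigned privilege_level(uint16_t selector) {
    return selector & 3u;
}

#if defined(__x86_64__) || defined(__i386__)
// Reading cs is not privileged, so this also works in ring 3.
inline uint16_t read_cs() {
    uint16_t cs;
    asm volatile("mov %%cs, %0" : "=r"(cs));
    return cs;
}
#endif
```

Inside the application, privilege_level(read_cs()) == 3 would then confirm that the thread runs in user mode.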

CPU Core Local Storage (required in MPStuBS)

In a way, StuBS already has core-local variables: for example, the epilogue queue in Guard consists of an array with one entry per core and calls Core::getID() (which itself uses LAPIC::getID() in conjunction with a look-up table) on each access. A more advanced approach employs the currently unused gs segment register (the other extra segment register, fs, is reserved for thread-local storage according to the System V ABI (Section 10.3)): on each core, the gs base (set via Core::MSR) points to a different memory region, and all of these regions have the same structure (for example implemented as a struct array with Core::MAX elements, ideally cache-line aligned). Access an element in assembly with gs:OFFSET, where OFFSET is the byte position of the required element in the struct (the GCC intrinsic __builtin_offsetof can be useful).
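One way this could look in C++ is sketched below. Everything named here (PerCoreData, per_core, init_per_core, get_id_via_gs, the value of MAX_CORES) is an assumption for illustration. Writing the returned address to the gs base MSR is left out because the exact Core::MSR interface is not described here.

```cpp
#include <cstddef>
#include <cstdint>

constexpr unsigned MAX_CORES = 8;  // stands in for Core::MAX

// One cache line per core avoids false sharing between cores.
struct alignas(64) PerCoreData {
    uint64_t core_id;
    void*    current_thread;  // hypothetical second member
};

static PerCoreData per_core[MAX_CORES];

// Fill the slot of one core and return the address that this core
// would have to install as its gs base.
inline uintptr_t init_per_core(unsigned id) {
    per_core[id].core_id = id;
    per_core[id].current_thread = nullptr;
    return reinterpret_cast<uintptr_t>(&per_core[id]);
}

#if defined(__x86_64__)
// Only valid once the gs base of the current core has been set up.
inline uint64_t get_id_via_gs() {
    uint64_t id;
    asm volatile("movq %%gs:%c1, %0"
                 : "=r"(id)
                 : "i"(offsetof(PerCoreData, core_id)));
    return id;
}
#endif
```

The read compiles to a single gs-relative load, so it needs neither a LAPIC access nor a table look-up.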

Since an application in user mode can access gs as well, we want to have a separate value for the kernel, which is swapped in on each ring switch using the assembly instruction swapgs. The currently active base is stored in the Model Specific Register MSR_GS_BASE, and swapgs exchanges it with the value in MSR_SHADOW_GS_BASE.

Note
Make sure to only swap the segment on an actual ring switch (an interrupt while executing code in ring 0 should not swap the segment!).
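A common way to decide this is to look at the code segment selector that the CPU saved on the stack when the interrupt occurred. The sketch below expresses the decision in C++ with a hypothetical function name; in practice the test has to happen in the assembly entry and exit paths, before any gs-relative access.

```cpp
#include <cstdint>

// The saved cs of the interrupted context tells us where we came from:
// a privilege level other than 0 means the interrupt hit user code.
constexpr bool came_from_user_mode(uint64_t saved_cs) {
    return (saved_cs & 3u) != 0;
}

// Equivalent test in an entry stub for a vector without error code
// (the saved cs sits directly above the saved rip):
//
//     testb $3, 8(%rsp)
//     jz    1f          # came from ring 0: keep the kernel gs base
//     swapgs            # came from ring 3: switch to the kernel gs base
// 1:
```

The same test has to guard the swapgs before iretq, so that every swap on entry is undone exactly once on exit.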

For this exercise, you have to implement a getID() function using this technique, which should return the same value as Core::getID(). Measure the performance of both functions with the TSC, following Intel's How to Benchmark Code Execution Times, both in an emulator (Qemu/KVM) and on real bare-metal hardware.

Attention
Don't forget to use CPUID to check if required instructions like rdtscp are available on the current (virtual) hardware!
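A small measurement harness could look like the following sketch, assuming GCC or Clang on x86-64 with the compiler headers cpuid.h and x86intrin.h available. The names has_rdtscp and min_cycles are hypothetical. Note that this sketch fences with LFENCE, whereas Intel's paper serializes with CPUID; treat that as a simplification, not as the paper's exact sequence.

```cpp
#include <cstdint>
#include <cpuid.h>      // __get_cpuid (GCC/Clang)
#include <x86intrin.h>  // __rdtsc, __rdtscp

// CPUID leaf 0x80000001 reports RDTSCP support in EDX bit 27.
inline bool has_rdtscp() {
    unsigned eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (__get_cpuid(0x80000001u, &eax, &ebx, &ecx, &edx) == 0)
        return false;  // extended leaf not available
    return ((edx >> 27) & 1u) != 0;
}

inline void fence() {
    asm volatile("lfence" ::: "memory");
}

// Run work() repeatedly and return the smallest observed cycle count.
// The result still contains the overhead of the measurement itself.
template <typename F>
uint64_t min_cycles(F&& work, unsigned iterations) {
    uint64_t best = UINT64_MAX;
    for (unsigned i = 0; i < iterations; ++i) {
        unsigned aux;
        fence();
        const uint64_t start = __rdtsc();
        fence();
        work();
        const uint64_t end = __rdtscp(&aux);  // waits for earlier instructions
        fence();
        if (end - start < best)
            best = end - start;
    }
    return best;
}
```

To compare the two implementations, you could pass a lambda that calls Core::getID() and one that calls your new getID(), and also measure an empty lambda to estimate the overhead of the harness.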

After a successful verification, you can modify the PerCore wrapper to use the new getID(), which will slightly improve the performance of Guard and Scheduler (however, the impact might not really be noticeable to the user).