# A. Preliminaries

These are lecture notes for PB152. Everything that you need to know to pass the subject (excluding the seminar, which is a separate subject – PB152cv) is contained in this document. The lectures cover the same material in a different format (the slides which you see in this document are the slides used in the lectures).

│ Tests and Exam
│
│  • 3 short interim tests (4 lectures each)
│    ◦ 2 of the review questions per lecture (8 total)
│    ◦ must pass 2 out of 3 tests (otherwise X)
│  • exam at the end
│    ◦ part 1: review questions
│    ◦ part 2: in-depth understanding
│  • all tests are one-way (no revision of answers)

To pass the subject, you must pass at least 2 of the 3 interim tests that will be made available in the IS. Each of those tests will cover 4 lectures – you will get 2 of the review questions (which are listed at the end of each lecture) for each of them. You are allowed one mistake (i.e. you must answer 7 out of the 8 questions correctly for the test to count as passed). If you fail 2 of the 3 tests, you will be graded X.

If you pass 2 (or all 3) tests, you can take the exam, which will have two parts. The first will be like the interim tests, only there will be 12 questions on it, one from each lecture. You are likewise allowed to make 1 mistake. If you pass, you proceed to the second part, which is more about in-depth understanding, and which will decide your final grade. You must pass both parts of the exam together – if you fail one, you need to retake both.

All tests (i.e. the interim tests and both parts of the exam) will be done in the IS and will show you one question at a time, which you must answer before you can look at the next question. You won't be able to revisit answers that you have already submitted. The training tests will be like this too, so that you can get a feel for how best to allocate your time.
│ Interim Tests
│
│  • 8 questions, 15 minutes, at most 1 mistake
│  • training test 1 week prior, unlimited attempts
│  • when – between 16:00 and 16:30 on:
│    ◦ 9th of April
│    ◦ 7th of May
│    ◦ 4th of June

The interim tests will be held on Friday afternoons. You will have 4 weeks to study the corresponding 4 lectures, then one week to review (with the help of a training test, if you like). The training test will open on the Friday prior and will stay open until the interim test starts. You can take the training test as many times as you like. You will get an email reminder about each interim test one week prior (at the same time the training test is published) and in the morning on the day of the test.

│ Exam
│
│  • part 1: 12 review questions, 20 minutes
│    ◦ 10 or fewer = F, 11+ = go to part 2
│  • part 2: 12 questions, 90 minutes
│    ◦ assess the truth of more complex statements
│    ◦ +1 / -0.5 points per question
│    ◦ 6+ = E, 7+ = D, 8+ = C, 9+ = B, 10+ = A
│  • training tests in May, unlimited attempts

In May, a training version of both parts will be made available. You will be able to take either any number of times, though the selection of questions (and answers) will be limited, compared to the real test.

In the second part, each question will present 4 statements (each a short paragraph) about some aspect of operating systems. There will be two possible scenarios: either

 • 3 statements are false and 1 is true (in which case you select the true one), or
 • 3 statements are true and 1 is false (in which case you select the false one).

All tests and exams will be in Czech, with each technical term also appearing in English (in brackets). If you want to take the exam in English, contact me by 19th of March at the latest.
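The part 2 scoring rule from the Exam slide (+1 / -0.5 points per question, with the listed grade thresholds) can be worked through with a short sketch. The function names `part2_points` and `part2_grade` are made up for this illustration. Two assumptions go beyond the slide: a score below 6 points is treated as F, and every one of the 12 questions is answered, so each one is either correct or wrong.

```python
def part2_points(correct: int, wrong: int) -> float:
    """Points under the +1 / -0.5 rule from the Exam slide."""
    return correct * 1.0 - wrong * 0.5


def part2_grade(points: float) -> str:
    """Map points to a grade using the thresholds from the Exam slide."""
    # Thresholds as listed on the slide: 6+ = E, 7+ = D, 8+ = C, 9+ = B, 10+ = A.
    for threshold, grade in [(10, "A"), (9, "B"), (8, "C"), (7, "D"), (6, "E")]:
        if points >= threshold:
            return grade
    # Assumption: fewer than 6 points means F (the slide does not say so explicitly).
    return "F"


if __name__ == "__main__":
    # Assumption: all 12 questions are answered, so wrong = 12 - correct.
    for correct in range(12, 6, -1):
        wrong = 12 - correct
        points = part2_points(correct, wrong)
        print(correct, wrong, points, part2_grade(points))
```

Under these assumptions, 8 correct answers out of 12 is the minimum for a passing grade in part 2: 8 points gained minus 2 points lost gives exactly 6 points.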
│ Seminars
│
│  • a separate, optional course (code «PB152cv»)
│  • covers operating systems from a practical perspective
│  • get your hands on the things we'll talk about here
│  • offers additional practice with C programming

The seminar is a good way to gain some practical experience with operating systems. It will mainly focus on the interaction of programs with OS services, but will also cover user-level issues like virtualisation, OS installation and shell scripting.

│ Study Materials
│
│  • «lecture notes» are the main study text
│    ◦ these include the slides used in the lecture
│    ◦ mostly self-contained with minimal dependencies
│    ◦ English version already available in the IS
│    ◦ translated chapters will be published weekly
│  • «lecture recordings»
│    ◦ from the 2019 run, as supplementary material

You are reading the lecture notes for the course PB152 Operating Systems. This is your main study resource; it is based on the lecture slides, but with additional details that would not fit into the slide format. These lecture notes should be self-contained in the sense that they only rely on knowledge you have from other courses, like PB150 Computer Systems (or PB151) or PB071 Principles of Low-Level Programming. Likewise, being familiar with the topics covered in these lecture notes is sufficient to pass the exam.

│ Books
│
│  • there are a few good OS books
│  • you are encouraged to get and read them
│
│  • A. Tanenbaum: Modern Operating Systems
│  • A. Silberschatz et al.: Operating System Concepts
│  • L. Skočovský: Principy a problémy OS UNIX
│  • W. Stallings: Operating Systems, Internals and Design
│  • many others, feel free to explore

The books mentioned here usually cover a lot more ground than is possible to include in a single-semester course. The study of operating systems is, however, very important in many sub-fields of computer science, and also in most programming disciplines. Spending extra time on this topic will likely be well worth the effort.
│ Topics
│
│  1. Anatomy of an OS
│  2. System Libraries and APIs
│  3. The Kernel
│  4. File Systems
│  5. Basic Resources and Multiplexing
│  6. Concurrency and Locking

In the first half of the semester, we will deal with the basic components and abstractions used in general-purpose operating systems. The first lecture will simply give an overview of the entire OS and will attempt to give you an idea of how the parts fit together. In the second lecture, we will cover the basic programming interfaces offered by the OS, mainly through system libraries.

│ Topics (cont'd)
│
│  7. Device Drivers
│  8. Network Stack
│  9. Command Interpreters & User Interfaces
│  10. Users and Permissions
│  11. Virtualisation & Containers
│  12. Special-Purpose Operating Systems

The second half of the semester will start with device drivers, which form an important part of operating systems in general, since they mediate communication between application software and hardware peripherals connected to the computer. In a similar fashion, the network stack allows programs to communicate with other computers (and software running on those other computers) that are attached to a computer network.

│ Related Courses
│
│  • PB150/PB151 Computer Systems
│  • PB153 Operating Systems and their Interfaces
│  • PA150 Advanced OS Concepts
│  • PV062 File Structures
│  • PB071 Principles of Low-Level Programming
│  • PB173 Domain-specific Development in C/C++

There are a number of courses that overlap with this one, provide prerequisite knowledge or extend what you will learn here. The list above is incomplete. The course PB153 is an alternative to this course. Most students are expected to take PB071 in parallel with this course, even though knowledge of C won't be required for the theory we cover. However, C basics will be needed for the optional seminar (PB152cv).
│ Organisation of the Semester
│
│  • generally, «one lecture = one topic»
│  • there will most likely be 13 lectures
│  • the 13th lecture will be review
│  • «3 interim tests»: 2.4., 30.4., 28.5.

# B. Semester Overview

This section gives a high-level overview of the topics that will be covered in individual lectures. Think of it as an extended table of contents, or as a collection of abstracts, one for each of the upcoming lectures.

│ 2 System Libraries and APIs
│
│  • «POSIX»: Portable Operating System Interface
│  • UNIX: (almost) everything is a «file»
│  • the least common denominator of programs: C
│  • user view: objects, archives, shared libraries
│  • «compiler», linker

System libraries and their APIs provide the most direct access to operating system services. In the second lecture, we will explore how programs access those services and how the system libraries tie into the C programming language.

We will also deal with the basic artifacts that make up programs (object files, archive files, shared libraries) and with how those come about: how we go from a C source file all the way to an executable file through compilation and linking.

Throughout this lecture, we will use POSIX as our go-to source of examples, since it is the operating system interface that is most widely implemented. Moreover, there is an abundance of documentation and resources both online and offline.

│ 3 The Kernel
│
│  • «privileged» CPU mode
│  • the boot process
│  • boundary enforcement
│  • kernel designs: micro, «mono», exo, ...
│  • «system calls»

In the third lecture, we will focus on the kernel, arguably the most important (and often the most complicated) part of an operating system. We will start from the beginning, with the boot process: how the kernel is loaded into memory, initialises the hardware and starts the user-space components (that is, everything that is not the kernel) of the operating system.
We will then talk about boundary enforcement: how the kernel polices user processes so they cannot interfere with each other, or with the underlying hardware devices. We will touch on how this enforcement makes it possible for multiple users to share a single computer without infringing on each other (or at least limits any such infringement).

Another topic of considerable interest will be how kernels are designed and what is and what isn't part of the kernel proper. We will explore some of the trade-offs involved in the various designs, especially with regard to security and correctness vs performance.

Finally, we will look at the «system call» mechanism, which is how user-space communicates with the kernel and requests various low-level operating system services.

│ 4 File Systems
│
│  • why and how
│  • abstraction over shared «block storage»
│  • directory «hierarchy»
│  • everything is a file revisited
│  • i-nodes, directories, hard & soft links

Next up are file systems, which are a very widely used abstraction on top of persistent block storage, which is what hardware storage devices provide. We will ask ourselves, first of all, why file systems are important and why they are so pervasively implemented in operating systems, and then we will look at how they work on the inside.

In particular, we will explore the traditional UNIX file system, which offers important insights about the architecture of the operating system as a whole, and about important aspects of the POSIX file semantics.

│ 5 Basic Resources and Multiplexing
│
│  • «virtual memory», processes
│  • sharing CPUs & «scheduling»
│  • processes vs threads
│  • «interrupts», clocks

One of the basic roles of the operating system is management of various resources, starting with the most basic: the CPU cores and the RAM. Since those resources are very important to every process or program, we will spend the entire lecture on them.
In particular, we will look at the basic units of resource assignment: threads for the CPU and processes for memory. We will also look at the mechanisms used by the kernel to implement assignment and protection of those resources, namely the virtual memory subsystem and the scheduler.

│ 6 Concurrency and Locking
│
│  • inter-process «communication»
│  • accessing «shared resources»
│  • mutual exclusion
│  • «deadlocks» and deadlock prevention

Scheduling and slicing of CPU time is closely related to another important topic that pervades operating system design: concurrency. We will take a high-level, introductory look at this topic, since the details are often complicated, architecture-specific and require a deep understanding of both hardware (SMP, cache hierarchies) and of kernels.

│ 7 Device Drivers
│
│  • user vs kernel drivers
│  • interrupts &c.
│  • GPU
│  • PCI &c.
│  • block storage
│  • network devices, wifi
│  • USB
│  • bluetooth

One of the fundamental roles of an operating system is to mediate access to hardware devices. Some of the code that provides hardware access deals mainly with the software interfaces and APIs – this is known as hardware abstraction. However, to make this abstraction work, there is often a large amount of device-specific (or at least device-class-specific) ‘glue’ – also known as device drivers.

One of the important questions will be the interplay between processor-level protections and direct hardware access and what this means for drivers. We will see that for some (but not all) types of hardware, only privileged programs (either inside the kernel, or close to it) can reasonably mediate between the hardware itself and the higher levels of the system (hardware abstraction layer, application software, etc.).
│ 8 Network Stack
│
│  • TCP/IP
│  • name resolution
│  • socket APIs
│  • firewalls and packet filters
│  • network file systems

While there is a dedicated course about networking, we will spend one of our lectures talking about networks: in modern operating systems, networking is an integral part of the package and networking considerations often influence other parts of the system.

We will look at the ubiquitous TCP/IP stack, how it integrates into an operating system and which APIs applications can use to take advantage of network services. We will also touch on network-related functionality that is often deeply integrated into operating systems: packet filtering and network file systems.

│ 9 Command Interpreters & User Interfaces
│
│  • «interactive» systems
│  • history: consoles and terminals
│  • «text-based» terminals, RS-232
│  • bash and other Bourne-style shells, POSIX
│  • «graphical»: X11, Wayland, OS X, Windows, Android, iOS

The next lecture will focus on human-computer interaction, which is clearly a central aspect of the experience of using a computer and is therefore an important part of most general-purpose operating systems. Even computers that do not directly (physically) interact with humans usually present some form of an interface, usually mediated over the network.

We will first look at ‘traditional’ text-based interfaces, which are still in common use among system and network engineers and computer programmers, but we will also look in some depth at the graphics stacks that power modern devices (up to and including smartphones).
│ 10 Users and Permissions
│
│  • «multi-user» systems
│  • «isolation», ownership
│  • file system «permissions»
│  • capabilities

There are two important use-cases for computers (and hence operating systems) in which higher-level access control and permission management matter. The first, more traditional case is a single computer shared by multiple users. The second, more modern case arises whenever we execute untrusted or semi-trusted programs on our devices (think application permissions on smartphones, web pages that execute JavaScript on your laptop and so on).

│ 11 Virtualisation & Containers
│
│  • resource multiplexing redux
│  • «isolation» redux
│  • multiple kernels on a single system
│  • type 1 and type 2 «hypervisors»
│  • ‹virtio›

A computer, along with its operating system, is a natural ‘unit’ of computation resources – it conveniently packages up the resources themselves, with a software stack and configuration. Unfortunately, computers – being physical devices – are somewhat inflexible and unwieldy: they have to be procured, placed in racks in air-conditioned rooms, attached to a power source, to each other and to the larger network. Their physical components are prone to wear and failure, and need to be replaced or repaired regularly.

Virtualisation makes it possible to detach the logical aspects of a computer – its installed software, data storage and configuration – from the physical box. This improves hardware utilisation, decouples hardware maintenance from software aspects and makes everyone's life easier (most of the time, anyway).

In this lecture, we will peek under the hood of modern hypervisor-based virtual machines and look at how they are implemented in the current generation of operating systems.
│ 12 Special-Purpose Operating Systems
│
│  • general-purpose vs special-purpose
│  • «embedded» systems
│  • «real-time» systems
│  • high-assurance systems (seL4)

Throughout most of the course, we will have talked about general-purpose operating systems: those that run on personal computers and servers. The last lecture will be dedicated to more specialised systems: those that run in washing machines, on satellites or on the Mars rovers. We will also briefly cover high-assurance systems, which focus on extremely high reliability and/or security.