# Access Control

This lecture will focus on basic security considerations in an operating system, with focus on file systems, which are typically the most visible instance of access control in an OS.

│ Lecture Overview
│
│ 1. Multi-User Systems
│ 2. File Systems
│ 3. Sub-user Granularity

We will first look at the motivation and implementation of «users», the basic unit of ownership and access control in an operating system. We will also look at some consequences and some applications of multi-user computing, and discuss how access control is implemented and enforced.

In the second part, we will focus on the canonical case study in access control: file systems. Finally, the last part will explore what happens when per-user access control is not sufficient and we need a more fine-grained permission system.

## Multi-User Systems

Multi-user systems had been the norm until the rise of personal computers circa the mid-1980s: earlier computers were too expensive and too bulky to be allocated to a single person. Instead, such systems used some form of multi-tenancy, whether implemented administratively (batch systems) or by the operating system (interactive, terminal-based computers).

│ Users
│
│ • originally a proxy for «people»
│ • currently a more «general abstraction»
│ • user is the unit of «ownership»
│ • many «permissions» are user-centered

The concept of a «user» has evolved from the need to keep separate accounts for distinct people (the eponymous users of the system). In modern systems, a «user» continues to be an abstraction that includes accounts for individual humans, but also covers other needs. Essentially, a «user» is a unit of ownership, and of access control.

│ Computer Sharing
│
│ • a computer is an (often costly) «resource»
│ • efficiency of use is a concern
│ ◦ a single user rarely exploits a computer fully
│ • data sharing makes access control a necessity

While efficient resource usage is what drove multi-tenancy of computer systems, it is the global shared file system that drove the requirement for access control: users do not necessarily wish to trust all other users of the system with access to their files.

│ Ownership
│
│ • various «objects» in an OS can be «owned»
│ ◦ primarily «files» and «processes»
│ • the owner is typically whoever «created» the object
│ ◦ though ownership can be «transferred»
│ ◦ restrictions usually apply

The standard model of access control in operating systems revolves around «ownership» of «objects». Generally speaking, ownership of an object confers both rights (to manipulate the object) and obligations (owned objects count towards quotas). Depending on circumstances, object ownership may be transferred, either by the original owner, or by system administrators.

│ Process Ownership
│
│ • each «process» belongs to some user
│ • the process acts «on behalf» of the user
│ ◦ the process gets the same privileges as its owner
│ ◦ this both «constrains» and «empowers» the process
│ • processes are «active» participants

Perhaps the most important ownership relationship is the one between users and their processes. This is because processes execute code on behalf of the user, and all actions a user takes on a system are mediated by some process or another. In this sense, processes act on behalf of their owner and the actions they perform are subject to any restrictions which apply to the user in question.
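Concretely, every process can ask the kernel for the identity it runs under. The following is a minimal sketch using standard POSIX calls; the difference between the real and the effective user will become relevant with ‹setuid› programs, covered later.

```c
/* Every process carries the identity of the user on whose behalf it
 * acts; the kernel consults this identity on permission checks. */
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    printf("real uid: %d, effective uid: %d\n",
           (int) getuid(), (int) geteuid());
    return 0;
}
```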
│ File Ownership
│
│ • each «file» also belongs to some user
│ • this gives «rights» to the «user» (or rather their processes)
│ ◦ they can «read» and «write» the file
│ ◦ they can «change permissions» or ownership
│ • files are «passive» participants

Like processes, files are objects which are subject to ownership. However, unlike processes, files are passive: they do not perform any actions. Hence, in this case, ownership simply gives the owner certain rights to perform actions on the file (most importantly, to change the access control rights pertaining to that file).

│ Access Control Models
│
│ • «owners» usually decide who can access their objects
│ ◦ this is known as «discretionary» access control
│ • in high-security environments, this is not allowed
│ ◦ known as «mandatory» access control
│ ◦ a central authority decides the policy

There are two main approaches to access control: the common «discretionary» model, where owners decide who can interact with their files (or other objects, as applicable), and «mandatory», in which users are not trusted with matters of security, and decisions about access control are placed in the hands of a central authority. In both cases, the operating system grants (or denies) access to objects based on an «access control policy»; however, only in the latter case can this policy be thought of as a coherent, self-contained document (as opposed to a collection of rules decided by a number of uncoordinated users).

│ (Virtual) System Users
│
│ • users are a useful ownership «abstraction»
│ • various system services get their own ‘fake’ users
│ • this allows them to «own files» and «processes»
│ • and also «limit» their «access» to the rest of the OS

Users have turned out to be a really useful abstraction. It is common practice that services (whether system- or application-level) run under special users of their own. This means that these services can own files and other resources, and run processes under their own identity. Additionally, it means that those services can be restricted using the same mechanisms that apply to ‘normal’ users.

│ Principle of Least Privilege
│
│ • entities should have the «minimum» privilege required
│ ◦ applies to «software» components
│ ◦ but also to «human» users of the system
│ • this «limits» the scope of «mistakes»
│ ◦ and also of security compromises

The «principle of least privilege» is an important maxim for designing secure systems: it tells us that, regardless of the subject and object combination, permissions should only be granted where there is genuine need for the subject to manipulate the particular object. The rationale is that mistakes happen, and when they do, we would rather limit their scope (and hence the damage): mistakes cannot endanger objects which are inaccessible to the culprit.

│ Privilege Separation
│
│ • different parts of a system need different privileges
│ • least privilege dictates «splitting» the system
│ ◦ components are «isolated» from each other
│ ◦ they are given only the rights they need
│ • components «communicate» using very simple IPC

An important corollary of the principle of least privilege is the design pattern known as «privilege separation». Systems which follow it are split into a number of independent components, each serving a small, well-defined and, security-wise, self-contained function. Each of these modules can then be isolated in their own little sandbox and communicate with the rest of the system through narrowly defined interfaces (usually built on some form of inter-process communication).
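Returning for a moment to file ownership from the beginning of this part: ownership and permissions are ordinary file metadata, and the sketch below (plain POSIX ‹stat›) reads them back for a path given on the command line.

```c
/* Ownership and permissions are recorded in the file's metadata (on
 * UNIX file systems, in the i-node); stat() retrieves them. */
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

int main(int argc, char **argv)
{
    struct stat st;

    if (argc != 2)
    {
        fprintf(stderr, "usage: %s <path>\n", argv[0]);
        return 1;
    }

    if (stat(argv[1], &st) == -1)
    {
        perror("stat");
        return 1;
    }

    printf("owner uid: %d, group gid: %d, permissions: %03o\n",
           (int) st.st_uid, (int) st.st_gid,
           (unsigned) (st.st_mode & 07777));
    return 0;
}
```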
│ Process Separation
│
│ • recall that each process runs in its own «address space»
│ ◦ «shared memory» must be explicitly requested
│ • each «user» has a view of the «filesystem»
│ ◦ a lot more is shared by default in the filesystem
│ ◦ especially the «namespace» (directory hierarchy)

There is not much need for access control of memory: each process has its own and cannot see the memory of any other process (with small, controlled exceptions created through mutual consent of the two processes). The file system is, however, very different: there is a global, shared namespace that is visible to all users and all processes. Moreover, many of the objects (files) are «meant» to be shared, in a rather ad-hoc fashion, either through ‘well-known’ paths (this being the case with many system files) or through passing paths around. Importantly, paths are «not» any sort of access token, and in almost all circumstances, withholding a path does not prevent access to the object (paths can be easily discovered).

│ Access Control Policy
│
│ • there are 3 pieces of information
│ ◦ the «subject» (user)
│ ◦ the «action»/«verb» (what is to be done)
│ ◦ the «object» (the file or other resource)
│ • there are many ways to «encode» this information

We mentioned earlier that the totality of the rules which decide which actions are allowed, and which disallowed, is known as an «access control policy». In the abstract, it is a rulebook which answers questions of the form ‘Is (subject) allowed to perform (action) on (object)?’ There are clearly many different ways in which this rulebook can be encoded: we will look at some of the most common strategies later.

│ Access Rights Subjects
│
│ • in a typical OS those are (possibly virtual) «users»
│ ◦ sub-user units are possible (e.g. programs)
│ ◦ «roles» and «groups» could also be subjects
│ • the subject must be «named» (names, identifiers)
│ ◦ easy on a single system, «hard» in a «network»

The most common access control «subjects» (at least when it comes to access policy «specification») are, as was already hinted at, «users», whether ‘real’ (those that stand in for people) or virtual (those that stand for services). In most circumstances, it must be possible to «name» the subjects, so that it is possible to refer to them in rules. Sometimes, however, rules can be directly attached to subjects, in which case there is no need for these subjects to have stable identifiers attached.

│ Access Rights Actions (Verbs)
│
│ • the available ‘verbs’ (actions) depend on «object» type
│ • a typical object would be a «file»
│ ◦ files can be «read», «written», «executed»
│ ◦ «directories» can be «searched» or «listed» or «changed»
│ • network connections can be established &c.

The particular choice of actions depends on the object type: each such type has a fixed list of actions, which correspond to operations, or variants of operations, that the operating system offers through its interfaces. The actions may be affected by the policy directly or indirectly – for instance, the «read» permission on a file is not enforced at the time a ‹read› call is performed: instead, it is checked at the time of ‹open›, with the provision that ‹read› can only be used on file descriptors that are «open for reading». That is, the program is required to indicate, at the time of ‹open›, whether it wishes to read from the file.
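A small sketch to illustrate this: a denial surfaces as an ‹EACCES› error from ‹open›, and the ‹read› call itself is never reached. The path is just an example of a file that ordinary users typically cannot read.

```c
/* The read permission is checked at open() time: if the policy does
 * not allow us to read the file, open() fails with EACCES, and no
 * read() will ever be attempted on it. */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    int fd = open("/etc/shadow", O_RDONLY); /* readable by root only */

    if (fd == -1 && errno == EACCES)
        printf("the policy denies us read access\n");
    else if (fd != -1)
        close(fd);
    return 0;
}
```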
│ Access Rights Objects
│
│ • anything that can be «manipulated» by «programs»
│ ◦ although not everything is subject to access control
│ • could be «files», «directories», «sockets», shared «memory», ...
│ • object «names» depend on their type
│ ◦ file paths, i-node numbers, IP addresses, ...

Like subjects, objects need to have names, unless the pieces of policy relevant to them are directly attached to the objects themselves. However, in the case of objects, this direct attachment is much more common: it is rather typical that an i-node embeds permission information.

│ Subjects in POSIX
│
│ • there are 2 types of «subjects»: «users» and «groups»
│ • each «user» can belong to «multiple groups»
│ • users are split into «normal» users and ‹root›
│ ◦ ‹root› is also known as the «super-user»

In POSIX systems, there are two basic types of subjects that can appear in the access control policy: users and groups. Since POSIX only covers access control for the file system, objects do not need to be named: their permissions are attached to the i-node. A special user, known as ‹root›, represents the system administrator (also known as the super-user). This account is not subject to permission checking. Additionally, there are a number of actions (usually not attached to particular objects) which only the ‹root› user can perform (e.g. rebooting the computer).

│ User and Group Identifiers
│
│ • users and groups are represented as «numbers»
│ ◦ this improves «efficiency» of many operations
│ ◦ the numbers are called ‹uid› and ‹gid›
│ • those numbers are valid on a «single computer»
│ ◦ or at most, a local network

In the access control policy, users and groups are identified by numbers (each user and each group getting a small, locally unique integer). Since these identifiers have a fixed size, they can be stored very compactly in i-nodes, and can also be very efficiently compared, both of which have been historically important considerations. Besides efficiency, the numeric identifiers also make the layout of data structures which carry them simpler, reducing the scope for bugs.

│ User Management
│
│ • the system needs a «database» of «users»
│ • in a network, user «identities» often need to be «shared»
│ • could be as simple as a «text file»
│ ◦ ‹/etc/passwd› and ‹/etc/group› on UNIX systems
│ • or as complex as a distributed database

The user database serves two basic roles: it tells the system which users are authorized to access the system (more on this later), and it maps between human-readable user names and the numeric identifiers that the system uses internally. In local networks, it is often desirable that all computers have the same idea about who the users are, and that they use the same mapping between their names and id's. LDAP and Active Directory are popular choices for centralised network-level user databases.

│ Changing Identities
│
│ • each «process» belongs to a particular «user»
│ • ownership is «inherited» across ‹fork()›
│ • «super-user» processes can use ‹setuid()›
│ • ‹exec()› can sometimes change a process owner

Recall that all processes are created using the ‹fork› system call, with the exception of ‹init›. When a process forks, the child process inherits the ownership of the parent, that is, it belongs to the same user as the parent does (whose ownership is not affected by ‹fork›). However, if a process is owned by the super-user, it can change its owner by using the ‹setuid› system call.
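A minimal sketch of the call in action (the target uid 1000 is a made-up example):

```c
/* A super-user process giving up its identity: after a successful
 * setuid(), the process permanently belongs to the new user, and so
 * will any children it forks from that point on. */
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    if (setuid(1000) == -1) /* fails unless we are the super-user */
    {
        perror("setuid");
        return 1;
    }

    /* from here on, we act (and are constrained) as uid 1000 */
    printf("now running as uid %d\n", (int) getuid());
    return 0;
}
```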
Additionally, ‹exec› can sometimes change the owner of the process, via the so-called ‹setuid› bit (not to be confused with the system call of the same name). The ‹init› process is owned by the super-user.

│ Login
│
│ • a super-user process manages «user logins»
│ • the user types in their name and «password»
│ ◦ the ‹login› program «authenticates» the user
│ ◦ then calls ‹setuid()› to change the process owner
│ ◦ and uses ‹exec()› to start a shell for the user

You may recall that at the end of the boot process, a ‹login› process is executed to allow users to authenticate themselves and start a session. The traditional implementation of ‹login› first asks the user for their user name and password, which it checks against the user database. If the credentials match, the ‹login› program sets up the basic environment, changes the owner of the process to the user who just authenticated themselves, and executes their preferred shell (as configured in the user database).

│ User Authentication
│
│ • the user needs to «authenticate» themselves
│ • «passwords» are the most commonly used method
│ ◦ the «system» needs to recognize the right password
│ ◦ user should be able to change their password
│ • «biometric» methods are also quite popular

By far the most common method of authenticating users (that is, ascertaining that they are who they claim they are) is by asking for a secret – a password or a passphrase. The idea is that only the legitimate owner of the account in question knows this secret. In the ideal case, the system does not store the password itself (in case the password database is compromised), but instead stores information that can be used to check that a password the user typed in is correct. The usual way this is done is via (salted) cryptographic hash functions. Besides passwords, other authentication methods exist, most notably cryptographic tokens and biometrics.

│ Remote Login
│
│ • authentication over «network» is more complicated
│ • «passwords» are easiest, but not easy
│ ◦ «encryption» is needed to safely transmit passwords
│ ◦ along with «computer authentication»
│ • «2-factor» authentication is a popular improvement

While a password is simply a short string that can be quite easily sent across a network, there are caveats. First, the network itself is often insecure, and the password could be snooped by an attacker. This means we need to use cryptography to transmit the password, or otherwise prove its knowledge. The other problem is that, even if we send the password encrypted, the computer at the other end may not be the one we expect (i.e. it could belong to an attacker). Since the user is not required to be physically present to attempt authentication, this significantly increases the risk of attacks, making strong passwords much more important. Besides strong passwords, security can be improved by 2-factor authentication (more on this shortly).

│ Computer Authentication
│
│ • how to ensure we send the password to the «right party»?
│ ◦ an attacker could «impersonate» our remote computer
│ • usually via «asymmetric cryptography»
│ ◦ a private key can be used to «sign» messages
│ ◦ the server signs a challenge to establish its «identity»

When interacting with a remote computer (via a network), it is rather important to ensure that we communicate with the computer that we intended to. While the most immediate concern is sending passwords, it is of course not the only concern: accidentally uploading secret data to the wrong computer would be as bad, if not worse.

A common approach, then, is that each computer gets a unique private key, while its public counterpart (or at least its fingerprint) is distributed to other computers. When connecting, the client can generate a random challenge, and ask the remote computer to sign it using the secret key associated with the computer that we intended to contact, in order to prove its identity. Unless the target computer itself has been compromised, an attacker will be unable to produce a valid signature and will be foiled.
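In outline, the exchange looks like the sketch below. This is only a schematic fragment, not a runnable program: the key types and the ‹random_bytes›, ‹remote_sign› and ‹verify› helpers are hypothetical stand-ins for a real cryptographic library and a real network round-trip.

```c
/* Schematic challenge-response: the client checks that the server can
 * produce a signature with the private key matching the public key the
 * client has on file. All primitives below are hypothetical
 * placeholders, not real library interfaces. */
#include <stdbool.h>
#include <stddef.h>

typedef struct public_key public_key; /* hypothetical key type */
typedef struct signature  signature;  /* hypothetical signature type */

void random_bytes(unsigned char *buf, size_t len);          /* hypothetical */
signature *remote_sign(const unsigned char *m, size_t len); /* ask the server */
bool verify(const public_key *pk, const unsigned char *m,
            size_t len, const signature *sig);              /* hypothetical */

bool authenticate_server(const public_key *expected)
{
    unsigned char challenge[32];

    random_bytes(challenge, sizeof challenge); /* fresh and unpredictable */
    signature *sig = remote_sign(challenge, sizeof challenge);

    /* only the holder of the matching private key could have signed */
    return verify(expected, challenge, sizeof challenge, sig);
}
```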
│ 2-factor Authentication
│
│ • 2 different types of authentication
│ ◦ harder to spoof «both» at the same time
│ • there are a few factors to pick from
│ ◦ something the user «knows» (password)
│ ◦ something the user «has» (keys, tokens)
│ ◦ what the user «is» (biometric)

Two-factor (or multi-factor) authentication is popular for remote authentication (as outlined earlier), since networks make attacks much cheaper and more frequent. In this case, the first factor is usually a password, and the second factor is a cryptographic «token» – a small device (often in the form of a keychain) which generates a unique sequence of codes, one of which the user transcribes to prove ownership of the token. Remote biometric authentication is somewhat less practical (though not impossible).

Of course, two-factor authentication can be used locally too, in which case biometrics become considerably more attractive. Cryptographic tokens or smart cards are also common, though in the local case, they usually communicate with the computer directly, instead of relying on the user to copy a code.

│ Enforcement: Hardware
│
│ • all «enforcement» begins with the hardware
│ ◦ the CPU provides a «privileged mode» for the kernel
│ ◦ DMA memory and IO instructions are «protected»
│ • the MMU allows the kernel to «isolate processes»
│ ◦ and protect its own integrity

Now that we have an access control policy and we have established the identity of the user, there is one last thing that needs to be addressed, and that is «enforcement» of the policy. Of course, an access control policy is useless if it can be circumvented. The ability of an operating system to enforce security stems from hardware facilities: software alone cannot sufficiently constrain other software running on the same computer. The main tools that allow the kernel to enforce its security policy are the MMU (and the fact that only the kernel can program it) and its control over interrupt handlers.

│ Enforcement: Kernel
│
│ • kernel uses «hardware facilities» to implement security
│ ◦ it stands between «resources» and «processes»
│ ◦ access is mediated through «system calls»
│ • «file systems» are part of the kernel
│ • «user» and «group» «abstractions» are part of the kernel

Hardware resources are controlled by the kernel: memory via the MMU, processors via the timer interrupt, and memory-mapped peripherals again through the MMU and through the interrupt handler table. Since user programs cannot directly access physical resources, any interaction with them must go through the kernel (via system calls), presenting an opportunity for the kernel to check the requested actions against the policy.
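This checking at the system call boundary can be observed directly. In the sketch below, signal number 0 asks ‹kill› to perform only the permission check: an ordinary user may not signal ‹init› (pid 1), so the call fails with ‹EPERM›, without any other effect.

```c
/* The kernel consults the access control policy whenever a system
 * call names an object: here, the object is process 1 (init) and the
 * action is sending a signal. Signal 0 means: perform the permission
 * check only, deliver nothing. */
#include <errno.h>
#include <signal.h>
#include <stdio.h>

int main(void)
{
    if (kill(1, 0) == -1 && errno == EPERM)
        printf("kernel: we may not signal pid 1\n");
    else
        printf("we are allowed to signal pid 1 (are we root?)\n");
    return 0;
}
```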
│ Enforcement: System Calls
│
│ • the kernel acts as an «arbitrator»
│ • a process is trapped in its own «address space»
│ • processes use system calls to access resources
│ ◦ kernel can decide what to allow
│ ◦ based on its «access control model» and «policy»

When a system call is executed, the kernel knows the owner of that process, and also any objects involved in the system call. Armed with this knowledge, it can easily consult the access control policy to decide whether the requested action is allowed, and if it is not, return an error to the process instead of performing the action.

│ Enforcement: Service APIs
│
│ • userland processes can enforce access control
│ ◦ usually system services which provide an IPC API
│ • e.g. via the ‹getpeereid()› system call
│ ◦ tells the caller «which user» is «connected» to a socket
│ ◦ user-level access control relies on «kernel» facilities

Just as the kernel sits on resources that user programs cannot directly access, the same principle can be applied in userspace programs, especially services. Probably the most illustrative example is a relational database: the database engine runs under a dedicated (virtual) user and stores its data in a collection of files. The permissions on those files are set such that only the owner can read or write them – hence, the kernel will disallow any other process from interacting with those files directly.

Nonetheless, the database system can selectively allow other programs to «indirectly» interact with the data it stores: the programs connect to a database server using a UNIX socket. At this point, the database can ask the operating system to provide the user identifier under which the client is running (using ‹getpeereid›). Since the server can directly access the files which store the data, it can, on behalf of the client, execute queries and return the results. It can, however, also disallow certain queries based on its own access control policy and the user id of the client.
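A sketch of how such a server might vet its clients follows. Note that ‹getpeereid› is a BSD interface (on Linux, the ‹SO_PEERCRED› socket option provides the same information), and the single trusted uid is a made-up example of a server-side policy.

```c
/* User-level access control on a UNIX socket: the kernel tells us the
 * (effective) uid and gid of the connected peer, and the server
 * applies its own policy based on that identity. */
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int check_client(int client_fd, uid_t trusted_uid)
{
    uid_t euid;
    gid_t egid;

    if (getpeereid(client_fd, &euid, &egid) == -1)
        return -1; /* could not establish the client's identity */

    if (euid != trusted_uid)
        return -1; /* deny: some other user is calling */

    return 0; /* allow: the trusted user is on the other end */
}
```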
## File Systems

As outlined earlier, file systems are usually the most user-visible aspect of an operating system to which access control is applied. Additionally, permissions in the file system are usually directly visible to users and manipulated by them.

│ File Access Rights
│
│ • «file systems» are a case study in access control
│ • all modern file systems maintain «permissions»
│ ◦ the only extant «exception» is FAT (USB sticks)
│ • different systems adopt different representations

│ Representation
│
│ • file systems are usually «object-centric»
│ ◦ permissions are attached to individual objects
│ ◦ easily answers ‘who can access this file?’
│ • there is a «fixed» set of «verbs»
│ ◦ those may be different for «files» and «directories»
│ ◦ different «systems» allow «different verbs»

│ The UNIX Model
│
│ • each file and directory has a single «owner»
│ • plus a single owning «group»
│ ◦ not limited to those the owner belongs to
│ • «ownership» and «permissions» are attached to «i-nodes»

│ Access vs Ownership
│
│ • POSIX ties «ownership» and «access» rights
│ • only 3 subjects can be named on a file
│ ◦ the owner (user)
│ ◦ the owning group
│ ◦ anyone else

│ Access Verbs in POSIX File Systems
│
│ • read: «read» a file, «list» a directory
│ • write: «write» a file, «link»/«unlink» i-nodes to a directory
│ • execute: ‹exec› a program, enter the directory
│ • execute as owner (group): ‹setuid›/‹setgid›

│ Permission Bits
│
│ • basic UNIX «permissions» can be encoded in «9 bits»
│ • 3 bits per 3 subject designations
│ ◦ first comes the owner, then group, then others
│ ◦ written as e.g. ‹rwxr-x---› or ‹0750›
│ • plus two numbers for the owner/group identifiers

│ Changing File Ownership
│
│ • the owner and ‹root› can change file owners
│ • ‹chown› and ‹chgrp› system utilities
│ • or via the C API
│ ◦ ‹chown()›, ‹fchown()›, ‹fchownat()›, ‹lchown()›
│ ◦ same set for ‹chgrp›

│ Changing File Permissions
│
│ • again available to the owner and to ‹root›
│ • ‹chmod› is the user space utility
│ ◦ either numeric argument: ‹chmod 644 file.txt›
│ ◦ or symbolic: ‹chmod +x script.sh›
│ • and the corresponding system call (numeric-only)

│ ‹setuid› and ‹setgid›
│
│ • «special permissions» on «executable» files
│ • they allow ‹exec› to also change the process owner
│ • often used for granting extra privileges
│ ◦ e.g. the ‹mount› command runs as the «super-user»

│ Sticky Directories
│
│ • file creation and deletion is a «directory» permission
│ ◦ this is problematic for «shared directories»
│ ◦ in particular the system ‹/tmp› directory
│ • in a «sticky» directory, different rules apply
│ ◦ new files can be created as usual
│ ◦ only the «owner» can «unlink» a file from the directory

│ Access Control Lists
│
│ • an ACL is a list of ACEs (access control «entries»)
│ ◦ each ACE is a subject + verb pair
│ ◦ it can name an arbitrary user
│ • an ACL is attached to an object (file, directory)
│ • more flexible than the traditional UNIX system

│ ACLs and POSIX
│
│ • part of POSIX.1e (security extensions)
│ • most POSIX systems implement ACLs
│ ◦ this does «not» supersede UNIX permission bits
│ ◦ instead, they are interpreted as part of the ACL
│ • «file system» support is not universal (but widespread)

│ Device Files
│
│ • UNIX represents «devices» as «special i-nodes»
│ ◦ this makes them subject to normal «access control»
│ • the particular device is described in the «i-node»
│ ◦ only a «super-user» can create device nodes
│ ◦ users could otherwise gain access to any device

│ Sockets and Pipes
│
│ • «named» sockets and pipes are just «i-nodes»
│ ◦ also subject to standard file permissions
│ • especially useful with «sockets»
│ ◦ a service sets up a «named socket» in the file system
│ ◦ «file permissions» decide who can talk to the service

│ Special Attributes
│
│ • flags that allow «additional restrictions» on file use
│ ◦ e.g. «immutable» files (cannot be changed by anyone)
│ ◦ «append-only» files (for logfile integrity protection)
│ ◦ compression, copy-on-write controls
│ • «non-standard» (Linux ‹chattr›, BSD ‹chflags›)
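To tie the permission-bit encoding from earlier in this section to the C API: the sketch below sets mode ‹0640› (‹rw-r-----›) on a made-up file name, first numerically, then using the symbolic constants from ‹sys/stat.h›.

```c
/* Setting permission bits from C: mode 0640 means the owner may read
 * and write, the owning group may read, and everyone else gets
 * nothing. */
#include <stdio.h>
#include <sys/stat.h>

int main(void)
{
    /* numeric form, exactly what 'chmod 640 notes.txt' would do */
    if (chmod("notes.txt", 0640) == -1)
        perror("chmod (octal)");

    /* the same mode, spelled with symbolic constants */
    if (chmod("notes.txt", S_IRUSR | S_IWUSR | S_IRGRP) == -1)
        perror("chmod (symbolic)");

    return 0;
}
```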
│ Network File System
│
│ • NFS 3.0 simply transmits numeric ‹uid› and ‹gid›
│ ◦ the numbering needs to be «synchronised»
│ ◦ can be done via a «central user database»
│ • NFS 4.0 uses «per-user» authentication
│ ◦ the user authenticates to the server directly
│ ◦ filesystem ‹uid› and ‹gid› values are mapped

│ File System Quotas
│
│ • «storage space» is limited, «shared» by users
│ ◦ files take up storage space
│ ◦ file ownership is also a «liability»
│ • «quotas» set «limits» on space use by users
│ ◦ exhausted quota can lead to «denial» of «access»

│ Removable Media
│
│ • access control at «file system» level makes no sense
│ ◦ other computers may choose to «ignore» permissions
│ ◦ «user names» or id's would not make sense anyway
│ • option 1: «encryption» (for denying reads)
│ • option 2: «hardware»-level controls
│ ◦ usually read-only vs read-write on the entire medium

│ The ‹chroot› System Call
│
│ • each process in UNIX has its own «root directory»
│ ◦ for most, this coincides with the «system root»
│ • the root directory can be changed using ‹chroot()›
│ • can be useful to «limit» file system «access»
│ ◦ e.g. in «privilege separation» scenarios

│ Uses of ‹chroot›
│
│ • ‹chroot› alone is «not» a security mechanism
│ ◦ a super-user process can «get out» easily
│ ◦ but not easy for a «normal user» process
│ • also useful for «diagnostic» purposes
│ • and as a lightweight alternative to «virtualisation»

## Sub-User Granularity

In this section, we will explore a few cases where a more precise notion of an access control subject is required or useful.

│ Users are Not Enough
│
│ • users are not always the right abstraction
│ ◦ «creating users» is relatively «expensive»
│ ◦ only a super-user can create new users
│ • you may want to include «programs» as «subjects»
│ ◦ or rather, the combination user + program

One of the main drawbacks of the user-centric security paradigm is that it is heavyweight: creating users is relatively expensive and requires super-user privileges. In particular, normal users cannot easily confine their own processes by running them under auxiliary users (only via a ‹setuid› helper, which must again be configured by the ‹root› user).

A natural extension of the concept of an «access control subject» is to include the currently running program in the description – allowing the policy to say things like ‹/home/xuser/mail› can be accessed by thunderbird (a mail client) running under the account of ‹xuser›, but not by firefox (a web browser) running under the same account.

│ Naming Programs
│
│ • users have user names, but how about programs?
│ • option 1: cryptographic «signatures»
│ ◦ «portable» across computers but «complex»
│ ◦ establishes «identity» based on the «program itself»
│ • option 2: i-node of the «executable»
│ ◦ simple, local, identity based on «location»

Unfortunately, attaching policy rules to programs is much harder than it is for files or users, since their identity is rather elusive. There might be any number of programs called thunderbird, some of which may be different versions or builds of the same software, but some might just claim to be thunderbird to get to one's email. A fairly good, if complicated, solution is to embed a cryptographic signature into executables, stating the rough equivalent of ‘this program is Firefox, signed by Mozilla’.
Assuming we trust Mozilla (we probably do, since we run their software), we can refer to ‘Firefox by Mozilla’ in our access control policy. A variation of this approach is used by mobile operating systems, like Android and iOS.

The other option, much simpler, is to add a note like ‘this program is Firefox’ to the i-node of the executable. This approach is used by systems like SELinux (where the note is realized as a «security label»).

│ Program as a Subject
│
│ • program: passive (file) vs active (processes)
│ ◦ only a «process» can be a subject
│ ◦ but program «identity» is attached to the file
│ • rights of a «process» depend on its «program»
│ ◦ ‹exec()› will change privileges

Now that we have managed to delineate what a program is and how to identify it, a new problem pops up: in both cases, we have attached the identity to a file, but it actually belongs to a process. However, since processes are much more dynamic than files, assigning identifiers to them is even less practical. In this case, we can use the same trick that was used for ‹setuid› programs: the ‹exec› system call can examine the binary and adjust the privileges of the process accordingly.

│ Mandatory Access Control
│
│ • delegates permission control to a «central authority»
│ • often coupled with «security labels»
│ ◦ classifies «subjects» (users, processes)
│ ◦ and also «objects» (files, sockets, programs)
│ • the owner «cannot» change object permissions

Security labels are, in some sense, a generalisation of user groups. They can be attached to both objects and subjects, and ‹exec› will update the labels attached to a process based on the labels attached to the executable (file). Under mandatory access control, users are not allowed to change permissions on objects. However, in practical systems, both modes are usually combined: discretionary permissions are attached to files as usual, and applied to an action whenever the mandatory rules alone would have allowed it.

│ Capabilities
│
│ • not all verbs (actions) need to take objects
│ • e.g. shutting down the computer (there is only one)
│ • mounting file systems (they can't always be named)
│ • listening on ports with numbers less than 1024

The term ‘capabilities’ is often used to mean one of two forms of access control policy rules:

1. where the object is a singleton, i.e. there is only a single object for the given action, or
2. where it is impractical to name the objects or to attach permission information to them.

│ Dismantling the ‹root› User
│
│ • the traditional ‹root› user is «all-powerful»
│ ◦ “all or nothing” is often unsatisfactory
│ ◦ violates the principle of least privilege
│ • many special properties of ‹root› are capabilities
│ ◦ ‹root› then becomes the user with all capabilities
│ ◦ other users can get selective privileges

In many cases, the simple split between ‹root› and normal users (which, incidentally, mirrors the split between the kernel and user programs) is inadequate. There are three principal ways to address this:

1. ‹setuid› programs can extend some of the special ‹root›-only privileges to normal users (e.g. ‹mount›, ‹passwd›),
2. the system of «capabilities» adds the option of allowing certain users to perform some of the restricted operations,
3. the user-level approach mentioned at the end of section 1, where the service runs under ‹root› and acts on behalf of ordinary users (e.g. PolicyKit).
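To make the privileged-port example from the capabilities slide concrete: the sketch below fails with ‹EACCES› for an ordinary user, since binding a port below 1024 traditionally requires super-user privileges (on Linux, the ‹CAP_NET_BIND_SERVICE› capability).

```c
/* Binding to a 'privileged' port: the object (port 80) is awkward to
 * attach permissions to, so the right to bind it is treated as a
 * capability of the process (traditionally, of the super-user). */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr;

    if (fd == -1)
        return 1;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_port = htons(80); /* below 1024, hence privileged */
    addr.sin_addr.s_addr = htonl(INADDR_ANY);

    if (bind(fd, (struct sockaddr *) &addr, sizeof addr) == -1)
        perror("bind"); /* EACCES unless suitably privileged */
    else
        printf("bound to port 80 (we must be privileged)\n");

    close(fd);
    return 0;
}
```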
│ Security and Execution
│
│ • security hinges on what is «allowed to execute»
│ • «arbitrary code execution» exploits are the worst
│ ◦ they allow «unauthorized» execution of code
│ ◦ same effect as «impersonating» the user
│ ◦ almost as bad as stolen credentials

Control over which code can execute (and with what privileges) is at the center of all access control restrictions. If a program can be tricked into executing code supplied by an attacker, all the privileges that the program had are automatically available to the attacker as well.

│ Untrusted Input
│
│ • programs often process «data» from «dubious sources»
│ ◦ think image viewers, audio & video players
│ ◦ archive extraction, font rendering, ...
│ • bugs in programs can be «exploited»
│ ◦ the program can be «tricked» into «executing data»

The most common way programs can be hijacked in this manner is through improper processing of «untrusted inputs», that is, content coming from untrustworthy sources. If unexpected input data can derail program execution, this opens the door for an attacker to take control of the program. The payload (the code that the attacker wants executed) is usually supplied as part of the input, and hence is normally treated as data by the program. However, in the presence of certain bugs, the program can be tricked into executing (or interpreting) this data as code.

│ Process as a Subject
│
│ • some privileges can be tied to a particular «process»
│ ◦ those only apply during the «lifetime» of the process
│ ◦ often «restrictions» rather than privileges
│ ◦ this is how «privilege dropping» is done
│ • restrictions are «inherited» across ‹fork()›

Programs (or parts of programs running in a separate process) can ask the operating system to remove some of their privileges (like file system access, network access, and so on). There are many ways to do this, though they are not very portable (i.e. they depend on non-POSIX features of particular operating systems, e.g. Linux user namespaces, seccomp, FreeBSD Capsicum, OpenBSD ‹pledge› and ‹unveil›, and so on).

One of the few portable approaches, known as privilege drop, is essentially a subset of privilege separation: a special user is created for the particular process, and the process, after having done any privileged initialization that it needed to do, uses ‹setuid› and perhaps ‹chroot› to lock itself down (a sketch follows at the end of this part).

│ Sandboxing
│
│ • tries to «limit damage» from code execution «exploits»
│ • the program «drops» all privileges it can
│ ◦ this is done «before» it touches any of the «input»
│ ◦ the attacker is stuck with the «reduced privileges»
│ ◦ this can often prevent a successful attack

Sandboxing is a collection of techniques (including some of the above) that tries to minimize the impact of a successful exploit against a program. Sandboxing can be voluntary (the program sets up its own sandbox) or involuntary (see also the next slide).

│ Untrusted Code
│
│ • traditionally, you would only execute «trusted» code
│ ◦ often based on «reputation» or other «external» factors
│ ◦ this does not «scale» to a large number of vendors
│ • it is common to execute «untrusted», even dubious code
│ ◦ this can be okay with sufficient «sandboxing»

Running code from questionable sources is always risky, and without precautions, it is essentially guaranteed to result in a compromise. However, since the modern web is full of executable code, we simply resort to locking it down as much as we can and hope for the best.
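Returning to the portable privilege-drop approach described above, a minimal sketch follows; the dedicated uid and gid 9999 and the empty directory are made-up examples. The order matters: ‹setuid› comes last, because after it succeeds, the process no longer has the privilege to perform any of the other steps.

```c
/* Voluntary privilege drop: a root-owned process confines itself to an
 * empty directory and a dedicated unprivileged user before it starts
 * processing any untrusted input. */
#include <stdio.h>
#include <unistd.h>

int drop_privileges(void)
{
    if (chroot("/var/empty") == -1) /* restrict the file system view */
        return -1;
    if (chdir("/") == -1)           /* keep no directory handle outside */
        return -1;
    if (setgid(9999) == -1)         /* drop the group first, while still root */
        return -1;
    if (setuid(9999) == -1)         /* the point of no return */
        return -1;
    return 0;
}

int main(void)
{
    /* ... privileged initialization would happen here ... */

    if (drop_privileges() == -1)
    {
        perror("drop_privileges");
        return 1;
    }

    /* ... now read and process untrusted input ... */
    return 0;
}
```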
│ API-Level Access Control
│
│ • capability system for «user-level resources»
│ ◦ things like contact lists, calendars, bookmarks
│ ◦ objects not provided directly by the kernel
│ • enforcement e.g. via a «virtual machine»
│ ◦ not applicable to execution of «native code»
│ ◦ alternative: an IPC-based API

Selectively granting permissions to programs through user-level permission systems is also possible for non-root users. There are two commonly employed methods:

1. a (program-level) virtual machine, like the JVM or the JavaScript virtual machines built into web browsers, which enforce that the program only talks to the system through restricted APIs,
2. a strict sandbox, with the only access to the system provided by a daemon running on the outside of the sandbox (e.g. snap and flatpak, to a degree).

Both approaches can be combined: a common technique is to lock down the VM itself using OS-level sandboxing, to defend against security bugs in the VM.

│ Android/iOS Permissions
│
│ • applications from a store are «semi-trusted»
│ • typically «single-user» computers/devices
│ • permissions are attached to «apps» instead of users
│ • partially virtual users, partially API-level

On Android, for instance, each application gets its own virtual user with very limited permissions, and interaction with the system is done almost exclusively through high-level APIs. These APIs then perform permission checks, possibly prompting the user for confirmation as needed.

│ Review Questions
│
│ 37. What is a user?
│ 38. What is the principle of least privilege?
│ 39. What is an access control object?
│ 40. What is a sandbox?