# Shells & User Interfaces

This lecture will focus on human-computer interaction and the role of an operating system in this area. We will look at both text-based interaction modes (mainly command-line interfaces, i.e. «shells») and at graphical interfaces, driven by a pointing device (mouse, trackpad) or a touch screen.

│ Lecture Overview
│
│ 1. Command Interpreters
│ 2. The Command Line
│ 3. Graphical Interfaces

The first part will focus on «shell» as a simple programming language, while in the second we will briefly look at terminals (or rather terminal emulators), interactive use of «shell» and at other text-mode programs. Finally, the third part will be about graphical interfaces and how they are built.

## Command Interpreters

Historically, shells play a dual role in most operating systems. Command-driven interaction is probably the easiest to implement, and was hence what computers and operating systems initially used. As soon as interactive terminals became available, that is (we will skip batch-mode systems today).

Any command interpreter has one interesting property though: you can make a transcript of a sequence of commands to achieve more complex tasks than any individual command can perform. This works without any involvement from the command interpreter itself: you can simply write the commands down on a piece of paper, and type them back later.

Now of course it would be much more convenient if the computer could read the commands one by one from a file and execute them, as if you were typing them. This is the origin of shell scripts.

│ Shell
│
│ • «programming language» centered on OS interaction
│ • rudimentary «control flow»
│ • untyped, text-centered «variables»
│ • dubious error handling

Of course, in your hardcopy or handwritten notes, you could include additional remarks and instructions, like ‘only run this command if the previous one succeeded’, or ‘repeat this command 3 times’, or ‘repeat this command until such and such thing happens’.
Wouldn't it be wonderful, though, if you could include such annotations in the transcript that the computer reads and performs? But of course, we have just invented control flow. And quite obviously this is exactly what shells came to implement.

The other ‘obvious’ invention is placeholders in commands, to be replaced by appropriate values at execution time. For instance, you write down, in your paper notebook, a sequence of commands to update a list of users stored in a text file: you would presumably use a placeholder for the name of the file you are currently working with. And when you type the commands back, replace every occurrence of this placeholder with the real filename in question. But why, you have just invented variables!

Another thing that sort of carries over from these paper-based scripts into the executable sort is error handling… or rather the lack thereof. It so happens that you probably wouldn't bother instructing yourself to stop and investigate if one of the commands from your notebook fails unexpectedly.

│ Interactive Shells
│
│ • almost all shells have an «interactive mode»
│ • the user inputs a single statement on keyboard
│ • when confirmed, it is immediately «executed»
│ • this forms the basis of «command-line interfaces»

Before we go on about control flow and variables, let us remind ourselves that most shells are interactive in nature. In this interactive mode, the user enters a single ‘statement’ (a single line) and confirms it, after which it is immediately executed. Most often this is a single command, but it can be a sequence, a loop or any other construct allowed by the language: there is no distinction between the syntax available in shell scripts and on the interactive command line. This makes it possible to write short scripts (so-called ‘one-liners’) directly on the command line, to automate simple tasks, without bothering to write the program down.
Learning to do it is well worth the investment, as it can save considerable time in day-to-day work.

│ Shell Scripts
│
│ • a «shell script» is an (executable) file
│ • in simplest form, it is a «sequence of commands»
│   ◦ each command goes on a separate line
│   ◦ executing a script is about the same as typing it
│ • but can use «structured programming» constructs

In contrast to interactive command execution, a «shell script» is a file with a list of statements in it, executed sequentially. Of course, as discussed above, basic control flow is available to alter this sequential execution, if needed. Variables can be used to substitute in parts of commands that change from one invocation of the script to the next.

│ Shell Upsides
│
│ • very easy to write simple scripts
│ • first choice for simple automation
│ • often useful to save repetitive typing
│ • definitely «not» good for big programs

So how does shell compare as a programming language? First of all, it can be very productive and very easy to use, especially in scenarios where you are not programming as such, but really just automating simple tasks that you would otherwise do manually, by typing in commands.

However, try to create anything bigger and the limitations become significant: larger programs cannot just drop dead whenever something fails, nor can they ignore errors left and right (the two basic strategies available in scripts). The lack of structured data, a type system, and general ‘programming hygiene’ makes larger scripts fragile and hard to maintain.

The next logical step is then a dedicated ‘scripting’ language, like Perl or Python, which make a compromise between the naivety (and simplicity) of shell and the structure and rigour of heavyweight programming languages like C++ or Java.
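As a concrete (if contrived) sketch of the above, here is a minimal script: a plain sequence of commands, one per line. All file and directory names in it are invented for the example:

```shell
#!/bin/sh
# A minimal shell script: commands are executed in order, exactly as
# if they were typed in interactively. The 'demo' directory and the
# file names are made up for illustration.
mkdir -p demo                       # create a scratch directory
echo "hello" > demo/greeting.txt    # write a small text file
cp demo/greeting.txt demo/copy.txt  # copy it
cat demo/copy.txt                   # prints: hello
```

Saved as e.g. ‹demo.sh› and marked executable (‹chmod +x demo.sh›), such a file can be run like any other command.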
│ Bourne Shell
│
│ • a specific language in the ‘shell’ family
│ • the first shell with consistent programming support
│   ◦ available since 1976
│ • compatible shells are still widely used today
│   ◦ best known implementation is ‹bash›
│   ◦ ‹/bin/sh› is mandated by POSIX¹
│
│ ¹ Strictly speaking, only the existence of ‹sh› is required, and its
│   exact location is unspecified (except that it must be on the
│   default PATH). Practically all POSIX-compatible systems put ‹sh›
│   under ‹/bin›, along with the rest of mandatory commands and
│   utilities.

The Bourne shell was created in 1976 and essentially codified the dual nature of shells as both interactive and programmable. We still use its basic model (and syntax) today. There are many Bourne-compatible shells, many of them descended from the Korn shell (‹ksh›, which we will discuss shortly).

You may have heard of ‹bash›: the name stands for Bourne Again Shell² and it is probably the most famous shell that there is (to the extent that some people believe it is the only shell).

² Because bad puns should be a human right.

│ C Shell
│
│ • also known as ‹csh›, first released in 1978
│ • more C-like syntax than ‹sh› (Bourne Shell)
│   ◦ but not really very C-like at all
│ • improved interactive mode (over ‹sh› from '76)
│ • also still used today (mainly via ‹tcsh›)

Historically, the second well-known UNIX shell was the C shell³ – it made improvements in interactive use, many of which were adopted into other shells, among others:

 • command history (ability to recall already executed commands),
 • aliases (user-defined shortcuts for often-used commands),
 • command and filename completion (via ‹tcsh›),
 • interactive job control.

The ‹tcsh› branch is a variant of ‹csh› with additional features, maintained alongside the original ‹csh› since the early '80s. It is still distributed with many operating systems (and is, for instance, the default ‹root› shell on FreeBSD).

³ What is it with computer people and bad puns?
│ Korn Shell
│
│ • also known as ‹ksh›, released in 1983
│ • middle ground between ‹sh› and ‹csh›
│ • basis of the POSIX.2 requirements
│ • a number of implementations exist

In essence a fusion of ‹sh› (the Bourne shell) and ‹csh›/‹tcsh› (mainly as a source of improved user interaction; the scripting syntax remained faithful to ‹sh›). The original was based on ‹sh› source code, with many features added. This is the shell that POSIX uses as a model for ‹/bin/sh›.

│ Commands
│
│ • typically a name of an executable
│   ◦ may also be control flow or a built-in
│ • the executable is looked up in the filesystem
│ • the shell does a ‹fork› + ‹exec›
│   ◦ this means new process for each command
│   ◦ process creation is fairly expensive

The most typical command is simply a name of a program, perhaps followed by arguments (which are not interpreted by the shell, they are simply passed to the program as a form of input). Commands in this form are performed, conceptually, as follows (details may differ in actual implementations, e.g. point 2 may be done as part of point 4):

1. check that the program given is not the name of a builtin command or a construct (if so, it is processed differently),
2. the name of the program is taken to be a name of an executable file – a list of directories (given by ‹PATH›, which will be explained later) is searched to see if an executable file with the given name exists,
3. the shell performs a ‹fork› system call to create a new process (see lecture 3),
4. the child process uses ‹exec› to start executing the executable located in point 2, passing in any command line arguments,
5. the main shell process does a ‹wait› (i.e. it suspends until the executed program terminates).

This means that each command involves quite a lot of work, which is not a problem for interactive use, or reasonably-sized shell scripts. Executing many thousands of commands, especially if the commands themselves run quickly, may get a little slow though.
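The lookup in point 2 can be observed directly: the POSIX ‹command -v› utility reports what the shell would run for a given name (the exact path printed depends on the system):

```shell
# For a regular command, 'command -v' prints the full path of the
# executable it found by searching the directories in $PATH:
command -v ls    # prints e.g. /bin/ls (the path varies by system)
# For a shell built-in such as 'cd', no file is involved at all,
# so only the bare name is printed:
command -v cd    # prints: cd
```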
│ Built-in Commands
│
│ • ‹cd› change the working directory
│ • ‹export› for setting up environment
│ • ‹echo› print a message
│ • ‹exec› replace the shell process (no ‹fork›)

Some commands of the form ‹program [arguments]› are interpreted specially by the shell (i.e. they will not use the above ‹fork› + ‹exec› process). There are basically two separate reasons why this is done:

1. efficiency – some commands are used often, especially in scripts, and creating new processes all the time is expensive – this is purely an optimisation, and is the case of built-in commands like ‹echo› or ‹test›,
2. functionality – some effects cannot be (easily, reasonably) done by a child process, mainly because changes need to be done to the main shell process: usually, only the main process can do such changes – this is the case of ‹cd› (changes the ‘current working directory’ of the main process), ‹export› (changes the environment of the main process, see also below) or ‹exec› (performs an ‹exec› without a ‹fork›, destroying the shell).

│ Variables / Parameters
│
│ • variable names are made of letters and digits
│ • «using» variables is indicated with ‹$›
│ • setting variables does «not» use the ‹$›
│ • all variables are global (except subshells)
│
│     variable="some text"
│     echo $variable

Earlier, we mentioned the idea of ‘placeholders’, in the context of scripts written down in notepads. Shells take that idea, quite literally, and turn it into what we call variables (at least in most programming languages; the ‘official’ terminology in shell is «parameters» – rather in line with the idea of a placeholder).

Essentially, the shell maintains a mapping of names to values, where names are strings made of letters and digits and the values are arbitrary strings. To create or update a mapping, the following command is used:

    variable="some text"

The quotes are not required unless there are spaces in the value.
Whitespace around ‹=› is not allowed (writing ‹variable = value› is interpreted as the command ‹variable› with arguments ‹=› and ‹value›).

│ Parameter Expansion / Variable Substitution
│
│ • «variables» are substituted as «text»
│ • ‹$foo› is simply replaced with the content of ‹foo›
│ • «arithmetic» is not well supported in most shells
│   ◦ or any expression syntax, e.g. relational operators
│   ◦ consider the POSIX syntax ‹z=$((x + y))›

Variables (parameters) can be used more or less anywhere in any command, including as the command name. This is achieved by writing a dollar sign, ‹$›, followed by the name of the variable (parameter), like this:

    echo $variable

The command will print ‹some text›. The substitution is done in a purely textual manner (the process is also known as «parameter expansion»). After substitution, everything about variables is ‘forgotten’ in the sense that whether any part of the text came from a substitution or was present from the start makes no difference and cannot be detected. This may lead to surprises if the value of a variable contains whitespace (we will discuss this later).

Coming from normal programming languages, a user may be tempted to write something like ‹$a + $b› in a shell. This will not work: if ‹a=7› and ‹b=3›, the above ‘expression’ will be interpreted as the command ‹7› with arguments ‹+› and ‹3›. To perform arithmetic in a shell script, the expression must be enclosed in ‹$(( … ))› – to make it a little less painful, variables inside ‹$(( … ))› do not need to be prefixed with ‹$›. They are still substituted as text though:

    a=3+1; echo $a = $((a))

will print ‹3+1 = 4› (and not e.g. an error because ‹a› is not a number).¹ However, substitutions within ‹$(( … ))› «without» dollar signs are «bracketed» – in particular,

    a=3+1; b=7; echo $((a * b))

will print ‹28›, since it is expanded as ‹$(((3+1) * 7))›. This is «not» the case for ‹$› substitutions:

    a=3+1; b=7; echo $(($a * $b))

will print ‹10›.
¹ Depending on the implementation, ‹a=3+b; b=7; echo $((a))› may or may not work (in the sense that it'll print ‹10›).

│ Command Substitution
│
│ • basically like «parameter substitution»
│ • written as ‹`command`› or ‹$(command)›
│   ◦ first «executes» the command
│   ◦ and captures its standard output
│   ◦ then replaces ‹$(command)› with the output

Sometimes, it is desirable to «compute» a piece of a command, most commonly by running another program. This can be done using ‹$( … )›, e.g. ‹cat $(ls)›:

1. first, ‹ls› is executed as a shell command (since it is a name of a program, it will be ‹fork›'d and ‹exec›'d as normal),
2. the output of ‹ls› (the list of files in current directory, e.g. ‹foo.txt bar.txt›) is captured into a buffer,
3. the content of the buffer is substituted into the original command, i.e. ‹cat foo.txt bar.txt›,
4. the command is executed as normal.

Like with parameter substitution, there are whitespace related caveats (see below).

│ Quoting
│
│ • whitespace is an «argument separator» in shell
│ • multi-word arguments must be «quoted»
│ • quotes can be double quotes ‹"x"› or single ‹'x'›
│   ◦ double quotes allow variable «substitution»

│ Quoting and Substitution
│
│ • «whitespace» from variable substitution must be «quoted»
│   ◦ ‹foo="hello world"›
│   ◦ ‹ls $foo› is different than ‹ls "$foo"›
│ • bad quoting is a very common source of «bugs»
│ • consider also «filenames» with spaces in them

An important feature of parameter (variable) substitution is that it is done before argument splitting. Hence, values which contain whitespace may be interpreted, after substitution, as multiple arguments. Sometimes, this is desirable, but quite often it is not. Consider ‹cat $file›: clearly the author expects ‹$file› to be substituted for a single filename. However, if the value is ‹foo bar›, the command will be expanded to ‹cat foo bar› and execute the program ‹cat› with arguments ‹foo› and ‹bar›. Quoting can be used to prevent this from happening.
Consider the example on the slide above: the first command, ‹ls $foo› will expand into ‹ls hello world› and execute with:

    argv[ 0 ] = "ls"
    argv[ 1 ] = "hello"
    argv[ 2 ] = "world"

In effect, like with the ‹cat› example, it will be looking for two separate files. The latter, ‹ls "$foo"›, will be executed as:

    argv[ 0 ] = "ls"
    argv[ 1 ] = "hello world"

│ Special Variables
│
│ • ‹$?› is the result of last command
│ • ‹$$› is the PID of the current shell
│ • ‹$1› through ‹$9› are positional parameters
│   ◦ ‹$#› is the number of parameters
│ • ‹$0› is the name of the shell – ‹argv[0]›

Besides variables (parameters) that the users set themselves, the shell provides a few ‘special’ variables which contain useful information. Positional parameters refer to the command-line arguments given to the currently executing shell script. Here are a few more variables:

 • ‹$@› expands to all positional parameters, with special behaviour when double-quoted (each parameter is quoted separately),
 • ‹$*› same, but without the special quoting behaviour,
 • ‹$!› the PID of the last ‘background process’ (created with the ‹&› operator which will be discussed later),
 • ‹$-› shell options.

│ Environment
│
│ • is «like» shell variables but not the same
│ • the environment is passed to «all» executed «programs»
│   ◦ a child cannot modify environment of its parent
│ • variables are moved into the environment by ‹export›
│ • environment variables often act as «settings»

POSIX has a concept of «environment variables», which are independent of any shell: they are passed around from process to process, both across ‹fork› and across ‹exec›. However, since ‹fork› makes a new copy of the entire environment, changes in those variables can only be passed down (to «new» child processes), never up (to parent processes), nor to already-running processes in general.
Despite being formally independent of shell, environment variables have similar semantics: their names are alphanumeric strings and their content is arbitrary text. To further add to the confusion, shells treat environment variables in the same way they treat their ‘internal’ variables (parameters). If ‹FOO› is an environment variable, a shell will replace ‹$FOO› by its value, and executing ‹FOO=bar› as a shell command will change its value in the main shell process (and hence all of its future child processes).

│ Important Environment Variables
│
│ • ‹$PATH› tells the system where to find programs
│ • ‹$HOME› is the home directory of the current user
│ • ‹$EDITOR› and ‹$VISUAL› set which text editor to use
│ • ‹$EMAIL› is the email address of the current user
│ • ‹$PWD› is the current working directory

By convention, environment variables are named in all-uppercase. There are a few ‘well-known’ variables which affect the behaviour of various programs: the ‹PATH› variable gives a list of directories in which to look for executables (when executing commands in a shell, but also when invoking programs by name from other programs). The ‹HOME› variable tells programs where to store per-user files (both data and configuration), and so on.

Some are set by the system when creating the user session (‹HOME›, ‹LOGNAME›), others are set by the shell (‹PWD›), some are normally configured by the system administrator (but can be changed by users), like ‹PATH›, yet others are configured by the user (‹EDITOR›, ‹EMAIL›).

│ Globbing
│
│ • patterns for quickly «listing» multiple «files»
│ • e.g. ‹ls *.c› shows all files ending in ‹.c›
│ • ‹*› matches any number of characters
│ • ‹?› matches one arbitrary character
│ • works on entire «paths» – ‹ls src/*/*.c›

Let us get back to shell and its syntax. Since files are ubiquitous and many commands expect file names as arguments, shells provide special constructs for working with them.
One of those is «globbing», where a single «pattern» can replace a possibly long list of file names (and hence saves a lot of tedious typing). Glob expansion is done by the shell itself, i.e. the program receives individual file names as arguments, not the glob. Quotes (both single and double) prevent glob expansion (useful to pass strings which contain ‹*› or ‹?› as arguments). Unquoted strings containing any of the glob ‘meta-characters’ are treated (and expanded) as globs, including in results of parameter expansion (substitution).

│ Conditionals
│
│ • allows «conditional execution» of commands
│ • ‹if cond; then cmd1; else cmd2; fi›
│ • also ‹elif cond2; then cmd3; fi›
│ • ‹cond› is also a command (the exit code is used)

The most basic of all control flow constructs is «conditional execution», where a command is executed or skipped based on the outcome of a previous command. Shells use the traditional ‹if› keyword, optionally followed by ‹elif› and ‹else› clauses.

Unlike most programming languages, ‹cond› is not an expression, but a regular command. If the command ‘succeeds’ (terminates with exit code 0), this is interpreted as ‘true’ and the ‹then› branch is taken. Otherwise, the ‹elif› branches are evaluated in turn (if present) and if none succeed, the ‹else› branch (again, if present) is executed.

│ ‹test› (evaluating boolean expressions)
│
│ • originally an «external program», also known as ‹[›
│   ◦ nowadays «built-in» in most shells
│   ◦ works around lack of expressions in shell
│ • evaluates its arguments and returns ‹true› or ‹false›
│   ◦ can be used with ‹if› and ‹while› constructs

While the condition of an ‹if› statement (command) is a command, it would often be convenient to be able to specify expressions which relate variables to each other, or which check for presence of files. To this end, POSIX specifies a special program called ‹test› (actually built into most shells).
The ‹test› command receives arguments like any other command, evaluates them to obtain a boolean value and sets its exit code based on this value, so that ‹if test …; then …› behaves as expected.

│ ‹test› Examples
│
│ • ‹test file1 -nt file2› → ‘nt’ = newer than
│ • ‹test 32 -gt 14› → ‘gt’ = greater than
│ • ‹test foo = bar› → string equality
│ • combines with variable substitution (‹test $y = x›)

There are 3 classes of predicates provided by ‹test›:

1. existence and properties of files,
2. integer comparisons, and
3. string comparisons.

The latter two mimic what ‘normal’ programming languages provide (albeit with odd syntax). The first makes it easy and convenient to write commands that execute only if a particular file exists (or is missing), a very common task in shell programming.

│ Loops
│
│ • ‹while cond; do cmd; done›
│   ◦ ‹cond› is a command, like in ‹if›
│ • ‹for i in 1 2 3 4; do cmd; done›
│   ◦ allows globs: ‹for f in *.c; do cmd; done›
│   ◦ also command substitution
│   ◦ ‹for f in $(seq 1 10); do cmd; done›

After conditional execution, loops are the next most fundamental construct. Again, like in general-purpose programming languages, loops allow shell scripts to repeat a sequence of commands, either:

1. until a particular command fails (a ‹while› loop, the command in question often being ‹test›, though of course it can be any command),
2. once for each value in a list, often of file names (which can be in turn constructed by using globs).

Another common form of the ‹for› loop uses «command substitution» (command expansion) to generate the list. An oft-used helper in this context is the (sadly, non-standard) ‹seq› utility, which generates sequences of numbers. A similar (and likewise non-standard) utility called ‹jot› is available on BSD systems.
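Putting ‹test› and the two loop forms together, a short sketch (the values and word list are invented for the example):

```shell
# A 'while' loop driven by 'test': the body repeats as long as the
# condition command succeeds (exits with code 0).
i=1
while test "$i" -le 3; do
    echo "iteration $i"    # prints: iteration 1, iteration 2, iteration 3
    i=$((i + 1))
done

# A 'for' loop iterates over a fixed list of words, one per pass.
for word in alpha beta gamma; do
    echo "word: $word"
done
```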
│ Case Analysis
│
│ • selects a command based on «pattern matching»
│ • ‹case $x in *.c) cc $x;; *) ls $x;; esac›
│   ◦ yes, ‹case› really uses unbalanced parens
│   ◦ the ‹;;› indicates end of a case

A slightly more advanced control flow construct is «case analysis», which allows the use of glob-like pattern matching on arbitrary strings (i.e. not just filenames). The string to match against is given after ‹case›, and is usually a result of parameter or command expansion. Note that the patterns after the ‹in› clause of the ‹case› statement are not glob-expanded into a list of filenames.

│ Command Chaining
│
│ • ‹;› (semicolon): run two commands in sequence
│ • ‹&&› run the second command «if» the first succeeded
│ • ‹||› run the second command «if» the first failed
│ • e.g. compile and run: ‹cc file.c && ./a.out›

While the straightforward command chaining operator ‹;› (semicolon) is perhaps too banal to call control flow, there are a few similar operators that are more interesting. The first set is the boolean combinators ‹&&› and ‹||› which essentially function like a short-hand syntax for ‹if› statements. Since commands combined with ‹&&› and ‹||› are again commands, these can appear in the condition clause of an ‹if› or a ‹while› statement.

However, they are also useful standalone, and also in interactive mode. Especially ‹&&› can be used to type a sequence of commands that stops on the first failure, significantly cutting down on interaction latency (where the user waits for each command to complete, and after each command, the computer waits for the user to type in the next command).
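The ‹case› construct and the ‹&&›/‹||› combinators in action (all file names here are made up for the example):

```shell
# Pattern matching with 'case': the patterns are glob-like, but are
# matched against the string itself, not expanded against the filesystem.
file="main.c"
case "$file" in
    *.c)  echo "C source" ;;        # this branch is taken
    *.sh) echo "shell script" ;;
    *)    echo "something else" ;;
esac

# Short-circuit combinators: whether the second command runs depends
# on the exit code of the first.
test -e /nonexistent || echo "no such file"   # prints: no such file
mkdir -p scratch && echo "directory ready"    # prints: directory ready
```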
│ Pipes
│
│ • shells can run «pipelines» of commands
│ • ‹cmd1 | cmd2 | cmd3›
│   ◦ all commands are run «in parallel»
│   ◦ output of ‹cmd1› becomes input of ‹cmd2›
│   ◦ output of ‹cmd2› is processed by ‹cmd3›
│
│     echo hello world | sed -e s,hello,goodbye,

Perhaps the most powerful feature of shells is the «pipe», which offers a very flexible and powerful (even if very simple) way to combine multiple commands. The pipe operator causes both commands to be executed in parallel, and anything that the first program writes to its standard output is sent to the second program on its standard input.

POSIX specifies a considerable number of utility programs specifically designed to work well in such pipelines, and many more are available as vendor-specific extensions or in 3rd-party software packages.

│ Functions
│
│ • you can also define «functions» in shell
│ • mostly a light-weight «alternative» to «scripts»
│   ◦ no need to ‹export› variables
│   ◦ but cannot be invoked by non-shell programs
│ • functions can also «set» variables

Recall that the environment is only passed down, never back up. This means that a shell script setting a variable will not affect the parent shell. However, in functions (and when scripts are invoked using ‹.›), variables can be set and the effect of such changes is visible in the script that invoked the function.

## The Command Line

While in some sense the interactive aspects of shells are much more immediately important to users, they are not as theoretically interesting. You can learn more about the interactive shell by simply using it and discovering its features as you go and as you find them useful. That said, we will do a quick tour of the basic features most contemporary shells provide to make interactive use comfortable and efficient.
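Before the tour, one more sketch tying together the pipes and functions discussed above (the function names ‹shout› and ‹bump› are invented for the example):

```shell
# A shell function used as a pipeline stage, and a function setting a
# variable in the invoking shell.
shout() {
    tr '[:lower:]' '[:upper:]'     # reads stdin, writes stdout
}
echo "hello world" | shout         # prints: HELLO WORLD

count=0
bump() { count=$((count + 1)); }   # no new process: can set variables
bump; bump
echo "$count"                      # prints: 2
```

Note that ‹bump› would be useless as a standalone script: run as a child process, its assignment to ‹count› could never reach the parent shell.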
│ Interactive Shell
│
│ • the shell displays a «prompt» and waits
│ • the user «types» in a «command» and hits enter
│ • the command is «executed» immediately
│ • «output» is printed to the «terminal»

The interactive mode is characterised by a prompt–response cycle, where the shell prompts the user for a command, the user types in a command, confirms it, and the shell executes it in response to the confirmation. The standard output and standard error output (descriptors 1 and 2) and the standard input (descriptor 0) are connected to the terminal, i.e. to the display and the keyboard of the user.

│ Command Completion
│
│ • most shells let you use TAB to «auto-complete»
│   ◦ works at least for command names and file names
│   ◦ but “smart completion” is common
│ • interactive history: hit ‘up’ to recall a command
│   ◦ also interactive history search, e.g. ‹^R› in ‹bash›

During interactive use, a significant portion of time is spent typing in commands. Hence, shells try quite hard to reduce the effort needed to input these commands. One of the early, and very efficient, features in this direction is ‘tab completion’, where:

1. the user types in a portion of a command name or a file name and hits tab,
2. the shell looks up all possible commands or file names with the given prefix,
3. if there is only one option, it completes the name, otherwise it offers a list, which the user may cycle through, or type in more letters to make the prefix unique.

This saves time in two different ways: first, it is often faster to hit tab than to type in the remaining characters, and second, the user does not need to type in an extra command to list files, or find the exact name of the command. Besides command names and file names, many shells offer ‘smart completion’ which can complete arguments in a context-sensitive way, i.e. depending on the prefix of the command being written.
For instance, typing ‹ifc^I ^I› (the TAB character is sometimes spelled as ‹^I›) might complete first the command to ‹ifconfig› (for configuring network interfaces) and then offer a list of devices available in the particular computer.

The other major feature which saves typing is interactive history: when the user types a command, that command is saved in a ‘history file’. The last few commands can be easily recalled by simply hitting the up arrow, while it's also possible to interactively search the history using keywords. The logic is that, for a longer command, editing the existing command can be much faster than typing it out again in its entirety.

│ Prompt
│
│ • the string printed when shell «expects a command»
│ • controlled by the ‹PS1› environment variable
│ • usually shows your «username» and the «hostname»
│ • or working «directory», battery status, time, weather, ...

An important tool that helps the user orient themselves is the «prompt», which primarily serves to indicate that the shell is ready to accept a command. The secondary function of the prompt is to give the user some basic information: usually, the host name of the computer (it is very easy to use shells remotely), the login they are working under, and the current working directory are present.

What is printed can be customized, and many shells can run arbitrary commands to compute the prompt to print. In that case, the prompt can include anything that fits on the line, including the current time, battery status, the exit code of the last command, the current weather, the active ‹git› branch, current CPU or memory utilization, and so on and so forth.

│ Job Control
│
│ • only one program can run in the «foreground» (terminal)
│ • but a running program can be «suspended» (‹C-z›)
│ • and «resumed» in background (‹bg›) or in foreground (‹fg›)
│ • use ‹&› to run a command in background: ‹./spambot &›

While «job control» is not essential on modern systems, it can be occasionally useful.
The original motivation was that typically, a user only had a single terminal with a single screen, and hence could only run a single command at a time: the shell would be unavailable until the program terminated (since the standard IO of the program would be connected to the terminal).

To improve the situation, shells allow programs to be executed in the background, and continue interacting with the user while the program runs. Job control then allows the user to recall background programs into the foreground, suspend the foreground program, and so on. Today, it's usually not a problem to open as many terminals as the user wants.

│ Terminal
│
│ • can «print text» and read text from a «keyboard»
│ • normally everything is printed on the last line
│ • the text could contain «escape» (control) sequences
│   ◦ for printing colourful text or clearing the screen
│   ◦ also for printing text at a «specific coordinate»

The terminal itself is a key part of the interaction, though it is not part of the shell itself. Instead, the shell uses the terminal to do its input and output, like any other text-oriented program. While terminals used to be hardware devices, these days it's much more common to use ‘terminal emulators’, programs which behave like a traditional hardware terminal, but simply draw the content of the screen into a window.

In normal use of a terminal, older text scrolls upwards automatically: this is the mode used with a typical shell. Full-screen terminal applications, on the other hand, do not rely on this scrollback behaviour and instead print text at specific coordinates. This is achieved by printing special ‘escape sequences’ to the terminal, which are not printed as literal text, but instead encode instructions for the terminal, like moving the cursor around, or printing coloured text.
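A minimal illustration of such escape sequences (the colour codes below are standard ANSI/ECMA-48 sequences, understood by practically all terminal emulators):

```shell
# \033 is the ESC byte (in octal); the terminal interprets the whole
# sequence instead of printing it. '[31m' switches the foreground
# colour to red, '[0m' resets all attributes.
printf '\033[31mthis is red\033[0m\n'
# Cursor positioning uses the same mechanism: move to row 1, column 1.
printf '\033[1;1H'
```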
│ Full-Screen Terminal Apps
│
│ • applications can use the «entire» terminal «screen»
│ • a library abstracts away the low-level «control sequences»
│ ◦ the library is called ‹ncurses› for ‹new curses›
│ ◦ different terminals use different control sequences

Terminals are ‘character cell’ devices: the screen is divided into a non-overlapping grid of cells and each cell can display a single character or symbol. Terminals typically allow applications to disable automatic scrolling and then put characters anywhere on the screen: these capabilities make it possible to use the screen in an application-specific way. For instance, a text editor can display a section of the file being edited, and allow the user to move both up and down in the file, as they see fit. Clearly, such usage does not fit the model where text is printed only on the last line and scrolls up automatically when the line fills or a newline is printed.

Historically, different terminals used different escape sequences for the same (or related) feature. The features were also subtly different from vendor to vendor and even between different terminal models. For this reason, a library called ‹ncurses› translates high-level commands (put a red ‘a’ at given coordinates) into the low-level sequences appropriate for the terminal the application is currently running on.

│ UNIX Text Editors
│
│ • ‹sed› – stream editor, non-interactive
│ • ‹ed› – line oriented, interactive
│ • ‹vi› – visual, screen oriented
│ • ‹ex› – line-oriented mode of ‹vi›

A typical example of a full-screen terminal program is a text editor. However, this was not always so: the first commonly used ‘screen oriented’ text editor was ‹vi›.¹ An earlier editor, ‹ed›, was command-based, and to see a portion of the file, the user would have to type in a command to that effect.

¹ There are a couple of screen-oriented editors which predate ‹vi›, though none of them survived the specific hardware and operating system they were written for.
On the other hand, ‹vi› clones are still in common use today.

│ TUI: Text User Interface
│
│ • special characters exist to draw «frames» and «separators»
│ • the program draws a «2D interface» on a terminal
│ • these types of interfaces can be quite comfortable
│ • they are often «easier to program» than GUIs
│ • very low bandwidth requirements for «remote use»

Using a special character set (and a special font), it is possible to draw simple graphics (rectangular frames) on the terminal. Full-screen programs which make use of such features are halfway to GUIs, and often offer menus, forms with text fields, checkboxes or buttons, dialog windows and other elements commonly seen in graphical programs.

## Graphical Interfaces

Of course, modern operating systems¹ offer «graphical» user interfaces, based on a grid of millions of tiny pixels, instead of large character cells. Besides a keyboard for entering text, pointing devices (mice, touchpads, touchscreens, …) are ubiquitous. Pixel-based display devices can display arbitrary pictures, though user interfaces traditionally stick with simple, rectangular shapes.

¹ At least those that offer some level of support for running on general-purpose end-user devices like desktops, laptops, tablets or smartphones.

│ Windowing Systems
│
│ • each application runs in its «own window»
│ ◦ or possibly multiple windows
│ • «multiple applications» can be shown on screen
│ • windows can be moved around, resized &c.
│ ◦ facilitated by frames around window content
│ ◦ generally known as «window management»

The central paradigm of earlier GUI systems was that of a «window», invented at Xerox PARC in the 70s and adopted into mainstream systems by the likes of Apple, Microsoft and Sun in the 80s. The system can display multiple applications at a time, each restricted to a window: a rectangular area of the screen that can be moved around, resized, and that can overlap with the windows of other applications.
│ Window-less Systems
│
│ • especially popular on «smaller screens»
│ • applications take the entire screen
│ ◦ give or take status or control widgets
│ • «task switching» via a dedicated screen

While window-based systems dominated the computing world in the 90s and early 2000s, this started to change in 2007, with the arrival of the first iPhone. While it was of course neither the first smartphone nor the first small-screen device, it had an outsized impact on the computing landscape. The small screen of the iPhone made windowing impractical and essentially revived the ‘one application at a time’ mode of operation.

Of course, the underlying operating system was fully capable of multitasking, and the computer did run a number of background tasks. The user interface provides ways to interact with those tasks (e.g. notifications), giving it aspects of both single- and multi-tasking environments. The paradigm is now in common use on tablet computers and smartphones of all major manufacturers.

│ A GUI Stack
│
│ • graphics card «driver», mode setting
│ • «drawing»/painting (usually hardware-accelerated)
│ • multiplexing (e.g. using windows)
│ • «widgets»: buttons, labels, lists, ...
│ • «layout»: what goes where on the screen

Displaying a character cell grid is a fairly simple affair: each letter and symbol that can be displayed is given a bitmap with the size of the grid cell. The picture on the screen is then assembled from a non-overlapping grid of these small rectangular bitmaps. The graphical stack, on the other hand, is much more complex: while the underlying concept (small coloured rectangles, i.e. pixels) is arguably simpler than that of character cells, the process of building useful pictures out of them is a lot more involved.

│ Well-known GUI Stacks
│
│ • Windows
│ • macOS, iOS
│ • X11
│ • Wayland
│ • Android

│ Portability
│
│ • GUI ‘toolkits’ make «portability» easy
│ ◦ Qt, GTK, Swing, HTML5+CSS, ...
│ ◦ many of them run on «all major platforms»
│ • «code» portability is not the only issue
│ ◦ GUIs come with «look and feel» guidelines
│ ◦ portable applications may «fail to fit»

Different GUI stacks provide different APIs, different abstractions and different capabilities. Since software portability is also desired in GUI applications, programmers often use a «toolkit», which sits on top of the GUI stack and provides a uniform abstraction over it. This way, the application can run on different GUI stacks with a simple rebuild. However, there is a price: toolkits can become very complicated (hundreds of thousands of lines of code; in the case of the web stack, currently running into many millions).

│ Text Rendering
│
│ • a surprisingly «complex» task
│ • unlike terminals, GUIs use variable pitch fonts
│ ◦ brings up issues like «kerning»
│ ◦ hard to predict «pixel width» of a line
│ • bad interaction with «printing» (cf. WYSIWYG)

│ Bitmap Fonts
│
│ • characters are represented as «pixel arrays»
│ ◦ usually just black and white
│ • traditionally pixel-drawn «by hand»
│ ◦ very time consuming (many letters, sizes, variants)
│ • the result is «sharp» but «jagged» (not smooth)

│ Outline Fonts
│
│ • Type1, TrueType – based on «splines»
│ • they can be «scaled» to arbitrary pixel sizes
│ • same font can be used for «screen» and for «print»
│ • rasterisation is usually done in «software»

│ Hinting, Anti-Aliasing
│
│ • screens are «low resolution» devices
│ ◦ typical HD displays have DPI around 100
│ ◦ laser printers have DPI of 300 or more
│ • «hinting»: deform outlines to better fit a pixel grid
│ • «anti-aliasing»: smooth outlines using grayscale

│ X11 (X Window System)
│
│ • a traditional UNIX windowing system
│ • provides a C API (‹xlib›)
│ • built-in «network transparency» (socket-based)
│ • core protocol version 11 from 1987

│ X11 Architecture
│
│ • X «server» provides graphics and input
│ • X «client» is an application that uses X
│ • a «window manager» is a (special)
client
│ • a «compositor» is another special client

│ Remote Displays
│
│ • «application» is running on computer A
│ • the display is «not» the console of A
│ ◦ could be a dedicated «graphical terminal»
│ ◦ could be another «computer» on a LAN
│ ◦ or even across the internet

│ Remote Display Protocols
│
│ • one approach is «pushing pixels»
│ ◦ VNC (Virtual Network Computing)
│ • X11 uses a custom «drawing» protocol
│ • others use «high-level» abstractions
│ ◦ NeWS (PostScript-based)
│ ◦ HTML5 + JavaScript

│ VNC (Virtual Network Computing)
│
│ • sends «compressed pixel data» over the wire
│ ◦ can leverage regularities in pixel data
│ ◦ can send «incremental updates»
│ • and «input events» in the other direction
│ • no support for «peripherals» or file sync

Basically the only virtue of VNC is its simplicity. Security is an afterthought, and the various security extensions are not very compatible across implementations. Since the protocol mainly pushes (compressed) pixels, it is also not well suited to low-bandwidth, high-latency networks (i.e. the Internet).

│ RDP (Remote Desktop Protocol)
│
│ • more sophisticated than VNC (but proprietary)
│ • can also send «drawing commands» over the wire
│ ◦ like X11, but using DirectX drawing
│ ◦ also allows remote «OpenGL»
│ • support for audio, remote USB &c.

RDP is primarily based on the pixel-pushing paradigm, but there are a number of extensions that allow sending high-level rendering commands for local, hardware-accelerated processing. In some setups, this includes remote accelerated OpenGL and/or Direct3D.

│ SPICE
│
│ • Simple Protocol for Independent Computing Env.
│ • open protocol somewhere between VNC and RDP │ • can send OpenGL (but only over a «local socket») │ • two-way «audio», USB, «clipboard» integration │ • still mainly based on «pushing» (compressed) «pixels» │ Remote Desktop Security │ │ • the user needs to be «authenticated» over network │ ◦ passwords are easy, biometric data less so │ • the data stream should be «encrypted» │ ◦ not part of the X11 or NeWS protocols │ ◦ or even HTTP by default (used for HTML5/JS) For instance, RDP in Windows 10 does not support fingerprint logins (it was supported on earlier versions, but was disabled due to security flaws). │ Review Questions │ │ 33. What is a shell? │ 34. What does variable substitution mean? │ 35. What is an environment variable? │ 36. What belongs into the GUI stack?