# Input and output

Demonstrations:

 1. ‹files› – opening files, reading and writing strings,
 2. ‹streams› – from values to strings and back,
 3. ‹format› – overloading formatting operators.

Elementary exercises:

 1. ‹force› – input and output operators.

Preparatory exercises:

 1. ‹set› – read and write sets of numbers,
 2. ‹fixnum› – fixed point numbers with formatted IO,
 3. ‹tmpfile› – an auto-erasing temporary file,
 4. ‹parse› – parser for a simple low-level language,
 5. ‹grep› – print matching lines,
 6. ‹csv› – parse comma-separated numeric data.

Regular exercises:

 1. ‹xxx›
 2. ‹xxx›
 3. ‹xxx›
 4. ‹xxx›
 5. ‹json› – format a string → string map as JSON,
 6. ‹cpp› † – a very simple C preprocessor.

## d. Demonstrations

### 1. [‹files›]

This example will be brief: we will show how to open a file for
reading and fetch a line of text. We will then write that line of
text into a new file and read it back again to check that things
worked.

We will split up the example into functions for two reasons: first,
to make it easier to follow, and second, to take advantage of RAII:
the file streams will close the underlying resource when they are
destroyed. In this case, that will be at the end of each function.

    #include <cassert>
    #include <fstream>
    #include <string>

    std::string read( const char *file )
    {

The default method of doing IO in C++ is through «streams». Reading
files is done through a stream of type ‹std::ifstream›, which is
short for «input file stream». The constructor of ‹ifstream› takes
the name of the file to open. We will use a file given to us by the
caller.

        std::ifstream f( file );

The simplest method to read text from a file is using
‹std::getline›, which will fetch a single line at a time, into an
‹std::string›. We need to prepare the string in advance, since it is
passed into ‹std::getline› as an output argument.

        std::string line;

The ‹std::getline› function returns a reference to the stream that was passed to it.
Additionally, the stream can be converted to ‹bool› to find out
whether everything is okay with it. If the reading fails for any
reason, it will evaluate to ‹false›. The newline character is
discarded.

        if ( !std::getline( f, line ) )

In real code, we would of course want to handle errors, because
opening files is something that can fail for a number of reasons.
Here, we simply assume that everything worked and abort the program
if it did not.

            assert( false );

        return line;
    }

Next comes a function which demonstrates writing into files.

    void write( const char *file, std::string line )
    {

To write data into a file, we can use ‹std::ofstream›, which is
short for «output file stream». The output file is created if it
does not exist.

        std::ofstream f( file );

Writing into a file is typically done using operators for «formatted
output». We will look at those in more detail in the next section.
For now, all we need to know is that writing an object into a stream
is done like this:

        f << line;

We will also want to add the newline character that ‹getline› above
chomped. We have two options: either use the ‹"\n"› string literal,
or ‹std::endl›, a so-called «stream manipulator» which sends a
newline character and then flushes the stream, i.e. asks it to send
the buffered bytes to the operating system. Let's try the more
idiomatic approach, with the manipulator:

        f << std::endl;

At the end of the function, the stream is destroyed: the file is
automatically closed and any outstanding data is sent to the
operating system.

    }

    int main() /* demo */
    {

We first use ‹read› to get the first line of an arbitrary file.

        std::string line = read( "zz.include.txt" );

And we check that the line we got is what we expect. Remember the
stripped newline.

        assert( line == "#ifdef foo" );

Now we write the line into another file. After you run this example,
you can inspect ‹d5_files.out› with an editor. It should contain a
copy of the first line of ‹zz.include.txt›.

        write( "d5_files.out", line );

Finally, we use ‹read› again to read ‹d5_files.out› back, and check
that the same thing came back.

        std::string check = read( "d5_files.out" );
        assert( check == line );
    }

┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄

### 2. [‹streams›]

File streams are not the only kind of IO streams that are available
in the standard library. There are 3 ‘special’ streams, called
‹std::cout›, ‹std::cerr› and ‹std::cin›. Those are not types, but
rather global variables, and represent the standard output, the
standard error output and the standard input of the program. The
first two are instances of ‹std::ostream› and the third is an
instance of ‹std::istream›.

We don't know about class inheritance yet, but it is probably not a
huge stretch to understand that instances of ‹std::ofstream› (output
«file» stream) are also at the same time instances of ‹std::ostream›
(general output stream). The same story holds for ‹std::ifstream›
(input file stream) and ‹std::istream› (general input stream).

There is another pair of classes: ‹std::ostringstream› and
‹std::istringstream›. Those streams are not attached to OS
resources, but to instances of ‹std::string›: in other words, when
you write to an ‹ostringstream›, the resulting bytes are not sent to
the operating system, but are instead appended to the given string.
Likewise, when you read from an ‹istringstream›, the data is not
pulled from the operating system, but instead comes from an
‹std::string›. Hopefully, you can see the correspondence between
files (the content of which is a byte sequence stored on disk) and
strings (the content of which is a byte sequence stored in RAM).

In any case, string streams are ideal for playing around, because we
can use the same tools as we always do: create some simple
instances, apply operations and use ‹assert› to check that the
results are what we expect. String-based streams are defined in the
header ‹sstream›.
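
To make the correspondence between output and input string streams
more concrete, here is a minimal sketch (it is not part of the
demonstration itself). It writes two values into an
‹std::ostringstream› and reads them back through an
‹std::istringstream›. The helper name ‹roundtrip› and its parameters
are assumptions chosen purely for illustration.

```cpp
#include <sstream>
#include <string>

/* Hypothetical helper, not part of the original demo: formats a word
 * and a number into a string and then parses them back again. */
bool roundtrip( const std::string &word, int number )
{
    std::ostringstream out;
    out << word << " " << number;       /* the bytes end up in a string */

    std::istringstream in( out.str() ); /* read from a copy of that string */
    std::string w;
    int n = 0;
    in >> w >> n;

    /* the read must have succeeded and must give back what we wrote */
    return !in.fail() && w == word && n == number;
}
```

With this sketch, ‹roundtrip( "answer", 42 )› evaluates to ‹true›. A
word which contains whitespace would not survive the trip, because
the formatted input operator for strings only extracts a single
word.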

Everything that we will do with string streams applies to other
types of streams too (i.e. the 3 special streams mentioned earlier,
and all file streams).

Like in the previous example, we will split up the demonstration
into a few functions, mainly to avoid confusion over variable names.
We will first demonstrate reading from streams. We have already seen
‹std::getline›, so let's start with that. It is probably noteworthy
that it works on any input stream, not just ‹std::ifstream›.

    #include <cassert>
    #include <cmath>
    #include <sstream>
    #include <string>

    void getline_1()
    {
        std::istringstream istr( "a string\nwith 2 lines\n" );
        std::string s;

        assert( std::getline( istr, s ) );
        assert( s == "a string" );
        assert( std::getline( istr, s ) );
        assert( s == "with 2 lines" );
        assert( !std::getline( istr, s ) );
        assert( s.empty() );
    }

We can also override the delimiter character for ‹std::getline›, to
extract delimited fields from input streams.

    void getline_2()
    {
        std::istringstream istr( "colon:separated fields" );
        std::string s;

        assert( std::getline( istr, s, ':' ) );
        assert( s == "colon" );
        assert( std::getline( istr, s, ':' ) );
        assert( s == "separated fields" );
        assert( !std::getline( istr, s, ':' ) );
    }

So far so good. Our other option is so-called «formatted input». The
standard library doesn't offer much in terms of ready-made overloads
for such inputs: there is one for strings, which extracts individual
words (like the ‹scanf› specifier ‹%s›, if you remember that from C,
but the C++ version is actually safe and it is okay to use it). Then
there is an overload for ‹char›, which extracts a single character
(by default, leading whitespace is skipped first, like with the
other formatted input operators), and a bunch of overloads for
various numeric types.

    void formatted_input()
    {
        std::istringstream istr( "integer 123 float 3.1415 s t" );
        std::string s, t;
        int i;
        float f;

        istr >> s; assert( s == "integer" );
        istr >> i; assert( i == 123 );
        istr >> s; assert( s == "float" );

Notice that ‹float› numbers are not very exact.
They are usually just 32 bits, which means 24 bits of precision,
which is a little over 7 decimal digits.

        istr >> f; assert( std::fabs( f - 3.1415 ) < 1e-7 );

The last thing we want to demonstrate with regards to the formatted
input operators is that we can «chain» them. The values are taken
from left to right (behind the scenes, this is achieved by the
formatted input operator returning a reference to its left operand).

        istr >> s >> t;
        assert( s == "s" && t == "t" );

When we reach the end of the stream (i.e. the end of the buffer, or
of the file), any further attempt to read will put the stream into
an error state. A stream in an error state converts to ‹false› in a
‹bool› context.

        assert( !( istr >> s ) );
    }

Output is actually quite a bit simpler than input. It is almost
always reasonable to use formatted output, since strings are simply
copied to the output without alterations.

    void formatted_output()
    {
        std::ostringstream a, b, c;
        a << "hello world";

To read the buffer associated with an output string stream, we use
its method ‹str›. Of course, this method is not available on other
stream types: in those cases, the characters are written to files or
to the terminal and we cannot access them through the stream
anymore.

        assert( a.str() == "hello world" );

Like with formatted input, output can be chained.

        b << 123 << " " << 3.1415;
        assert( b.str() == "123 3.1415" );

When writing delimited values to an output stream, it is often
desirable to only put the delimiter between items and not after each
item: this is an endless source of headaches. Here is a trick to do
it without too much typing:

        int i = 0;
        for ( int v : { 1, 2, 3 } )
            c << ( i++ ? ", " : "" ) << v;

        assert( c.str() == "1, 2, 3" );
    }

┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄

### 3. [‹format›]

We have seen the basics of input and output, and that formatted input and output is realized using operators.
Like many other operators in C++, those operators can be overloaded.
We will show how that works in this example.

We will revisit the ‹cartesian› class from last week, to represent
complex numbers in algebraic form, i.e. as a sum of a real and an
imaginary number. We do not care about arithmetic this time: we will
only implement a constructor and the formatted input and output
operators. We will, however, need equality so that we can write test
cases.

    #include <cassert>
    #include <cmath>
    #include <sstream>

    class cartesian
    {
        double real, imag;
    public:

We have seen default arguments before: those are used when no value
is supplied by the caller. This also allows instances to be
default-constructed.

        cartesian( double r = 0, double i = 0 )
            : real( r ), imag( i )
        {}

The comparison is fuzzy, due to the limited precision available in
‹double›.

        friend bool operator==( cartesian a, cartesian b )
        {
            return std::fabs( a.real - b.real ) < 1e-10 &&
                   std::fabs( a.imag - b.imag ) < 1e-10;
        }

Now the formatted output, which is a little easier than the input.
Since the first operand of this operator is «not» an instance of
‹cartesian›, the operator «cannot» be implemented as a method. It
must either be a function outside the class, or use the ‘friend
trick’. Since we will need to access private attributes in the
operator, we will use the ‹friend› syntax here. The return type and
the type of the first argument are pretty much given and are always
the same. You could consider them part of the syntax. The second
argument is an instance of our class (this would often be passed as
a ‹const› reference).

        friend std::ostream &operator<<( std::ostream &o, cartesian c )
        {

We will use ‹27.3±7.1*i› as the output format. We can use ‘simpler’
overloads of the ‹<<› operator to build up ours: this is a fairly
common practice. We write to the ‹ostream› instance given to us in
the argument. We must not forget to return that instance to our caller.

            o << c.real;
            if ( c.imag >= 0 )
                o << "+";
            return o << c.imag << "*i";
        }

The input operator is similar. It gets a reference to an
‹std::istream› as an argument (and has to pass it along in the
return value). The main difference is that the object into which we
read the data must be passed as a non-constant (i.e. mutable)
reference, since we need to change it.

        friend std::istream &operator>>( std::istream &i, cartesian &c )
        {

Like above, we will build up our implementation from simpler
overloads of the same operator (which all come from the standard
library). The formatted input operators for numbers do not require
that the number is followed by whitespace, but will stop at a
character which can no longer be part of the number. A ‹+› or ‹-›
character in the middle of the number qualifies.

            i >> c.real;

We will slightly abuse the flexibility of the formatted input
operator for ‹double› values: it accepts numbers starting with an
explicit ‹+› sign, hence we do not need to check the sign ourselves.
Just read the imaginary part.

            i >> c.imag;

We do need to deal with the trailing ‹*i› though.

            char ch;

When formatted input fails, it should set a ‹failbit› in the input
stream. This is how the ‹if ( stream >> value )› construct works.

            if ( !( i >> ch ) || ch != '*' ||
                 !( i >> ch ) || ch != 'i' )
                i.setstate( i.failbit );

And as mentioned above, we need to return a reference to the input
stream.

            return i;
        }
    };

    int main() /* demo */
    {
        std::ostringstream ostr;
        ostr << cartesian( 1, 1 );

We first check that the output behaves as we expected.

        assert( ostr.str() == "1+1*i" );

We write a few more complex numbers into the stream, using operator chaining.

        ostr << " " << cartesian( 3, 0 ) << " " << cartesian( 1, -1 )
             << " " << cartesian( 0, 0 );

        assert( ostr.str() == "1+1*i 3+0*i 1-1*i 0+0*i" );

We now construct an input stream from the string which we created
above, and check that the values can be read back.

        std::istringstream istr( ostr.str() );
        cartesian a, b, c;

Let's read back the first number and check that the result makes
sense.

        assert( istr >> a );
        assert( a == cartesian( 1, 1 ) );

We can also check that chaining works as expected, using the
remaining numbers in the string.

        assert( istr >> a >> b >> c );

        assert( a == cartesian( 3, 0 ) );
        assert( b == cartesian( 1, -1 ) );
        assert( c == cartesian( 0, 0 ) );

Finally, we want to demonstrate that trying to read an ill-formatted
complex number will fail. We could reuse ‹istr› by calling its ‹str›
method with a new buffer, but here we simply construct a fresh
stream for each case.

        std::istringstream bad1( "7+3*j" );
        assert( !( bad1 >> a ) );

        std::istringstream bad2( "7" );
        assert( !( bad2 >> a ) );
    }

## e. Elementary exercises

### 1. [‹force›]

This week in the physics department, we will deal with formatting
and parsing vectors (forces, just to avoid confusion with
‹std::vector›... for now).

The class will be called ‹force›, and it should have a constructor
which takes 3 values of type ‹double› and a default constructor
which constructs a 0 vector. In addition to that, it should have a
(fuzzy) comparison operator and formatting operators, both for input
and for output. Use the following format: ‹[F_x F_y F_z]›, that is,
a left square bracket, then the three components of the force
separated by spaces, and a closing square bracket. Do not forget to
set ‹failbit› in the input stream if the format does not match
expectations.

    class force;

## p. Preparatory exercises

### 1. [‹set›]

Implement a type ‹set› which represents a set of arbitrary integers,
with the following operations:

 • ‹add› – adds an element,
 • ‹has› – checks whether an element is present,
 • ‹size› – returns the number of elements.

Furthermore, it should be possible to read values of type ‹set› from
IO streams and to write them into IO streams. On output, sets will
take the following form:

    {}
    { 1 }
    { 1, 2 }

On input, also accept variants with a different amount of whitespace
(including none at all).

    struct set;

### 2. [‹fixnum›]

In this exercise, we return to the ‹fixnum› type from the previous
chapter. It is a type which represents fixed-point numbers,
specifically of the form ‹123456.78›, with 6 decimal digits before
and two after the decimal point, and with the following operations:

 • addition, subtraction and multiplication (operators ‹+›, ‹-› and
   ‹*›),
 • construction from an integer (zero by default),
 • copy assignment,
 • comparison of two values using the operators ‹==› and ‹!=›,
 • reading and writing numbers from and to IO streams.

All arithmetic operations should round towards zero, to the nearest
representable number.

    struct fixnum;

### 3. [‹tmpfile›]

We will implement a simple wrapper around ‹std::fstream› that will
act as a temporary file. When the object is destroyed, use
‹std::remove› to unlink the file. Make sure the stream is closed
before you unlink the file.

The ‹tmp_file› class should have the following interface:

 • a constructor which takes the name of the file,
 • method ‹write› which takes a string and replaces the content of
   the file with that string; this method should flush the data to
   the operating system (e.g. by closing the stream),
 • method ‹read› which returns the current content of the file,
 • method ‹stream› which returns a reference to an instance of
   ‹std::fstream› (i.e. suitable for both reading and writing).

Calling both ‹stream› and ‹write› on the same object is undefined
behaviour. The ‹read› method should return all data sent to the
file, including data written to ‹stream()› that was not yet flushed
by the user.

    class tmp_file;

┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄

### 4. [‹parse›]

Write a simple parser for an assembly-like language with one
instruction per line (each taking 2 operands, separated by spaces,
where the first is always a register and the second is either a
register or an ‘immediate’ number).

The opcodes (instructions) are: ‹add›, ‹mul›, ‹jnz›, the registers
are ‹rax›, ‹rbx› and ‹rcx›. The result is a vector of ‹instruction›
instances (see below). Set ‹r_2› to ‹reg::immediate› if the second
operand is a number.

If the input does not conform to the expected format, throw
‹no_parse›, which carries the line number of the first erroneous
instruction and the kind of error (see ‹enum error›), as public
attributes ‹line› and ‹type›, respectively. If multiple errors
appear on the same line, give the one that comes first in the
definition of ‹error›. You can add attributes or methods to the
structures below, but do not change the enumerations.

    enum class opcode { add, mul, jnz };
    enum class reg { rax, rbx, rcx, immediate };
    enum class error { bad_opcode, bad_register, bad_immediate,
                       bad_structure };

    struct instruction
    {
        opcode op;
        reg r_1, r_2;
        int32_t immediate;
    };

    struct no_parse
    {
        int line;
        error type;
    };

    std::vector< instruction > parse( const std::string & );

    #include

┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄

### 5. [‹grep›]

To practice working with IO streams a little, we will write two
simple functions which read lines from an input stream, process them
a little and possibly print them (or a part of them) into an output
stream.

The ‹grep› function checks, for every line of the input, whether it
matches a given ‹pattern› (i.e. the pattern is a substring of the
line) and if it does (and only if it does), copies the line to the
output stream.

    void grep( std::string pattern, std::istream &, std::ostream & );

The other function to add is called ‹cut› and it will process the
lines differently: it splits each line into fields separated by the
character ‹delim› and only prints the column given by ‹col›. Unlike
in the ‹cut› program, columns are indexed starting from 0. If there
are not enough columns on a given line, print an empty line.

    void cut( char delim, int col, std::istream &, std::ostream & );

### 6. [‹csv›]

In this exercise, we will deal with CSV files: we will implement a
class called ‹csv› which will read data from an input stream and
allow the user to access it using the indexing operator.

The following exception should be thrown in case of a format error.

    class bad_format;

The constructor should accept a reference to ‹std::istream› and the
expected number of columns. In the input, each line contains
integers separated by commas. The constructor should throw an
instance of ‹bad_format› if the number of columns does not match.

Additionally, if ‹x› is an instance of ‹csv›, then ‹x.at( 3, 1 )›
should return the value in the third row and first column.

    class csv;

## r. Regular exercises

### 5. [‹json›]

You are given a single-level string → string dictionary. Turn it
into a single string, using JSON as the format. Take care to escape
special characters – at least double quote and the escape character
(backslash) itself. In JSON, key order is not important – emit them
in iteration (alphabetic) order. Put a single space after each
‘element’: after the opening brace, after colons and after commas,
except if the input is empty, in which case the output should be
just ‹{}›.

    using str_dict = std::map< std::string, std::string >;

    std::string to_json( const str_dict &dict );

┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄

### 6. [‹cpp›]

Implement a (very simplified) C preprocessor which supports
‹#include "foo"› (without a search path, working directory only),
‹#define› without a value, ‹#undef›, ‹#ifdef› and ‹#endif›. The
input is provided in a file, but the output should be returned as a
string.

PS: Do not include line and filename information that ‹cpp› normally
adds to files.

    std::string cpp( const std::string &filename );

If you run this program with a parameter, it'll preprocess that file
and print the result to stdout. Feel free to experiment.

    int main( int argc, const char **argv )
    {
        if ( argc >= 2 )
            std::cout << cpp( argv[ 1 ] );
        else
        {
            std::string
                actual_1 = cpp( "zz.preproc_1.txt" ),
                expect_1 = "included foo\n"
                           "included bar\n"
                           "xoo\n"
                           "foo\n",
                actual_2 = cpp( "zz.preproc_2.txt" ),
                expect_2 = "included bar\n"
                           "included baz\n"
                           "included bar\n";

            assert( actual_1 == expect_1 );
            assert( actual_2 == expect_2 );
        }

        return 0;
    }