Safe Haskell: Safe

Du

Description

Third assignment for IB016, semester spring 2019

Implementing du

In this assignment you will implement a simplified version of the Unix utility du. This utility reports how much filesystem space is used by files and directories. You should create a standalone executable module. This time, only the outer specification is given; you have to do the functional decomposition yourself.

Note: Doing the functional decomposition is actually a part of your task. There is a separate submission folder (odevzdávárna) to which you are required to upload a file with function names and type signatures. See the interactive outline for details.

du usage / commandline options

Your program usage should be:

./du [options] [files-or-directories]

You should implement the following options:

-a, --all             write counts for all files, not just directories
-c, --total           produce a grand total
-d, --max-depth=N     print  the  total  for  a directory (or file, with --all)
                      only if it is N or fewer levels below the command line
                      argument;  --max-depth=0 is the same as --summarize
-h, --human-readable  print sizes in human readable format (e.g., 1.0KiB 234MiB 2GiB)
    --si              like -h, but use powers of 1000 not 1024 (e.g., 1.0kB, 245MB, 2.1GB)
-s, --summarize       display only a total for each argument
    --help            display help and exit

You don't need to handle invalid options, you can ignore them.

You may assume that the short -d option is always immediately followed by its argument and that the long --max-depth option is separated from its argument by a single equals sign ('='). E.g. -d0 -d10 --max-depth=0 --max-depth=10.

The combination of -s and -a is not valid and need not be handled. The same holds for --max-depth=0 combined with --all.
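One possible way to read the option rules above is sketched below. It is only an illustration, not the required decomposition: the names Units, Config, applyOption and parseArgs are hypothetical, -s is modelled as a maximum depth of 0 (as the option table suggests), and the default path is written as "." to match the example output.

```haskell
import Data.List (isPrefixOf, stripPrefix)
import Text.Read (readMaybe)

-- | How sizes are rendered (hypothetical type, not prescribed by the assignment).
data Units = Kibibytes | Binary | SI
  deriving (Eq, Show)

-- | Hypothetical record collecting the effect of all options.
data Config = Config
  { cfgAll      :: Bool
  , cfgTotal    :: Bool
  , cfgMaxDepth :: Maybe Int
  , cfgUnits    :: Units
  , cfgHelp     :: Bool
  } deriving (Eq, Show)

defaultConfig :: Config
defaultConfig = Config False False Nothing Kibibytes False

-- | Apply a single option; unknown or malformed options are ignored.
applyOption :: Config -> String -> Config
applyOption cfg opt
  | opt `elem` ["-a", "--all"]               = cfg { cfgAll = True }
  | opt `elem` ["-c", "--total"]             = cfg { cfgTotal = True }
  | opt `elem` ["-h", "--human-readable"]    = cfg { cfgUnits = Binary }
  | opt == "--si"                            = cfg { cfgUnits = SI }
  | opt `elem` ["-s", "--summarize"]         = cfg { cfgMaxDepth = Just 0 }
  | opt == "--help"                          = cfg { cfgHelp = True }
  | Just n <- stripPrefix "--max-depth=" opt = setDepth n
  | Just n <- stripPrefix "-d" opt           = setDepth n
  | otherwise                                = cfg
  where
    setDepth n = case readMaybe n of
      Just d | d >= 0 -> cfg { cfgMaxDepth = Just d }
      _               -> cfg

-- | Options precede all paths, so the leading run of dash-prefixed
-- arguments is treated as options and the rest as files or directories.
parseArgs :: [String] -> (Config, [FilePath])
parseArgs args = (foldl applyOption defaultConfig opts, targets)
  where
    (opts, paths) = span ("-" `isPrefixOf`) args
    targets       = if null paths then ["."] else paths
```

In this sketch later options override earlier ones, which is a design choice rather than something the specification demands.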

du behaviour

When du is run without any options, it prints the sizes of all its command line arguments: for files the size is printed directly, for directories the size is summarized recursively. By default, files inside directories are not printed. If any options are given, they precede all files and directories (you can take that for granted and don't have to check). If no files or directories are given, du should work with the current working directory (./).

By default, sizes are printed in kibibytes without the unit (1 KiB = 1024 B). With --human-readable (or -h), sizes are printed with an appropriate unit (using binary prefixes, https://en.wikipedia.org/wiki/Binary_prefix) such that the value is between 1 and 1023. If --si is given, sizes are handled similarly but using 1000-based SI prefixes (and the value should be between 1 and 999). If the value is less than 10, one decimal place is displayed; otherwise there are no decimals (though rounding can be arbitrary).
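The formatting rules above could be sketched roughly as follows. The names humanSize, kibibytes, binaryUnits and siUnits are hypothetical, the unit lists stop at tera for brevity, and the sketch truncates rather than rounds (the specification allows any rounding, so the reference output may differ in the last digit).

```haskell
import Text.Printf (printf)

-- Unit ladders for -h and --si (hypothetical names).
binaryUnits, siUnits :: [String]
binaryUnits = ["B", "KiB", "MiB", "GiB", "TiB"]
siUnits     = ["B", "kB", "MB", "GB", "TB"]

-- | Render a byte count with a unit: divide by the base (1024 or 1000)
-- until the value drops below it or the unit list runs out.
humanSize :: Double -> [String] -> Integer -> String
humanSize base units bytes = go (fromIntegral bytes) units
  where
    go :: Double -> [String] -> String
    go v (_ : us@(_ : _)) | v >= base = go (v / base) us
    go v (u : _)
      | v < 10    = printf "%.1f %s" (truncTo1 v) u
      | otherwise = printf "%d %s" (floor v :: Integer) u
    go v []       = printf "%d" (floor v :: Integer)

    -- Truncate to one decimal place instead of rounding.
    truncTo1 :: Double -> Double
    truncTo1 x = fromIntegral (floor (x * 10) :: Integer) / 10

-- | Default output: whole kibibytes without a unit.
kibibytes :: Integer -> String
kibibytes bytes = show (bytes `div` 1024)
```

For instance, humanSize 1024 binaryUnits applied to 1524 * 1024 bytes yields "1.4 MiB", and kibibytes applied to the same byte count yields "1524".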

All the other command line options should be handled according to usage given above. If --help is given, a short help summarizing options and usage should be displayed and all other options should be ignored.

Bonus (+3 points)

Optionally, as a bonus, you can also implement:

--exclude=PATTERN     exclude files and directories that match PATTERN

PATTERN is a shell pattern (not a regular expression): ? matches any one character and * matches any string (including an empty one). For example, ./du --exclude="*.swp" does not count any files that end in (or are in a subdirectory ending in) .swp. (You may use module System.FilePath.Glob. It supports somewhat richer patterns, but so does the actual du.)

The option may be given more than once. Files are excluded if they match any of the patterns. Command line arguments are subject to this option.
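If you prefer not to depend on System.FilePath.Glob, the two wildcards described above can be matched with a small hand-written function. The sketch below is one hedged possibility; the names matches and isExcluded are hypothetical, and it assumes the pattern is compared against a single file or directory name (the last path component), which is how the --exclude example in the Examples section appears to behave.

```haskell
-- | Match a shell-like pattern against a name:
-- '?' matches exactly one character, '*' matches any (possibly empty) string.
matches :: String -> String -> Bool
matches [] [] = True
matches ('*' : ps) s =
  -- Either '*' matches nothing, or it swallows one more character.
  matches ps s || case s of
    []       -> False
    (_ : cs) -> matches ('*' : ps) cs
matches ('?' : ps) (_ : cs) = matches ps cs
matches (p : ps) (c : cs)   = p == c && matches ps cs
matches _ _                 = False

-- | A name is excluded if it matches any of the given patterns.
isExcluded :: [String] -> String -> Bool
isExcluded patterns name = any (`matches` name) patterns
```

This backtracking matcher is exponential in the worst case for patterns with many stars, which is unlikely to matter for typical file names.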

Further notes

  • In basic execution (without -s), subdirectories are printed.
  • In case of an error (such as a permission error or a directory vanishing before it can be explored), the program should not stop but print an error message on stderr. You need to handle only IOException, and you may rely on it being an instance of Show.
  • You can ignore anything that is neither a file nor a directory (such as devices, symlinks, pipes, …).
  • You should NOT ignore hidden files (on Unix beginning with .).
  • The original Linux du calculates file sizes based on disk allocation, so sizes reported by hFileSize can differ (that is OK).
  • You can assume no files or directories are named as valid options (e.g. there is no file named '--all').
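The error-handling note above could be realised with try from Control.Exception. The following is a minimal sketch under that assumption; the helper name safely is hypothetical, and the message format merely imitates the examples (the Show instance of IOException usually already includes the file name).

```haskell
import Control.Exception (IOException, try)
import System.IO (hPutStrLn, stderr)

-- | Run an IO action. If it throws an IOException, report it on stderr
-- and return Nothing instead of aborting the whole program.
-- Other exception types are deliberately not caught.
safely :: IO a -> IO (Maybe a)
safely action = do
  result <- try action
  case result of
    Right x -> pure (Just x)
    Left e  -> do
      hPutStrLn stderr ("error: " ++ show (e :: IOException))
      pure Nothing
```

Wrapping each directory listing or file size query in such a helper lets the traversal skip the failing entry and continue with its siblings.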

Module and package constraints

You can use any modules from any packages, but all used packages (except base) must be noted in the header of this file next to your name and UČO.

Tips and tricks

  • For the recursive traversal, functions from System.Directory may be handy.
  • File size can be obtained by calling hFileSize from System.IO or alternatively by fileSize from System.Posix.Files. However, the latter is not multiplatform.
  • You can use Text.Printf for formatting.
  • Think twice before you start writing the code. Doing a proper functional decomposition will save you a lot of work/refactoring. Think of the functions you'll need, write their type signatures, submit them, and only then start programming.
  • You may use monoids for command line arguments processing, but you don't have to.
  • There are many useful general-purpose packages on Hackage, see for example the MissingH package. If there is something reasonably common you want, try to search Hackage first (but do not install everything just for a small function).
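To make the first two tips concrete, here is a hedged sketch of a recursive traversal built on System.Directory (from the directory package, which is not part of base) and hFileSize. The type Entry and the functions walk and visible are hypothetical names, not a prescribed decomposition. The sketch does no error handling, does not filter out symlinks, silently returns nothing for a nonexistent path, and joins paths with a plain "/" where </> from System.FilePath would be cleaner.

```haskell
import System.Directory (doesDirectoryExist, doesFileExist, listDirectory)
import System.IO (IOMode (ReadMode), hFileSize, withFile)

-- | Hypothetical result type: one line of potential output.
data Entry = Entry
  { entryPath  :: FilePath
  , entryDepth :: Int      -- levels below the command line argument
  , entrySize  :: Integer  -- size in bytes (recursive total for directories)
  , entryIsDir :: Bool
  } deriving (Eq, Show)

-- | Walk a path. Children are listed before their parent, so the last
-- entry of a non-empty result always describes the path itself.
walk :: Int -> FilePath -> IO [Entry]
walk depth path = do
  isDir  <- doesDirectoryExist path
  isFile <- doesFileExist path
  if isDir
    then do
      names    <- listDirectory path  -- includes hidden files, omits "." and ".."
      children <- mapM (\n -> walk (depth + 1) (path ++ "/" ++ n)) names
      let totals = [entrySize (last es) | es <- children, not (null es)]
      pure (concat children ++ [Entry path depth (sum totals) True])
    else if isFile
      then do
        size <- withFile path ReadMode hFileSize
        pure [Entry path depth size False]
      else pure []  -- neither a file nor a directory: ignored

-- | Decide whether an entry is printed, given --all and --max-depth.
-- Files named directly on the command line (depth 0) are always shown.
visible :: Bool -> Maybe Int -> Entry -> Bool
visible showFiles maxDepth e =
  (entryIsDir e || showFiles || entryDepth e == 0)
    && maybe True (entryDepth e <=) maxDepth
```

Keeping the traversal separate from the pure visibility filter means sizes are always summed over the full tree, while options such as -a, -s and -d only affect what is printed.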

Examples

The order of files and directories on the same level of the hierarchy is not relevant and can differ on your system. Also, the output in case of an error need not match literally.

$ ./du --help
usage: du [options] [files]
  -a, --all             write counts for all files, not just directories
      --si              like -h, but use powers of 1000 not 1024
  -h, --human-readable  print sizes in human readable format (e.g., 1K 234M 2G)
  -s, --summarize       display only a total for each argument
  -d, --max-depth       print  the  total  for  a directory (or file, with --all) only if it is N or fewer levels below the command line argument;  --max-depth=0 is the same as --summarize
  -c, --total           produce a grand total
      --help            display this help and exit

$ mkdir test; cd test
$ mkdir -p first/second third
$ dd if=/dev/zero of=a bs=1024 count=100 &> /dev/null
$ dd if=/dev/zero of=first/b bs=1024 count=200 &> /dev/null
$ dd if=/dev/zero of=first/c bs=1024 count=300 &> /dev/null
$ dd if=/dev/zero of=first/second/d bs=1024 count=1024 &> /dev/null

$ ../du
0  ./third
1024   ./first/second
1524   ./first
1624   .

$ ../du first third
1024   first/second
1524   first
0  third

$ ../du -c first third
1024   first/second
1524   first
0      third
1524   total

$ ../du first
1024      first/second
1524      first

$ ../du -s first
1524    first

$ ../du --summarize first
1524    first

$ ../du -h first
1.0 MiB   first/second
1.4 MiB   first

$ ../du --si first
1.1 MB    first/second
1.5 MB    first

$ ../du -h -s -c first a
1.4 MiB   first
100 KiB   a
1.5 MiB   total

$ ../du -a -h first
200 KiB   first/b
300 KiB   first/c
1.0 MiB   first/second/d
1.0 MiB   first/second
1.4 MiB   first

$ mkdir fourth && chmod -r fourth
$ ../du fourth first
error: fourth: getDirectoryContents: permission denied (Permission denied)
1024   first/second
1524   first

$ ../du fifth first
error: fifth: openFile: does not exist (No such file or directory)
1024   first/second
1524   first

$ ../du -d1
error: ./fourth: getDirectoryContents: permission denied (Permission denied)
0  ./third
1524   ./first
1624   .

$ ../du --max-depth=1 --human-readable
error: ./fourth: getDirectoryContents: permission denied (Permission denied)
0.0 B  ./third
1.4 MiB  ./first
1.5 MiB  .

$ ../du -a --exclude="*th*" --exclude="?eco??"
100	./a
300	./first/c
200	./first/b
500	./first
600	.

Documentation

main :: IO ()