{- |
Third assignment for IB016, semester spring 2017, 20 points (+bonus).

== Implementing du

In this assignment you will implement a simplified version of the unix utility
@du@. This utility can be used to detect filesystem usage by files and
directories. You should create a standalone executable module.

This time, only the outer specification is given; you have to do the
functional decomposition yourself.

=== du usage / commandline options (20 points)

Your program usage should be:

@
./du [options] [files-or-directories]
@

You should implement the following options:

@
-a, --all             write counts for all files, not just directories
    --si              like -h, but use powers of 1000 not 1024 (e.g., 1KB, 245MB, 2.1GB)
-h, --human-readable  print sizes in human readable format (e.g., 1KiB 234MiB 2GiB)
-s, --summarize       display only a total for each argument
-c, --total           produce a grand total
    --help            display help and exit
@

You don't need to handle invalid options, you can ignore them.

=== du behaviour

When @du@ is run without any options, it prints the sizes of all its
commandline arguments: for files the size is printed directly, for directories
their size is summarized recursively. By default, files inside directories are
not printed.

If any options are given, they precede all files and directories (you can take
that for granted and don't have to check). If no files or directories are
given, @du@ should work with the current working directory (./).

By default, sizes are printed in kibibytes without the unit (1 KiB = 1024 B).
With --human-readable (or -h) sizes are printed with an appropriate unit
(using binary prefixes) such that the value is between 1 and 1023. If --si is
given, sizes are handled similarly but using 1000-based SI prefixes (and the
value should be between 1 and 999). If the value is less than 10, one decimal
place is displayed, otherwise there are no decimals (though rounding can be
arbitrary).

All the other command line options should be handled according to the usage
given above.
If --help is given, a short help summarizing options and usage should be
displayed and all other options should be ignored.

=== Bonus (+5 points)

Optionally, as a bonus, you can also implement:

@
-d, --max-depth       print the total for a directory (or file, with --all)
                      only if it is N or fewer levels below the command
                      line argument; --max-depth=0 is the same as --summarize
@

In this case, you can assume that the option and its value are not separated
by a space: in the short version the number follows immediately, while in the
long version it is separated by @'='@:
@-d0 -d10 --max-depth=0 --max-depth=10@.
The combination of @-s@ and @-a@ is not valid and need not be handled; the
same holds for --max-depth=0 and --all.

==== Further notes

* In basic execution (without @-s@), subdirectories are printed.
* In case of error (such as a permission error or a directory vanishing before
  it can be explored) the program should not stop but print an error message
  (on 'stderr'); you should handle only 'IOException', and you can rely on it
  being an instance of 'Show'.
* You can ignore anything that is neither a file nor a directory (such as
  devices, symlinks, pipes, …).
* You should not ignore hidden files (on unix beginning with '.').
* The original linux @du@ calculates file sizes based on disk allocation;
  sizes reported by 'hFileSize' can differ (that is OK).
* You can assume no files or directories are named as valid options (e.g.
  there is no file named '--all').

== Module and package constraints

You can use any modules from any packages, but all used packages (except base)
have to be noted in the header of this file next to your name and UID. To get
the list of used packages on linux conveniently, you can use the following
command.

@
ghc .hs -n -hide-all-packages 2>&1 | grep package | sed 's/^[^‘]*‘\([^’@]*\).*/\1/' | sort | uniq
@

=== Tips and tricks

* For the recursive traversal, functions from @System.Directory@ may be handy.
* File size can be obtained by calling 'hFileSize' from @System.IO@ or
  alternatively by 'fileSize' from @System.Posix.Files@ (however, the latter
  is not multiplatform).
* You can use @Text.Printf@ for formatting.
* Think twice before you start writing the code. Doing a proper functional
  decomposition will save you a lot of work/refactoring. Think of the
  functions you'll need, write their type signatures and only then start
  programming.
* You may use monoids for command line arguments processing, but you don't
  have to.
* There are many useful general-purpose packages on Hackage, see for example
  the package. If there is something reasonably common you want, try to
  search Hackage first (but do not install everything just for a small
  function).

==== Examples

The order of files and directories on the same level in the hierarchy is not
relevant and can differ on your system. Also, the output in case of error need
not match literally.

> $ ./du --help
> usage: du [options] [files]
> -a, --all             write counts for all files, not just directories
>     --si              like -h, but use powers of 1000 not 1024
> -h, --human-readable  print sizes in human readable format (e.g., 1K 234M 2G)
> -s, --summarize       display only a total for each argument
> -d, --max-depth       print the total for a directory (or file, with --all)
>                       only if it is N or fewer levels below the command
>                       line argument; --max-depth=0 is the same as --summarize
> -c, --total           produce a grand total
>     --help            display this help and exit
>
> $ mkdir test; cd test
> $ mkdir -p first/second third
> $ dd if=/dev/zero of=a bs=1024 count=100 &> /dev/null
> $ dd if=/dev/zero of=first/b bs=1024 count=200 &> /dev/null
> $ dd if=/dev/zero of=first/c bs=1024 count=300 &> /dev/null
> $ dd if=/dev/zero of=first/second/d bs=1024 count=1024 &> /dev/null
>
> $ ../du
> 0       ./third
> 1024    ./first/second
> 1524    ./first
> 1624    .
>
> $ ../du first third
> 1024    first/second
> 1524    first
> 0       third
>
> $ ../du -c first third
> 1024    first/second
> 1524    first
> 0       third
> 1524    total
>
> $ ../du first
> 1024    first/second
> 1524    first
>
> $ ../du -s first
> 1524    first
>
> $ ../du --summarize first
> 1524    first
>
> $ ../du -h first
> 1.0 MiB first/second
> 1.4 MiB first
>
> $ ../du --si first
> 1.1 MB  first/second
> 1.5 MB  first
>
> $ ../du -h -s -c first a
> 1.4 MiB first
> 100 KiB a
> 1.5 MiB total
>
> $ ../du -a -h first
> 200 KiB first/b
> 300 KiB first/c
> 1.0 MiB first/second/d
> 1.0 MiB first/second
> 1.4 MiB first
>
> $ mkdir fourth && chmod -r fourth
> $ ../du fourth first
> error: fourth: getDirectoryContents: permission denied (Permission denied)
> 1024    first/second
> 1524    first
>
> $ ../du fifth first
> error: fifth: openFile: does not exist (No such file or directory)
> 1024    first/second
> 1524    first
>
> $ ../du -d1
> error: ./fourth: getDirectoryContents: permission denied (Permission denied)
> 0       ./third
> 1524    ./first
> 1624    .
>
> $ ../du --max-depth=1 --human-readable
> error: ./fourth: getDirectoryContents: permission denied (Permission denied)
> 0.0 B   ./third
> 1.4 MiB ./first
> 1.5 MiB .
-}

-- Name: Dominik Kolar
-- UID: 433481
-- Used packages: base, directory, filepath

module Main
    ( main
    ) where

import Control.Exception (IOException, handle)
import Control.Monad (when)
import Data.Maybe (fromMaybe)
import Data.Monoid (Last (..))
import Numeric (showFFloat)
import System.Directory (doesFileExist, getDirectoryContents)
import System.Environment (getArgs)
import System.FilePath (pathSeparator, (</>))
import System.IO (IOMode (ReadMode), hFileSize, hPutStrLn, stderr, withFile)

-- OnlyDirectories is the default and has no corresponding cmd argument
data WhatToPrint = FilesAndDirectories | OnlyDirectories | OnlyArguments
    deriving (Eq, Show)

-- The last non-default value wins. Since GHC 8.4 every Monoid needs a
-- Semigroup instance, so the combining operation is defined there.
instance Semigroup WhatToPrint where
    x <> OnlyDirectories = x
    _ <> y               = y

instance Monoid WhatToPrint where
    mempty = OnlyDirectories

-- KiBdisplayNoUnit is the default and has no corresponding cmd argument
data PrintFormat = KiBdisplayNoUnit | PowerOfTen | PowerOfTwo
    deriving (Eq, Show)

instance Semigroup PrintFormat where
    x <> KiBdisplayNoUnit = x
    _ <> y                = y

instance Monoid PrintFormat where
    mempty = KiBdisplayNoUnit

data Cfg = Cfg
    { whatToPrint  :: WhatToPrint
    , printFormat  :: PrintFormat
    , printTotal   :: Last ()
    , helpAndExit  :: Last ()
    , filesAndDirs :: [FilePath]
    } deriving (Eq, Show)

-- Configurations are combined field by field.
instance Semigroup Cfg where
    x <> y = Cfg
        { whatToPrint  = whatToPrint x <> whatToPrint y
        , printFormat  = printFormat x <> printFormat y
        , printTotal   = printTotal x <> printTotal y
        , helpAndExit  = helpAndExit x <> helpAndExit y
        , filesAndDirs = filesAndDirs x <> filesAndDirs y
        }

instance Monoid Cfg where
    mempty = Cfg mempty mempty mempty mempty mempty

data Option = Option
    { short       :: Maybe Char
    , long        :: Maybe String
    , description :: String
    , config      :: Cfg
    } deriving (Eq, Show)

optionTable :: [Option]
optionTable =
    [ Option (Just 'a') (Just "all")
        "write counts for all files, not just directories"
        (mempty {whatToPrint = FilesAndDirectories})
    , Option Nothing (Just "si")
        "like -h, but use powers of 1000 not 1024 (e.g., 1KB, 245MB, 2.1GB)"
        (mempty {printFormat = PowerOfTen})
    , Option (Just 'h') (Just "human-readable")
        "print sizes in human readable format (e.g., 1KiB 234MiB 2GiB)"
        (mempty {printFormat = PowerOfTwo})
    , Option (Just 's') (Just "summarize")
        "display only a total for each argument"
        (mempty {whatToPrint = OnlyArguments})
    , Option (Just 'c') (Just "total")
        "produce a grand total"
        (mempty {printTotal = Last (Just ())})
    , Option Nothing (Just "help")
        "display help and exit"
        (mempty {helpAndExit = Last (Just ())})
    ]

-- | Render the option table; descriptions are aligned after the longest
-- long option name.
optTableToString :: String
optTableToString = concatMap line optionTable
  where
    lg = maximum (0 : map (length . fromMaybe "" . long) optionTable)
    line o = shrt (short o) ++ lng (long o) ++ description o ++ "\n"
    shrt Nothing  = "    "
    shrt (Just c) = "-" ++ [c] ++ ", "
    lng Nothing    = replicate (lg + 2 + 2) ' '
    lng (Just str) = "--" ++ str ++ replicate (lg - length str + 2) ' '

-- | Fold the command line into a configuration. Anything that is not a
-- known option is treated as a file or directory name.
argsToConfig :: [String] -> Cfg
argsToConfig = mconcat . map argToCfg
  where
    argToCfg arg
        | take 1 arg == "-" = lookInOptTable arg optionTable
        | otherwise         = mempty {filesAndDirs = [arg]}
    lookInOptTable str [] = mempty {filesAndDirs = [str]}
    lookInOptTable str (y:ys)
        | matches str y = config y
        | otherwise     = lookInOptTable str ys
    matches str y
        | take 2 str == "--" = long y == Just (drop 2 str)
        | otherwise          = fmap (: []) (short y) == Just (drop 1 str)

-- | Print a size (given in bytes) followed by the name it belongs to.
printSize :: Cfg -> Integer -> String -> IO ()
printSize cfg size str = customPrint $ case printFormat cfg of
    KiBdisplayNoUnit -> oneDecOrRound (bytes / 1024)
    PowerOfTen       -> scale 1000 siUnits bytes
    PowerOfTwo       -> scale 1024 binUnits bytes
  where
    bytes = fromIntegral size :: Double
    customPrint numToStr =
        putStrLn $ numToStr ++ replicate (max 1 (17 - length numToStr)) ' ' ++ str
    binUnits = ["B", "KiB", "MiB", "GiB", "TiB", "PiB"]
    siUnits  = ["B", "KB", "MB", "GB", "TB", "PB"]
    -- Start at bytes and move to the next unit while the rounded value
    -- reaches the base; the largest unit is used for anything beyond it.
    scale :: Double -> [String] -> Double -> String
    scale _ [] n = oneDecOrRound n
    scale base (unit:units) n
        | null units || ceiling n < (round base :: Integer) =
            oneDecOrRound n ++ " " ++ unit
        | otherwise = scale base units (n / base)

-- | One decimal place below 10 (except for zero), otherwise rounded up.
oneDecOrRound :: Double -> String
oneDecOrRound realN
    | realN < 10 && realN /= 0 = showFFloat (Just 1) realN ""
    | otherwise                = show (ceiling realN :: Integer)

-- | Size of a file or directory; on an 'IOException' the error is reported
-- on 'stderr' and 'Nothing' is returned.
getSizeUniversal :: Cfg -> Bool -> Bool -> FilePath -> IO (Maybe Integer)
getSizeUniversal cfg printDir printFile fileName = handle errorHandler $ do
    isFile <- doesFileExist fileName
    if isFile
        then getFileSize cfg fileName printFile
        else getDirectorySize cfg fileName printDir
  where
    errorHandler :: IOException -> IO (Maybe Integer)
    errorHandler ex = hPutStrLn stderr ("ERROR: " ++ show ex) >> return Nothing

getFileSize :: Cfg -> FilePath -> Bool -> IO (Maybe Integer)
getFileSize cfg fileName bPrint = do
    size <- withFile fileName ReadMode hFileSize
    when bPrint $ printSize cfg size fileName
    return (Just size)

getDirectorySize :: Cfg -> FilePath -> Bool -> IO (Maybe Integer)
getDirectorySize cfg filePath bPrint = do
    dirContents <- map (filePath </>) . filter notCurAndSubDir
                       <$> getDirectoryContents filePath
    fileSizes <- mapM (getSizeUniversal cfg printDirs printFiles) dirContents
    let sizeSum = sum (map (fromMaybe 0) fileSizes)
    when bPrint $ printSize cfg sizeSum filePath
    return (Just sizeSum)
  where
    notCurAndSubDir name = name `notElem` [".", ".."]
    printFiles = whatToPrint cfg == FilesAndDirectories
    printDirs  = whatToPrint cfg /= OnlyArguments

main :: IO ()
main = getArgs >>= checkCfg . argsToConfig
  where
    checkCfg cfg
        | isTrue (helpAndExit cfg) =
            putStrLn "usage: du [options] [files]" >> putStr optTableToString
        | otherwise = do
            -- Without arguments, work with the current directory.
            let files = case filesAndDirs cfg of
                    [] -> [['.', pathSeparator]]
                    fd -> fd
            sizes <- mapM (getSizeUniversal cfg True True) files
            let size = sum (map (fromMaybe 0) sizes)
            when (isTrue (printTotal cfg)) (printSize cfg size "total")

isTrue :: Last () -> Bool
isTrue (Last Nothing) = False
isTrue _              = True