Third assignment for IB016, semester spring 2018.
Task overview
You certainly know the struggle: you rip a purchased DVD with your favourite TV series for watching on your computer, or you download three hundred episodes of a Creative Commons licensed soap opera, and suddenly your downloads folder is a mess of ugly filenames with different episode numbering schemes.
Your task is to implement a command line utility which tidies up the chaos and adds the files into your carefully organised video collection. You should create a standalone executable. This time, only the outer specification is given; you have to do the functional decomposition yourself.
Specification
The utility should be invoked like this:
./tidyvids [OPTIONS] TITLE SRCDIR VIDEODIR
This will recursively search SRCDIR for files belonging to a series called TITLE and add them to your video collection located at VIDEODIR (more specifically, to something like VIDEODIR/TITLE/Season NN/).
Options
The following options should be supported; some of them require an argument.
- -h, -?, --help – Prints short usage info and documentation of the options, other options are ignored and nothing gets moved.
- -v, --verbose – Report (on standard output) what is being done (i.e. which file is moved where and what is its new name).
- -n, --dry-run – Do not move anything, just report it like -v does.
- -l FILE, --list-file=FILE – Specifies a file to read episode names from (see below).
- -f FMT, --format=FMT – Specifies a format of episode numbering. Available formats are:
  - x → 01x03 for the third episode of the first season,
  - s → s01e03, this is the default if no valid -f option is given,
  - S → S01E03.
- --sanitize – Enables filename sanitization for use in FAT and NTFS partitions. This is the default behaviour. The sanitization is described below.
- --no-sanitize – Disables filename sanitization.
You may assume that each option is given at most once, with the exception of --sanitize and --no-sanitize, where the last one applies. The rationale is that you might want to alias tidyvids to tidyvids --no-sanitize on a Linux system, but then you would have a problem if you wanted to move a video to an external FAT drive.
You can ignore invalid options or options with invalid arguments, or make the program fail gracefully. You are free to implement additional options, if you want to. You can assume that the options always come before the other three parameters. You can assume TITLE, SRCDIR, and VIDEODIR do not begin with -.
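One possible shape for the option handling, as a minimal sketch built on System.Console.GetOpt from base. The names Format, Options, defaultOptions, optionTable and parseArgs are hypothetical and not part of the assignment; each option is represented as an update function on the record, so folding the updates in order makes the last of --sanitize / --no-sanitize win.

```haskell
import System.Console.GetOpt

-- Hypothetical type for the three numbering formats of the -f option.
data Format = FormatX | FormatLowerS | FormatUpperS
  deriving (Eq, Show)

-- Hypothetical record collecting all settings.
data Options = Options
  { optHelp     :: Bool
  , optVerbose  :: Bool
  , optDryRun   :: Bool
  , optListFile :: Maybe FilePath
  , optFormat   :: Format
  , optSanitize :: Bool
  } deriving (Eq, Show)

-- Defaults from the specification: format "s", sanitization enabled.
defaultOptions :: Options
defaultOptions = Options False False False Nothing FormatLowerS True

-- An invalid FMT argument is ignored here, so the default stays in place.
setFormat :: String -> Options -> Options
setFormat "x" o = o { optFormat = FormatX }
setFormat "s" o = o { optFormat = FormatLowerS }
setFormat "S" o = o { optFormat = FormatUpperS }
setFormat _   o = o

optionTable :: [OptDescr (Options -> Options)]
optionTable =
  [ Option "h?" ["help"]    (NoArg (\o -> o { optHelp = True }))    "print usage info"
  , Option "v"  ["verbose"] (NoArg (\o -> o { optVerbose = True })) "report what is being done"
  , Option "n"  ["dry-run"] (NoArg (\o -> o { optDryRun = True }))  "report only, move nothing"
  , Option "l"  ["list-file"]
      (ReqArg (\f o -> o { optListFile = Just f }) "FILE") "file with episode names"
  , Option "f"  ["format"]  (ReqArg setFormat "FMT") "episode numbering: x, s or S"
  , Option ""   ["sanitize"]
      (NoArg (\o -> o { optSanitize = True }))  "sanitize filenames (default)"
  , Option ""   ["no-sanitize"]
      (NoArg (\o -> o { optSanitize = False })) "do not sanitize filenames"
  ]

-- RequireOrder stops at the first non-option, which fits the guarantee that
-- options come before TITLE, SRCDIR and VIDEODIR.
parseArgs :: [String] -> Either String (Options, [String])
parseArgs argv = case getOpt RequireOrder optionTable argv of
  (updates, rest, []) -> Right (foldl (flip id) defaultOptions updates, rest)
  (_, _, errs)        -> Left (concat errs ++ usageInfo header optionTable)
  where
    header = "Usage: tidyvids [OPTIONS] TITLE SRCDIR VIDEODIR"
```

This sketch reports unknown options as an error (the "fail gracefully" choice); ignoring them instead would be equally acceptable under the specification.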
Behaviour
Finding files
The utility searches SRCDIR
for files belonging to the series. All subdirectories (but not
symbolic links to them) are searched recursively. You need to come up with some
heuristic to decide whether a file belongs to the series. It does not have to be
too elaborate; for example, for series "The Foo Bar" it should match
the.foo.bar.1x23.FullHD.mkv
or TFB/Season3/TheFooBar01x23.avi
but not necessarily
Foo Bar, The - S01E23 (2018).mp4
or The Foo Bar/TFB 01x23.mkv
.
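A deliberately simple heuristic for the title check, as a sketch: drop everything except letters and digits, lowercase both strings, and test whether the normalised title occurs inside the normalised file name. The helper names normalise and belongsTo are hypothetical. This accepts the two required examples above and rejects the two optional ones.

```haskell
import Data.Char (isAlphaNum, toLower)
import Data.List (isInfixOf)
import System.FilePath (takeFileName)

-- Keep only letters and digits, lowercased: "The Foo Bar" becomes "thefoobar".
normalise :: String -> String
normalise = map toLower . filter isAlphaNum

-- True when the normalised title occurs in the normalised file name.
-- Only the last path component is inspected, not the directories above it.
belongsTo :: String -> FilePath -> Bool
belongsTo title path =
  not (null key) && key `isInfixOf` normalise (takeFileName path)
  where
    key = normalise title
```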
Moreover, a file is only considered if it is a video or subtitle file and has a clear season and episode number. Again, the heuristics for both of these conditions are up to you, but a simple file extension detection is enough for the first one (.avi, .mkv, .mp4 and .srt are sufficient, but you are free to add more, if you want to) and we recommend regex matching for the second one. The utility should be able to recognize all the episode numbering formats described at the -f option, and also their one-digit variants (e.g. 01x03, but also 1x3 or 1x03 for first season, third episode). You can expect both of these numbers to be less than 100 (sorry, fans of Ulice), but they may both be zero (that is sometimes used for various specials).
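The two checks can be sketched as follows. Note the substitution: instead of a regex library, this sketch uses Text.ParserCombinators.ReadP from base so that it needs no extra packages; the pattern tried at each position corresponds to the regexes [0-9]{1,2}x[0-9]{1,2}, s[0-9]{1,2}e[0-9]{1,2} and S[0-9]{1,2}E[0-9]{1,2}. The names isMediaFile, number, episodeId and findEpisode are hypothetical, and the rule that a match must not directly follow a digit is an extra assumption to avoid reading 2018x03 as season 18.

```haskell
import Data.Char (isDigit, toLower)
import Data.List (tails)
import Data.Maybe (listToMaybe)
import System.FilePath (takeExtension)
import Text.ParserCombinators.ReadP

-- Extension check; the comparison is case-insensitive.
isMediaFile :: FilePath -> Bool
isMediaFile path =
  map toLower (takeExtension path) `elem` [".avi", ".mkv", ".mp4", ".srt"]

-- A run of digits that is one or two characters long (so the value is below 100).
number :: ReadP Int
number = do
  ds <- munch1 isDigit
  if length ds <= 2 then pure (read ds) else pfail

-- The three numbering schemes: 01x03, s01e03 and S01E03.
episodeId :: ReadP (Int, Int)
episodeId = choice
  [ (,) <$> number <* char 'x' <*> number
  , (,) <$> (char 's' *> number) <* char 'e' <*> number
  , (,) <$> (char 'S' *> number) <* char 'E' <*> number
  ]

-- First (season, episode) pair found in the name, scanning left to right.
-- A candidate is skipped when the character before it is a digit.
findEpisode :: String -> Maybe (Int, Int)
findEpisode name = listToMaybe
  [ ep
  | (prev, rest) <- zip (' ' : name) (tails name)
  , not (isDigit prev)
  , (ep, _) <- readP_to_S episodeId rest
  ]
```

findEpisode is meant to be applied to the file name only (for instance via takeFileName), so that digits in directory names do not interfere.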
Adding files to the collection
The matching files are then renamed and moved to the video collection, whose root is at VIDEODIR. The actual destination directory for each file is {VIDEODIR}/{sanitized TITLE}/Season {nr.}/, where TITLE is sanitized (see below) and the season number always has at least two digits (e.g. 01). The filename changes to {sanitized TITLE} - {episode id}.{ext} or {sanitized TITLE} - {episode id} - {sanitized episode name}.{ext}, where {episode id} is created from the season and episode number obtained from the original filename and formatted according to the --format option, and the filename extension is unchanged. The latter form is used when the episode has a name assigned from the --list-file file.
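The naming scheme above can be written down as a pure function. This is a sketch: Format, episodeId and destination are hypothetical names, the title and episode name are assumed to be sanitized already, and the extension argument is assumed to include its leading dot (as returned by takeExtension).

```haskell
import System.FilePath ((</>))
import Text.Printf (printf)

-- Hypothetical type for the three numbering formats of the -f option.
data Format = FormatX | FormatLowerS | FormatUpperS

-- Episode id with two-digit padding, e.g. "01x03", "s01e03" or "S01E03".
episodeId :: Format -> Int -> Int -> String
episodeId FormatX      s e = printf "%02dx%02d" s e
episodeId FormatLowerS s e = printf "s%02de%02d" s e
episodeId FormatUpperS s e = printf "S%02dE%02d" s e

-- Full destination path for one file.
destination
  :: FilePath      -- VIDEODIR
  -> String        -- sanitized TITLE
  -> Format
  -> (Int, Int)    -- (season, episode)
  -> Maybe String  -- sanitized episode name, if the list file provides one
  -> String        -- original extension, including the dot
  -> FilePath
destination videoDir title fmt (s, e) epName ext =
  videoDir </> title </> ("Season " ++ printf "%02d" s) </> (fileBase ++ ext)
  where
    ident = episodeId fmt s e
    fileBase = case epName of
      Nothing   -> title ++ " - " ++ ident
      Just name -> title ++ " - " ++ ident ++ " - " ++ name
```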
If such a file already exists, the utility does not replace it, but informs the user about it (and leaves the original file in place).
Any directories along the path that do not exist are created.
Note: Do not use file renaming functions for moving, as these may not work across filesystem boundaries. Use copy and delete instead.
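A minimal sketch of such a move, using copyFile and removeFile from System.Directory. The name moveFile is hypothetical, and reporting an existing destination on standard error is an assumption; the specification only says the user must be informed.

```haskell
import System.Directory
  (copyFile, createDirectoryIfMissing, doesFileExist, removeFile)
import System.FilePath (takeDirectory)
import System.IO (hPutStrLn, stderr)

-- Move src to dst by copying and then deleting the original.
-- Returns False (and leaves src untouched) when dst already exists.
moveFile :: FilePath -> FilePath -> IO Bool
moveFile src dst = do
  exists <- doesFileExist dst
  if exists
    then do
      hPutStrLn stderr ("Not replacing existing file: " ++ dst)
      pure False
    else do
      -- Create every missing directory along the destination path.
      createDirectoryIfMissing True (takeDirectory dst)
      copyFile src dst
      removeFile src
      pure True
```

The original is removed only after copyFile has returned, so an exception during the copy leaves the source file in place.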
Error handling
If an error occurs during either finding the source files or moving them to the collection (most likely due to insufficient permissions), the utility must carry on, informing the user about the inconvenience (error messages should be shown on the standard error output).
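One way to get the "report and carry on" behaviour is to wrap each file system action so that an IOException is printed to standard error and turned into Nothing. The helper name carryOn is hypothetical; this is a sketch, not the required design.

```haskell
import Control.Exception (IOException, try)
import System.IO (hPutStrLn, stderr)

-- Run an action; if it throws an IOException, report it on stderr
-- (prefixed with a description of what was attempted) and return Nothing.
carryOn :: String -> IO a -> IO (Maybe a)
carryOn what action = do
  result <- try action
  case result of
    Left err -> do
      hPutStrLn stderr (what ++ ": " ++ show (err :: IOException))
      pure Nothing
    Right x -> pure (Just x)
```

Only IOException is caught here, so programming errors such as a failed pattern match still terminate the program.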
Filename sanitization
The filenames are always stripped of slashes (/) and null characters. Additionally, if sanitization is enabled, the following characters also must not appear in the filenames: ASCII control characters (codes 1 to 31 and 127), question mark (?), vertical line (|), less-than (<), greater-than (>), colon (:), asterisk (*), backslash (\) and quotation mark (").
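The rules above amount to a character filter. In this sketch the function name sanitize and its Bool parameter (True when sanitization is enabled) are hypothetical; forbidden characters are simply removed, which is consistent with the sanitized names shown in the examples.

```haskell
-- Remove characters that must not appear in a filename component.
-- The first argument says whether full (FAT/NTFS) sanitization is enabled.
sanitize :: Bool -> String -> String
sanitize full = filter allowed
  where
    allowed c
      | c `elem` "/\0" = False  -- always stripped
      | full           = not (isAsciiControl c || c `elem` "?|<>:*\\\"")
      | otherwise      = True
    -- ASCII codes 1 to 31 and 127 (Haskell character escapes are decimal).
    isAsciiControl c = (c >= '\1' && c <= '\31') || c == '\127'
```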
List of episode names
The file provided as the argument to the --list-file option is searched for episode names. Each line in the file that matches the pattern {season}x{episode}{whitespace}{episode name} assigns the episode identified by {season} and {episode} its title {episode name}. The fields {season} and {episode} are always two-digit numbers, {whitespace} is one or more spaces or tabs and {episode name} spans till the end of the line. Other lines are silently ignored. The utility must sanitize the episode name according to its sanitization setting.
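Because the line pattern is fixed-width, it can be matched with plain pattern matching, as in this sketch. The names parseLine and parseEpisodeList are hypothetical; the result is an association list so that Prelude's lookup can be used and no package beyond base is needed. Sanitizing the episode name is left to the caller.

```haskell
import Data.Char (isDigit)
import Data.Maybe (mapMaybe)

-- Parse one line of the form NNxNN<spaces or tabs><episode name>.
-- Any other line yields Nothing and is thereby ignored.
parseLine :: String -> Maybe ((Int, Int), String)
parseLine line = case line of
  s1 : s2 : 'x' : e1 : e2 : rest
    | all isDigit [s1, s2, e1, e2]
    , (_ : _, name) <- span (`elem` " \t") rest  -- at least one blank
    -> Just ((read [s1, s2], read [e1, e2]), name)
  _ -> Nothing

-- All (season, episode) to name assignments found in the file contents.
parseEpisodeList :: String -> [((Int, Int), String)]
parseEpisodeList = mapMaybe parseLine . lines
```

An episode name is then found with, for example, lookup (1, 4) (parseEpisodeList contents).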
Module and package constraints
You can use any modules from any packages, but all used packages (except base and directory) have to be noted in the header of this file next to your name and UID.
Tips and tricks
- Think twice before you start writing the code. Doing a proper functional decomposition will save you a lot of work/refactoring. Think of the functions you'll need, write their type signatures and only then start programming.
- You may use monoids for command line arguments processing, but you don't have to. (But you want to.)
- It is advised not to reinvent command line arguments parsing; there are multiple modules to do that for you. One is even a part of the base package. Do the research yourself.
- You probably cannot both complete the assignment without using regular expressions AND not scream in agony, so we recommend reading a Haskell regex tutorial and the Regular expressions Haskell Wiki page.
- For finding and moving files, functions from System.Directory may be handy.
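As an illustration of the last tip, the recursive search that skips symbolic links to directories might look like the sketch below. The name listFilesRecursive is hypothetical, pathIsSymbolicLink requires directory 1.3.0.0 or newer, and System.FilePath comes from the filepath package, which would have to be noted in the header. Error handling is omitted here.

```haskell
import Control.Monad (forM)
import System.Directory (doesDirectoryExist, listDirectory, pathIsSymbolicLink)
import System.FilePath ((</>))

-- All non-directory entries below dir, descending into real subdirectories
-- but not into symbolic links that point to directories.
listFilesRecursive :: FilePath -> IO [FilePath]
listFilesRecursive dir = do
  entries <- listDirectory dir
  fmap concat . forM entries $ \entry -> do
    let path = dir </> entry
    isLink <- pathIsSymbolicLink path
    isDir  <- doesDirectoryExist path  -- follows links, hence the extra check
    case (isDir, isLink) of
      (True, True)  -> pure []                   -- linked directory: skipped
      (True, False) -> listFilesRecursive path   -- real directory: recurse
      _             -> pure [path]               -- file or link to a file
```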
Examples
In the following examples, the original file tree rooted at . looks like this:
.
├── HDIH
│   ├── eplist.txt
│   ├── First Season
│   │   ├── HDIH-s01e05.avi
│   │   ├── how.do.i.haskell.s1e1.720p.[XMATOUS3]
│   │   │   ├── how.do.i.haskell.s1e1.720p.[XMATOUS3].avi
│   │   │   ├── how.do.i.haskell.s1e1.720p.[XMATOUS3].nfo
│   │   │   ├── how.do.i.haskell.s1e1.720p.[XMATOUS3].srt
│   │   │   └── Thumbs.db
│   │   ├── How Do I Haskell? 01x02.mkv
│   │   ├── How-Do-I-Haskell-1x03-FullHD.mkv
│   │   └── How Do I Haskell - 1x4 (the <*> operator).mkv
│   ├── Second Season
│   │   ├── HowDoIHaskell_S02E01_Lenses.avi
│   │   ├── HowDoIHaskell_S02E01_Lenses.srt
│   │   ├── HowDoIHaskell-02x02-cze.srt
│   │   ├── How.Do.I.Haskell?.02x02.Monadic.Transformers.avi
│   │   ├── How-Do-I-Haskell-02-03-Parsec.mp4
│   │   └── How Do I Haskell - 2nd season, episode 03.srt
│   └── 00x00 - How Do I Haskell? (Prolog, but not like the language).mkv
├── tidyvids
...
And the contents of ./HDIH/eplist.txt are:
Episodes of "How Do I Haskell?"
===============================
01x01 What are functions, anyway?
todo: find names of 01x02 and 02x01.
01x04 The <$> and <*> operators
01x03 Finally "Hello, world" (I/O in Haskell)
00x00 Prolog, but not like the language
02x03 Parsec
02x02 Monadic transformers
Example 1
./tidyvids 'How Do I Haskell?' . Videos
Resulting file tree:
.
├── HDIH
│   ├── eplist.txt
│   ├── First Season
│   │   ├── HDIH-s01e05.avi
│   │   └── how.do.i.haskell.s1e1.720p.[XMATOUS3]
│   │       ├── how.do.i.haskell.s1e1.720p.[XMATOUS3].nfo
│   │       └── Thumbs.db
│   └── Second Season
│       ├── How-Do-I-Haskell-02-03-Parsec.mp4
│       └── How Do I Haskell - 2nd season, episode 03.srt
├── tidyvids
├── Videos
│   └── How Do I Haskell
│       ├── Season 00
│       │   └── How Do I Haskell - s00e00.mkv
│       ├── Season 01
│       │   ├── How Do I Haskell - s01e01.avi
│       │   ├── How Do I Haskell - s01e01.srt
│       │   ├── How Do I Haskell - s01e02.mkv
│       │   ├── How Do I Haskell - s01e03.mkv
│       │   └── How Do I Haskell - s01e04.mkv
│       └── Season 02
│           ├── How Do I Haskell - s02e01.avi
│           ├── How Do I Haskell - s02e01.srt
│           ├── How Do I Haskell - s02e02.avi
│           └── How Do I Haskell - s02e02.srt
...
Example 2
./tidyvids --list-file=HDIH/eplist.txt -fx -v 'How Do I Haskell?' HDIH Videos
Resulting file tree:
.
├── HDIH
│   ├── eplist.txt
│   ├── First Season
│   │   ├── HDIH-s01e05.avi
│   │   └── how.do.i.haskell.s1e1.720p.[XMATOUS3]
│   │       ├── how.do.i.haskell.s1e1.720p.[XMATOUS3].nfo
│   │       └── Thumbs.db
│   └── Second Season
│       ├── How-Do-I-Haskell-02-03-Parsec.mp4
│       └── How Do I Haskell - 2nd season, episode 03.srt
├── tidyvids
├── Videos
│   └── How Do I Haskell
│       ├── Season 00
│       │   └── How Do I Haskell - 00x00 - Prolog, but not like the language.mkv
│       ├── Season 01
│       │   ├── How Do I Haskell - 01x01 - What are functions, anyway.avi
│       │   ├── How Do I Haskell - 01x01 - What are functions, anyway.srt
│       │   ├── How Do I Haskell - 01x02.mkv
│       │   ├── How Do I Haskell - 01x03 - Finally Hello, world (IO in Haskell).mkv
│       │   └── How Do I Haskell - 01x04 - The $ and operators.mkv
│       └── Season 02
│           ├── How Do I Haskell - 02x01.avi
│           ├── How Do I Haskell - 02x01.srt
│           ├── How Do I Haskell - 02x02 - Monadic transformers.avi
│           └── How Do I Haskell - 02x02 - Monadic transformers.srt
...
Something similar is printed to the standard output:
Moving HDIH/Second Season/HowDoIHaskell_S02E01_Lenses.avi -> Videos/How Do I Haskell/Season 02/How Do I Haskell - 02x01.avi
Moving HDIH/Second Season/How.Do.I.Haskell?.02x02.Monadic.Transformers.avi -> Videos/How Do I Haskell/Season 02/How Do I Haskell - 02x02 - Monadic transformers.avi
Moving HDIH/Second Season/HowDoIHaskell_S02E01_Lenses.srt -> Videos/How Do I Haskell/Season 02/How Do I Haskell - 02x01.srt
Moving HDIH/Second Season/HowDoIHaskell-02x02-cze.srt -> Videos/How Do I Haskell/Season 02/How Do I Haskell - 02x02 - Monadic transformers.srt
Moving HDIH/First Season/How Do I Haskell? 01x02.mkv -> Videos/How Do I Haskell/Season 01/How Do I Haskell - 01x02.mkv
Moving HDIH/First Season/How Do I Haskell - 1x4 (the <*> operator).mkv -> Videos/How Do I Haskell/Season 01/How Do I Haskell - 01x04 - The $ and operators.mkv
Moving HDIH/First Season/how.do.i.haskell.s1e1.720p.[XMATOUS3]/how.do.i.haskell.s1e1.720p.[XMATOUS3].avi -> Videos/How Do I Haskell/Season 01/How Do I Haskell - 01x01 - What are functions, anyway.avi
Moving HDIH/First Season/how.do.i.haskell.s1e1.720p.[XMATOUS3]/how.do.i.haskell.s1e1.720p.[XMATOUS3].srt -> Videos/How Do I Haskell/Season 01/How Do I Haskell - 01x01 - What are functions, anyway.srt
Moving HDIH/First Season/How-Do-I-Haskell-1x03-FullHD.mkv -> Videos/How Do I Haskell/Season 01/How Do I Haskell - 01x03 - Finally Hello, world (IO in Haskell).mkv
Moving HDIH/00x00 - How Do I Haskell? (Prolog, but not like the language).mkv -> Videos/How Do I Haskell/Season 00/How Do I Haskell - 00x00 - Prolog, but not like the language.mkv
Example 3
./tidyvids --format=S -l HDIH/eplist.txt -n --no-sanitize 'How Do I Haskell?' 'HDIH/First Season/' Videos
No files are changed, but something similar is printed to the standard output:
Would move HDIH/First Season/How Do I Haskell? 01x02.mkv -> Videos/How Do I Haskell?/Season 01/How Do I Haskell? - S01E02.mkv
Would move HDIH/First Season/How Do I Haskell - 1x4 (the <*> operator).mkv -> Videos/How Do I Haskell?/Season 01/How Do I Haskell? - S01E04 - The <$> and <*> operators.mkv
Would move HDIH/First Season/how.do.i.haskell.s1e1.720p.[XMATOUS3]/how.do.i.haskell.s1e1.720p.[XMATOUS3].avi -> Videos/How Do I Haskell?/Season 01/How Do I Haskell? - S01E01 - What are functions, anyway?.avi
Would move HDIH/First Season/how.do.i.haskell.s1e1.720p.[XMATOUS3]/how.do.i.haskell.s1e1.720p.[XMATOUS3].srt -> Videos/How Do I Haskell?/Season 01/How Do I Haskell? - S01E01 - What are functions, anyway?.srt
Would move HDIH/First Season/How-Do-I-Haskell-1x03-FullHD.mkv -> Videos/How Do I Haskell?/Season 01/How Do I Haskell? - S01E03 - Finally "Hello, world" (IO in Haskell).mkv
- main :: IO ()