Safe Haskell: Safe

Main

Description

Third assignment for IB016, spring semester 2018.

Task overview

You certainly know the struggle: you rip a purchased DVD with your favourite TV series for watching on your computer, or you download three hundred episodes of a Creative Commons licensed soap opera, and suddenly your downloads folder is a mess of ugly filenames with different episode numbering schemes.

Your task is to implement a command line utility which tidies up the chaos and adds the files into your carefully organised video collection. You should create a standalone executable. This time, only the outer specification is given; you have to do the functional decomposition yourself.

Specification

The utility should be invoked like this:

./tidyvids [OPTIONS] TITLE SRCDIR VIDEODIR

This recursively searches SRCDIR for files belonging to a series called TITLE and adds them to your video collection located at VIDEODIR (more specifically, to something like VIDEODIR/TITLE/Season NN/).

Options

The following options should be supported; some of them require an argument.

  • -h, -?, --help – Prints short usage info and documentation of the options; other options are ignored and nothing gets moved.
  • -v, --verbose – Report (on standard output) what is being done (i.e. which file is moved where and what is its new name).
  • -n, --dry-run – Do not move anything, just report it like -v does.
  • -l FILE, --list-file=FILE – Specifies a file to read episode names from (see below).
  • -f FMT, --format=FMT – Specifies a format of episode numbering. Available formats are:

    • x – 01x03 for the third episode of the first season,
    • s – s01e03; this is the default if no valid -f option is given,
    • S – S01E03.
  • --sanitize – Enables filename sanitization for use in FAT and NTFS partitions. This is the default behaviour. The sanitization is described below.
  • --no-sanitize – Disables filename sanitization.

You may assume that each option is given at most once, with the exception of --sanitize and --no-sanitize, where the last one applies. The rationale is that you might want to alias tidyvids to tidyvids --no-sanitize on a Linux system, but then you would have a problem if you wanted to move a video to an external FAT drive.

You can ignore invalid options or options with invalid arguments, or make the program fail gracefully. You are free to implement additional options, if you want to. You can assume that the options always come before the other three parameters. You can assume TITLE, SRCDIR, and VIDEODIR do not begin with -.
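One possible way to handle these options is System.Console.GetOpt from base. The following is only a sketch under assumptions: the names Format, Options, defaultOptions, optionDescrs and parseArgs are hypothetical, and the FMT argument is assumed to be one of x, s or S, as the invocations -fx and --format=S in the examples suggest. Each option is represented as a function Options -> Options, so folding them in order makes the last of --sanitize and --no-sanitize win.

```haskell
import System.Console.GetOpt

-- Hypothetical types for this sketch; the assignment does not prescribe them.
data Format = FmtX | FmtLower | FmtUpper
  deriving (Eq, Show)

data Options = Options
  { optHelp     :: Bool
  , optVerbose  :: Bool
  , optDryRun   :: Bool
  , optListFile :: Maybe FilePath
  , optFormat   :: Format
  , optSanitize :: Bool
  } deriving (Eq, Show)

-- Defaults taken from the specification: format s01e03, sanitization on.
defaultOptions :: Options
defaultOptions = Options False False False Nothing FmtLower True

-- Every option is a modifier of the option record.
optionDescrs :: [OptDescr (Options -> Options)]
optionDescrs =
  [ Option "h?" ["help"] (NoArg (\o -> o { optHelp = True })) "show this help"
  , Option "v" ["verbose"] (NoArg (\o -> o { optVerbose = True })) "report what is being done"
  , Option "n" ["dry-run"] (NoArg (\o -> o { optDryRun = True })) "only report, move nothing"
  , Option "l" ["list-file"] (ReqArg (\f o -> o { optListFile = Just f }) "FILE") "file with episode names"
  , Option "f" ["format"] (ReqArg setFormat "FMT") "episode numbering: x, s or S"
  , Option "" ["sanitize"] (NoArg (\o -> o { optSanitize = True })) "sanitize filenames (default)"
  , Option "" ["no-sanitize"] (NoArg (\o -> o { optSanitize = False })) "do not sanitize filenames"
  ]
  where
    setFormat "x" o = o { optFormat = FmtX }
    setFormat "s" o = o { optFormat = FmtLower }
    setFormat "S" o = o { optFormat = FmtUpper }
    setFormat _   o = o  -- invalid argument: keep the current format

-- Returns the parsed options and the remaining positional arguments,
-- or an error message that includes the usage text.
parseArgs :: [String] -> Either String (Options, [String])
parseArgs args =
  case getOpt RequireOrder optionDescrs args of
    (modifiers, rest, []) -> Right (foldl (flip id) defaultOptions modifiers, rest)
    (_, _, errs)          -> Left (concat errs ++ usageInfo header optionDescrs)
  where
    header = "Usage: tidyvids [OPTIONS] TITLE SRCDIR VIDEODIR"
```

In this sketch an unknown option makes parseArgs return Left, which corresponds to the "fail gracefully" alternative; silently ignoring such options would be equally acceptable under the specification.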

Behaviour

Finding files

The utility searches SRCDIR for files belonging to the series. All subdirectories (but not symbolic links to them) are searched recursively. You need to come up with some heuristic to decide whether a file belongs to the series. It does not have to be too elaborate; for example, for series "The Foo Bar" it should match the.foo.bar.1x23.FullHD.mkv or TFB/Season3/TheFooBar01x23.avi but not necessarily Foo Bar, The - S01E23 (2018).mp4 or The Foo Bar/TFB 01x23.mkv.
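The search described above could be sketched as follows. The function names listFilesRecursive, normalize and matchesTitle are hypothetical, and the title heuristic (compare only lowercased letters and digits of the file name) is just one simple choice among many. The sketch uses System.FilePath from the filepath package, which would have to be noted in the file header, and it does not yet handle I/O errors.

```haskell
import Control.Monad (forM)
import Data.Char (isAlphaNum, toLower)
import Data.List (isInfixOf)
import System.Directory (doesDirectoryExist, listDirectory, pathIsSymbolicLink)
import System.FilePath (takeFileName, (</>))

-- All non-directory entries below the given directory.
-- Symbolic links to directories are skipped, not followed.
listFilesRecursive :: FilePath -> IO [FilePath]
listFilesRecursive dir = do
  names <- listDirectory dir
  found <- forM names $ \name -> do
    let path = dir </> name
    isDir <- doesDirectoryExist path
    if not isDir
      then pure [path]
      else do
        isLink <- pathIsSymbolicLink path
        if isLink then pure [] else listFilesRecursive path
  pure (concat found)

-- Keep only letters and digits, lowercased.
normalize :: String -> String
normalize = map toLower . filter isAlphaNum

-- A file belongs to the series if its normalized file name
-- contains the normalized title.
matchesTitle :: String -> FilePath -> Bool
matchesTitle title path = normalize title `isInfixOf` normalize (takeFileName path)
```

With this heuristic, "The Foo Bar" matches both the.foo.bar.1x23.FullHD.mkv and TFB/Season3/TheFooBar01x23.avi, but not The Foo Bar/TFB 01x23.mkv, because only the file name is inspected.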

Moreover, a file is only considered if it is a video or subtitle file and has a clear season and episode number. Again, the heuristics for both of these conditions are up to you, but simple file extension detection is enough for the first one (.avi, .mkv, .mp4 and .srt are sufficient, but you are free to add more if you want to) and we recommend regex matching for the second one. The utility should be able to recognize all the episode numbering formats described under the -f option, and also their one-digit variants (e.g. 01x03, but also 1x3 or 1x03 for first season, third episode). You can expect both of these numbers to be less than 100 (sorry, fans of Ulice), but both may be zero (that is sometimes used for various specials).
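The two checks from this paragraph might look roughly like the sketch below. Note the swap: the text recommends regular expressions, but to stay within base (plus filepath for the extension) this sketch uses a small hand-written scanner instead. All names (isVideoOrSubtitle, number, episodeAt, findEpisode) are hypothetical, and accepting mixed case such as S01e03 is an assumption of the sketch, not a requirement.

```haskell
import Data.Char (isDigit, toLower)
import Data.List (tails)
import Data.Maybe (listToMaybe, mapMaybe)
import System.FilePath (takeExtension, takeFileName)

-- Extension-based check for video and subtitle files.
isVideoOrSubtitle :: FilePath -> Bool
isVideoOrSubtitle path =
  map toLower (takeExtension path) `elem` [".avi", ".mkv", ".mp4", ".srt"]

-- Reads a run of exactly one or two digits from the start of the string.
number :: String -> Maybe (Int, String)
number s = case span isDigit s of
  (ds, rest) | length ds `elem` [1, 2] -> Just (read ds, rest)
  _                                    -> Nothing

-- Tries to read NNxNN, sNNeNN or SNNENN at the very start of the string.
episodeAt :: String -> Maybe (Int, Int)
episodeAt str = case str of
  c : rest | toLower c == 's' -> twoNumbers 'e' rest
  _                           -> twoNumbers 'x' str
  where
    twoNumbers sep s0 = do
      (season, s1) <- number s0
      case s1 of
        d : s2 | toLower d == sep -> do
          (episode, _) <- number s2
          Just (season, episode)
        _ -> Nothing

-- First (season, episode) pair found in the file name, if any.
findEpisode :: FilePath -> Maybe (Int, Int)
findEpisode path = listToMaybe (mapMaybe candidate (zip (' ' : name) (tails name)))
  where
    name = takeFileName path
    candidate (prev, rest)
      -- Do not start matching in the middle of a longer number.
      | isDigit prev && any isDigit (take 1 rest) = Nothing
      | otherwise                                 = episodeAt rest
```

On the file names from the examples below, this scanner accepts HowDoIHaskell_S02E01_Lenses.avi and How Do I Haskell - 1x4 (the <*> operator).mkv, and rejects How-Do-I-Haskell-02-03-Parsec.mp4, which has no recognizable numbering.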

Adding files to the collection

The matching files are then renamed and moved to the video collection, whose root is at VIDEODIR. The actual destination directory for each file is {VIDEODIR}/{sanitized TITLE}/Season {nr.}/, where TITLE is sanitized (see below) and the season number always has at least two digits (e.g. 01). The filename changes to {sanitized TITLE} - {episode id}.{ext} or {sanitized TITLE} - {episode id} - {sanitized episode name}.{ext}. Here {episode id} is created from the season and episode numbers obtained from the original filename and formatted according to the --format option, and the filename extension is unchanged. The latter form is used when the episode has a name assigned in the --list-file file.

If such a file already exists, the utility does not replace it but informs the user about it (and leaves the original file in place).
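As an illustration of the naming scheme, a destination path could be assembled as in the sketch below. The names Format, episodeId and destination are hypothetical, the title and episode name are assumed to be already sanitized, and System.FilePath comes from the filepath package (which would need to be noted in the file header).

```haskell
import System.FilePath (takeExtension, (</>))
import Text.Printf (printf)

-- Hypothetical representation of the three numbering formats.
data Format = FmtX | FmtLower | FmtUpper

-- Episode id with both numbers padded to two digits.
episodeId :: Format -> Int -> Int -> String
episodeId FmtX     s e = printf "%02dx%02d" s e
episodeId FmtLower s e = printf "s%02de%02d" s e
episodeId FmtUpper s e = printf "S%02dE%02d" s e

-- Full destination path for one source file.
destination
  :: Format
  -> FilePath      -- VIDEODIR
  -> String        -- sanitized TITLE
  -> Maybe String  -- sanitized episode name, if the list file provides one
  -> (Int, Int)    -- (season, episode)
  -> FilePath      -- original file, used only for its extension
  -> FilePath
destination fmt videoDir title mName (s, e) src =
  videoDir </> title </> printf "Season %02d" s </> fileName
  where
    base     = title ++ " - " ++ episodeId fmt s e ++ maybe "" (" - " ++) mName
    -- takeExtension keeps the leading dot, so plain concatenation is enough.
    fileName = base ++ takeExtension src
```

The extension is appended by concatenation rather than by replacing an extension, because the new base name may itself contain dots.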

Any directories along the path that do not exist are created.

Note: Do not use file renaming functions for moving, as these may not work across filesystem boundaries. Use copy and delete instead.
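A copy-and-delete move that also respects the two preceding rules (never overwrite, create missing directories) might be sketched like this. The name moveFile is hypothetical, and reporting the skipped file on standard error is an assumption; the specification only says that the user must be informed. System.FilePath comes from the filepath package.

```haskell
import System.Directory (copyFile, createDirectoryIfMissing, doesPathExist, removeFile)
import System.FilePath (takeDirectory)
import System.IO (hPutStrLn, stderr)

-- Moves src to dst by copying and then deleting the original.
-- Returns True if the file was moved and False if dst already existed.
moveFile :: FilePath -> FilePath -> IO Bool
moveFile src dst = do
  exists <- doesPathExist dst
  if exists
    then do
      hPutStrLn stderr ("Skipping " ++ src ++ ": " ++ dst ++ " already exists")
      pure False
    else do
      -- True: also create all missing parent directories.
      createDirectoryIfMissing True (takeDirectory dst)
      copyFile src dst
      -- Reached only if the copy succeeded, so the original is never lost.
      removeFile src
      pure True
```

Any I/O exception raised by copyFile or removeFile propagates to the caller in this sketch, so the caller decides how to report it.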

Error handling

If an error occurs during either finding the source files or moving them to the collection (most likely due to insufficient permissions), the utility must carry on, informing the user about the inconvenience (error messages should be shown on the standard error output).
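One way to "carry on" after a failure is to wrap each risky action in a small helper that catches I/O exceptions, reports them on standard error, and lets the caller continue. The helper name reportingErrors is hypothetical, and catching only IOException (rather than every exception) is an assumption of this sketch.

```haskell
import Control.Exception (IOException, try)
import System.IO (hPutStrLn, stderr)

-- Runs an action; on an I/O error, prints a message to stderr
-- and returns Nothing instead of aborting the program.
reportingErrors :: String -> IO a -> IO (Maybe a)
reportingErrors what action = do
  result <- try action
  case result of
    Right x  -> pure (Just x)
    Left err -> do
      hPutStrLn stderr (what ++ ": " ++ show (err :: IOException))
      pure Nothing
```

For instance, wrapping the listing of each subdirectory and the moving of each file separately in such a helper means a single unreadable directory or file affects only that item.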

Filename sanitization

The filenames are always stripped of slashes (/) and null characters. Additionally, if sanitization is enabled, the following characters also must not appear in the filenames: ASCII control characters (codes 1 to 31 and 127), question mark (?), vertical line (|), less-than (<), greater-than (>), colon (:), asterisk (*), backslash (\) and quotation mark (").
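The rule above amounts to a character filter. A minimal sketch follows; the function name sanitize and the Bool flag (True meaning that the FAT/NTFS sanitization is enabled) are hypothetical.

```haskell
import Data.Char (ord)

-- Removes forbidden characters from a filename component.
-- The flag says whether the FAT/NTFS sanitization is enabled.
sanitize :: Bool -> String -> String
sanitize full = filter allowed
  where
    allowed c
      | c `elem` "/\0" = False  -- always removed: slash and the null character
      | full           = not (asciiControl c || c `elem` "?|<>:*\\\"")
      | otherwise      = True
    -- ASCII control characters: codes 1 to 31 and 127.
    asciiControl c = let n = ord c in (n >= 1 && n <= 31) || n == 127
```

Applied with sanitization enabled, the episode name The <$> and <*> operators becomes The $ and  operators (with two spaces), which is the form that appears in Example 2.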

List of episode names

The file provided as the argument to the --list-file option is searched for episode names. Each line in the file that matches the pattern {season}x{episode}{whitespace}{episode name} assigns the episode identified by {season} and {episode} its title {episode name}. The fields {season} and {episode} are always two-digit numbers, {whitespace} is one or more spaces or tabs and {episode name} spans till the end of the line. Other lines are silently ignored. The utility must sanitize the episode name according to its sanitization setting.
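The line pattern is simple enough to match without a regex library. The sketch below uses plain pattern matching from base; the names parseLine and parseEpisodeList are hypothetical, and two details are assumptions: the pattern must start at the beginning of the line, and lines with an empty episode name are ignored. Sanitizing the names is left to the caller.

```haskell
import Data.Char (isDigit)
import Data.Maybe (mapMaybe)

-- Parses one line of the form NNxNN<spaces or tabs>NAME.
parseLine :: String -> Maybe ((Int, Int), String)
parseLine (s1 : s2 : 'x' : e1 : e2 : rest)
  | all isDigit [s1, s2, e1, e2]
  , (_ : _, name) <- span (`elem` " \t") rest  -- at least one space or tab
  , not (null name)                            -- assumption: a name is required
  = Just ((read [s1, s2], read [e1, e2]), name)
parseLine _ = Nothing

-- All (season, episode) -> name assignments in the file contents;
-- lines that do not match are silently dropped.
parseEpisodeList :: String -> [((Int, Int), String)]
parseEpisodeList = mapMaybe parseLine . lines
```

The result is an association list, so the name of an episode can be found with lookup (season, episode).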

Module and package constraints

You can use any modules from any packages, but all used packages (except base and directory) have to be noted in the header of this file next to your name and UID.

Tips and tricks

  • Think twice before you start writing the code. Doing a proper functional decomposition will save you a lot of work/refactoring. Think of the functions you'll need, write their type signatures and only then start programming.
  • You may use monoids for command line arguments processing, but you don't have to. (But you want to.)
  • It is advised not to reinvent command line arguments parsing; there are multiple modules to do that for you. One is even a part of the base package. Do the research yourself.
  • You probably cannot both complete the assignment without using regular expressions AND not scream in agony, so we recommend reading a Haskell regex tutorial and the Regular expressions Haskell Wiki page.
  • For the finding and moving files, functions from System.Directory may be handy.

Examples

In the following examples, the original file tree rooted at . looks like this:

  .
  ├── HDIH
  │   ├── eplist.txt
  │   ├── First Season
  │   │   ├── HDIH-s01e05.avi
  │   │   ├── how.do.i.haskell.s1e1.720p.[XMATOUS3]
  │   │   │   ├── how.do.i.haskell.s1e1.720p.[XMATOUS3].avi
  │   │   │   ├── how.do.i.haskell.s1e1.720p.[XMATOUS3].nfo
  │   │   │   ├── how.do.i.haskell.s1e1.720p.[XMATOUS3].srt
  │   │   │   └── Thumbs.db
  │   │   ├── How Do I Haskell? 01x02.mkv
  │   │   ├── How-Do-I-Haskell-1x03-FullHD.mkv
  │   │   └── How Do I Haskell - 1x4 (the <*> operator).mkv
  │   ├── Second Season
  │   │   ├── HowDoIHaskell_S02E01_Lenses.avi
  │   │   ├── HowDoIHaskell_S02E01_Lenses.srt
  │   │   ├── HowDoIHaskell-02x02-cze.srt
  │   │   ├── How.Do.I.Haskell?.02x02.Monadic.Transformers.avi
  │   │   ├── How-Do-I-Haskell-02-03-Parsec.mp4
  │   │   └── How Do I Haskell - 2nd season, episode 03.srt
  │   └── 00x00 - How Do I Haskell? (Prolog, but not like the language).mkv
  ├── tidyvids
  ...

And the contents of ./HDIH/eplist.txt are:

  Episodes of "How Do I Haskell?"
  ===============================

  01x01   What are functions, anyway?
  todo: find names of 01x02 and 02x01.
  01x04   The <$> and <*> operators
  01x03   Finally "Hello, world" (I/O in Haskell)
  00x00   Prolog, but not like the language
  02x03   Parsec
  02x02   Monadic transformers

Example 1

  ./tidyvids 'How Do I Haskell?' . Videos

Resulting file tree:

  .
  ├── HDIH
  │   ├── eplist.txt
  │   ├── First Season
  │   │   ├── HDIH-s01e05.avi
  │   │   └── how.do.i.haskell.s1e1.720p.[XMATOUS3]
  │   │       ├── how.do.i.haskell.s1e1.720p.[XMATOUS3].nfo
  │   │       └── Thumbs.db
  │   └── Second Season
  │       ├── How-Do-I-Haskell-02-03-Parsec.mp4
  │       └── How Do I Haskell - 2nd season, episode 03.srt
  ├── tidyvids
  ├── Videos
  │   └── How Do I Haskell
  │       ├── Season 00
  │       │   └── How Do I Haskell - s00e00.mkv
  │       ├── Season 01
  │       │   ├── How Do I Haskell - s01e01.avi
  │       │   ├── How Do I Haskell - s01e01.srt
  │       │   ├── How Do I Haskell - s01e02.mkv
  │       │   ├── How Do I Haskell - s01e03.mkv
  │       │   └── How Do I Haskell - s01e04.mkv
  │       └── Season 02
  │           ├── How Do I Haskell - s02e01.avi
  │           ├── How Do I Haskell - s02e01.srt
  │           ├── How Do I Haskell - s02e02.avi
  │           └── How Do I Haskell - s02e02.srt
  ...

Example 2

  ./tidyvids --list-file=HDIH/eplist.txt -fx -v 'How Do I Haskell?' HDIH Videos

Resulting file tree:

  .
  ├── HDIH
  │   ├── eplist.txt
  │   ├── First Season
  │   │   ├── HDIH-s01e05.avi
  │   │   └── how.do.i.haskell.s1e1.720p.[XMATOUS3]
  │   │       ├── how.do.i.haskell.s1e1.720p.[XMATOUS3].nfo
  │   │       └── Thumbs.db
  │   └── Second Season
  │       ├── How-Do-I-Haskell-02-03-Parsec.mp4
  │       └── How Do I Haskell - 2nd season, episode 03.srt
  ├── tidyvids
  ├── Videos
  │   └── How Do I Haskell
  │       ├── Season 00
  │       │   └── How Do I Haskell - 00x00 - Prolog, but not like the language.mkv
  │       ├── Season 01
  │       │   ├── How Do I Haskell - 01x01 - What are functions, anyway.avi
  │       │   ├── How Do I Haskell - 01x01 - What are functions, anyway.srt
  │       │   ├── How Do I Haskell - 01x02.mkv
  │       │   ├── How Do I Haskell - 01x03 - Finally Hello, world (IO in Haskell).mkv
  │       │   └── How Do I Haskell - 01x04 - The $ and  operators.mkv
  │       └── Season 02
  │           ├── How Do I Haskell - 02x01.avi
  │           ├── How Do I Haskell - 02x01.srt
  │           ├── How Do I Haskell - 02x02 - Monadic transformers.avi
  │           └── How Do I Haskell - 02x02 - Monadic transformers.srt
  ...

Something similar is printed to the standard output:

  Moving HDIH/Second Season/HowDoIHaskell_S02E01_Lenses.avi -> Videos/How Do I Haskell/Season 02/How Do I Haskell - 02x01.avi
  Moving HDIH/Second Season/How.Do.I.Haskell?.02x02.Monadic.Transformers.avi -> Videos/How Do I Haskell/Season 02/How Do I Haskell - 02x02 - Monadic transformers.avi
  Moving HDIH/Second Season/HowDoIHaskell_S02E01_Lenses.srt -> Videos/How Do I Haskell/Season 02/How Do I Haskell - 02x01.srt
  Moving HDIH/Second Season/HowDoIHaskell-02x02-cze.srt -> Videos/How Do I Haskell/Season 02/How Do I Haskell - 02x02 - Monadic transformers.srt
  Moving HDIH/First Season/How Do I Haskell? 01x02.mkv -> Videos/How Do I Haskell/Season 01/How Do I Haskell - 01x02.mkv
  Moving HDIH/First Season/How Do I Haskell - 1x4 (the <*> operator).mkv -> Videos/How Do I Haskell/Season 01/How Do I Haskell - 01x04 - The $ and  operators.mkv
  Moving HDIH/First Season/how.do.i.haskell.s1e1.720p.[XMATOUS3]/how.do.i.haskell.s1e1.720p.[XMATOUS3].avi -> Videos/How Do I Haskell/Season 01/How Do I Haskell - 01x01 - What are functions, anyway.avi
  Moving HDIH/First Season/how.do.i.haskell.s1e1.720p.[XMATOUS3]/how.do.i.haskell.s1e1.720p.[XMATOUS3].srt -> Videos/How Do I Haskell/Season 01/How Do I Haskell - 01x01 - What are functions, anyway.srt
  Moving HDIH/First Season/How-Do-I-Haskell-1x03-FullHD.mkv -> Videos/How Do I Haskell/Season 01/How Do I Haskell - 01x03 - Finally Hello, world (IO in Haskell).mkv
  Moving HDIH/00x00 - How Do I Haskell? (Prolog, but not like the language).mkv -> Videos/How Do I Haskell/Season 00/How Do I Haskell - 00x00 - Prolog, but not like the language.mkv

Example 3

  ./tidyvids --format=S -l HDIH/eplist.txt -n --no-sanitize 'How Do I Haskell?' 'HDIH/First Season/' Videos

No files are changed, but something similar is printed to the standard output:

  Would move HDIH/First Season/How Do I Haskell? 01x02.mkv -> Videos/How Do I Haskell?/Season 01/How Do I Haskell? - S01E02.mkv
  Would move HDIH/First Season/How Do I Haskell - 1x4 (the <*> operator).mkv -> Videos/How Do I Haskell?/Season 01/How Do I Haskell? - S01E04 - The <$> and <*> operators.mkv
  Would move HDIH/First Season/how.do.i.haskell.s1e1.720p.[XMATOUS3]/how.do.i.haskell.s1e1.720p.[XMATOUS3].avi -> Videos/How Do I Haskell?/Season 01/How Do I Haskell? - S01E01 - What are functions, anyway?.avi
  Would move HDIH/First Season/how.do.i.haskell.s1e1.720p.[XMATOUS3]/how.do.i.haskell.s1e1.720p.[XMATOUS3].srt -> Videos/How Do I Haskell?/Season 01/How Do I Haskell? - S01E01 - What are functions, anyway?.srt
  Would move HDIH/First Season/How-Do-I-Haskell-1x03-FullHD.mkv -> Videos/How Do I Haskell?/Season 01/How Do I Haskell? - S01E03 - Finally "Hello, world" (IO in Haskell).mkv

Documentation

main :: IO ()

The main function of the program.