GuessIt

Release v0.9.0 (Installation)

GuessIt is a Python library that extracts as much information as possible from a video file.

Its filename matcher can guess a lot of metadata about a video from the filename alone. The matcher works with both movies and TV show episodes.

For example, GuessIt can do the following:

$ guessit "Treme.1x03.Right.Place,.Wrong.Time.HDTV.XviD-NoTV.avi"
For: Treme.1x03.Right.Place,.Wrong.Time.HDTV.XviD-NoTV.avi
GuessIt found: {
    [1.00] "mimetype": "video/x-msvideo",
    [0.80] "episodeNumber": 3,
    [0.80] "videoCodec": "XviD",
    [1.00] "container": "avi",
    [1.00] "format": "HDTV",
    [0.70] "series": "Treme",
    [0.50] "title": "Right Place, Wrong Time",
    [0.80] "releaseGroup": "NoTV",
    [0.80] "season": 1,
    [1.00] "type": "episode"
}

Filename matcher

The filename matcher uses regular expressions and tree splitting to guess values from the input filename.
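The idea of combining regular expressions with tree splitting can be sketched as follows. This is a minimal illustration and not GuessIt's actual implementation: the three patterns, the `PATTERNS` table and the `split_and_match` function are assumptions made for the example. Each match claims a span of the name and splits the surrounding text into smaller leaves, so later rules only look at what is still unexplained.

```python
import re

# Illustrative patterns only; the real matcher knows far more rules.
PATTERNS = [
    re.compile(r"(?P<season>\d{1,2})x(?P<episodeNumber>\d{2})"),
    re.compile(r"(?P<format>HDTV|BluRay|DVD)", re.IGNORECASE),
    re.compile(r"(?P<videoCodec>XviD|DivX|h264)", re.IGNORECASE),
]


def split_and_match(name):
    """Return (properties, leftover_leaves) for a filename without extension."""
    leaves = [name]  # spans of the input that no rule has explained yet
    found = {}
    for pattern in PATTERNS:
        next_leaves = []
        for leaf in leaves:
            match = pattern.search(leaf)
            # Keep the leaf whole if nothing matched or the property is already known.
            if match is None or any(key in found for key in match.groupdict()):
                next_leaves.append(leaf)
                continue
            for key, value in match.groupdict().items():
                found[key] = int(value) if value.isdigit() else value
            # Split the leaf around the match; both sides stay open for later rules.
            next_leaves.extend([leaf[:match.start()], leaf[match.end():]])
        leaves = next_leaves
    # Tidy the leftovers: separators become spaces, empty leaves are dropped.
    cleaned = [re.sub(r"[._\-]+", " ", leaf).strip() for leaf in leaves]
    return found, [leaf for leaf in cleaned if leaf]


props, leaves = split_and_match("Treme.1x03.Right.Place,.Wrong.Time.HDTV.XviD-NoTV")
print(props)   # {'season': 1, 'episodeNumber': 3, 'format': 'HDTV', 'videoCodec': 'XviD'}
print(leaves)  # ['Treme', 'Right Place, Wrong Time', 'NoTV']
```

In this sketch the leftover leaves are the natural candidates for series, title and release group; deciding which is which is a further step that the example leaves out.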

It can find many properties, such as title, year, series, episodeNumber, season, videoCodec, screenSize and language. Guessed values are cleaned up and returned in a readable format, which may not match the raw filename. For example:

DVDSCR will be guessed as format = DVD + other = Screener.

1920x1080 will be guessed as screenSize = 1080p.

DD5.1 will be guessed as audioCodec = DolbyDigital + audioChannels = 5.1.
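The three normalizations above can be reproduced with a small rule table. This is a hedged sketch rather than GuessIt's own code: the `RULES` table, the helper functions and the set of standard heights (taken from the screenSize values listed further down) are assumptions for illustration.

```python
import re

# Heights reported as "<height>p"; assumed from the screenSize values in this document.
STANDARD_HEIGHTS = {360, 368, 480, 576, 720, 900, 1080}


def _dvdscr(match):
    # One raw tag expands into two cleaned-up properties.
    return {"format": "DVD", "other": "Screener"}


def _resolution(match):
    width, height = int(match.group(1)), int(match.group(2))
    if height in STANDARD_HEIGHTS:
        return {"screenSize": "%dp" % height}
    return {"screenSize": "%dx%d" % (width, height)}


def _dolby(match):
    # Property name audioChannels as listed under "Audio properties".
    return {"audioCodec": "DolbyDigital", "audioChannels": match.group(1)}


RULES = [
    (re.compile(r"\bDVDSCR\b", re.IGNORECASE), _dvdscr),
    (re.compile(r"\b(\d{3,4})x(\d{3,4})\b"), _resolution),
    (re.compile(r"\bDD(\d\.\d)\b", re.IGNORECASE), _dolby),
]


def normalize(name):
    """Collect normalized properties for every rule that matches the name."""
    properties = {}
    for pattern, build in RULES:
        match = pattern.search(name)
        if match:
            properties.update(build(match))
    return properties


print(normalize("Movie.DVDSCR.avi"))   # {'format': 'DVD', 'other': 'Screener'}
print(normalize("Movie.1920x1080"))    # {'screenSize': '1080p'}
print(normalize("Movie.DD5.1.mkv"))    # {'audioCodec': 'DolbyDigital', 'audioChannels': '5.1'}
```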

Here is the exhaustive list of properties that GuessIt can find:

Main properties

  • type

    Type of the file.

    • unknown, movie, episode, moviesubtitle, episodesubtitle
  • title

    Title of movie or episode.

  • container

    Container of the file.

    • 3g2, wmv, webm, mp4, avi, mp4a, mpeg, sub, mka, m4v, ts, mkv, ra, rm, wma, ass, mpg, ram, 3gp, ogv, mov, ogm, asf, divx, ogg, ssa, qt, idx, nfo, wav, flv, 3gp2, iso, mk2, srt
  • date

    Date found in filename.

  • year

    Year of movie (or episode).

  • releaseGroup

    Name of (non)scene group that released the file.

  • website

    Name of website contained in the filename.

Episode properties

  • series

    Name of series.

  • season

    Season number.

  • episodeNumber

    Episode number.

  • episodeList

    List of episode numbers if several were found.

    • note: If several are found, episodeNumber is the first item of this list.
  • seasonList

    List of season numbers if several were found.

    • note: If several are found, season is the first item of this list.
  • episodeCount

    Total number of episodes.

  • seasonCount

    Total number of seasons.

  • episodeDetails

    Some details about the episode.

    • Bonus Oav Ova Omake Extras Unaired Special Pilot
  • episodeFormat

    Episode format of the series.

    • Minisode
  • part

    Part number of the episode.

  • version

    Version of the episode.

    • In the anime fansub scene, new versions are released with the tag <episode>v[0-9].
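The relationship between episodeNumber and episodeList, and the <episode>v[0-9] version tag, can be illustrated with a short sketch. The two regular expressions and the `guess_episodes` function are assumptions chosen for the example; they only cover the SxxEyyEzz naming style and are not the patterns GuessIt itself uses.

```python
import re

# Hypothetical patterns: "S01E01E02" style numbering and "<episode>v<digit>" tags.
EPISODES = re.compile(r"S(?P<season>\d{1,2})(?P<episodes>(?:E\d{1,3})+)", re.IGNORECASE)
VERSION = re.compile(r"\b(?P<episode>\d{1,3})v(?P<version>\d)\b", re.IGNORECASE)


def guess_episodes(name):
    """Return season/episode/version properties found in a filename."""
    result = {}
    match = EPISODES.search(name)
    if match:
        numbers = [int(n) for n in re.findall(r"\d+", match.group("episodes"))]
        result["season"] = int(match.group("season"))
        # episodeNumber is always the first item of the list.
        result["episodeNumber"] = numbers[0]
        if len(numbers) > 1:
            result["episodeList"] = numbers
    version = VERSION.search(name)
    if version:
        result.setdefault("episodeNumber", int(version.group("episode")))
        result["version"] = int(version.group("version"))
    return result


print(guess_episodes("Show.S01E01E02.avi"))
# {'season': 1, 'episodeNumber': 1, 'episodeList': [1, 2]}
print(guess_episodes("[Group] Anime - 05v2 [720p].mkv"))
# {'episodeNumber': 5, 'version': 2}
```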

Video properties

  • format

    Format of the initial source.

    • HDTV WEB-DL TV VOD BluRay DVD WEBRip Workprint Telecine VHS DVB Telesync HD-DVD PPV Cam
  • screenSize

    Resolution of video.

    • 720p 1080p 1080i <width>x<height> 4K 360p 368p 480p 576p 900p
  • videoCodec

    Codec used for video.

    • h264 h265 DivX XviD Real Mpeg2
  • videoProfile

    Codec profile used for video.

    • 8bit 10bit HP BP MP XP Hi422P Hi444PP
  • videoApi

    API used for the video.

    • DXVA

Audio properties

  • audioChannels

    Number of channels for audio.

    • 1.0 2.0 5.1 7.1
  • audioCodec

    Codec used for audio.

    • DTS TrueHD DolbyDigital AAC AC3 MP3 Flac
  • audioProfile

    The codec profile used for audio.

    • LC HQ HD HE HDMA

Localization properties

  • country

    Country(ies) of content. Often found in series, Shameless (US) for instance.

    • [<babelfish.Country>] (instances compare equal to both the name and the ISO code)
  • language

    Language(s) of the audio soundtrack.

    • [<babelfish.Language>] (instances compare equal to both the name and the ISO code)
  • subtitleLanguage

    Language(s) of the subtitles.

    • [<babelfish.Language>] (instances compare equal to both the name and the ISO code)

Other properties

  • bonusNumber

    Bonus number.

  • bonusTitle

    Bonus title.

  • cdNumber

    CD number.

  • cdNumberTotal

    Total number of CDs.

  • crc32

    CRC32 of the file.

  • idNumber

    Volume identifier (UUID).

  • edition

    Edition of the movie.

    • Special Edition, Collector Edition, Director's cut, Criterion Edition, Deluxe Edition
  • filmNumber

    Film number of this movie.

  • filmSeries

    Film series of this movie.

  • other

    Any other properties found appear under this property.

    • Fansub, HR, HQ, Netflix, Screener, Unrated, HD, 3D, SyncFix, Bonus, WideScreen, Fastsub, R5, AudioFix, DDC, Trailer, Complete, Limited, Classic, Proper, DualAudio, LiNE

Other features

GuessIt can also compute a whole lot of hashes from a file: all the ones available in Python's hashlib module (md5, sha1, ...), the Media Player Classic hash that is used (amongst others) by OpenSubtitles and SMPlayer, and the ed2k hash.
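For readers unfamiliar with these hashes, here is a minimal sketch of two of them. It is not GuessIt's code: `hashlib_digest` and `mpc_hash` are hypothetical names, and `mpc_hash` follows the publicly documented OpenSubtitles algorithm (file size plus the 64-bit little-endian sums of the first and last 64 KiB), which is assumed here to be the Media Player Classic hash referred to above.

```python
import hashlib
import struct

CHUNK = 64 * 1024  # the hash reads 64 KiB from each end of the file


def hashlib_digest(path, algorithm="md5"):
    """Hex digest of a file, read in blocks so large videos are not loaded whole."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(block)
    return digest.hexdigest()


def mpc_hash(path):
    """File size plus the 64-bit sums of the first and last 64 KiB, as 16 hex digits."""
    with open(path, "rb") as handle:
        handle.seek(0, 2)  # jump to the end to learn the file size
        size = handle.tell()
        if size < 2 * CHUNK:
            raise ValueError("file is too small for this hash")
        total = size
        for offset in (0, size - CHUNK):
            handle.seek(offset)
            # Interpret the chunk as unsigned 64-bit little-endian integers.
            words = struct.unpack("<%dQ" % (CHUNK // 8), handle.read(CHUNK))
            total += sum(words)
    # Wrap around at 64 bits, as the reference algorithm does on every addition.
    return "%016x" % (total & 0xFFFFFFFFFFFFFFFF)
```

Because only 128 KiB are read regardless of file size, this kind of hash stays fast even for multi-gigabyte videos, which is presumably why subtitle services favour it over a full-file digest.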

If you have the ‘guess-language’ Python package installed, GuessIt can also analyze a subtitle file’s contents and detect which language it is written in.

If you have the ‘enzyme’ Python package installed, GuessIt can also detect properties from the actual video file metadata.

Usage

guessit can be used from the command line:

$ guessit

Usage: guessit [options] file1 [file2...]

Options:
  -h, --help            show this help message and exit
  -P SHOW_PROPERTY, --show-property=SHOW_PROPERTY
                        Display the value of a single property (title, series,
                        videoCodec, year, type ...)

      Naming:
        -t TYPE, --type=TYPE
                            The suggested file type: movie, episode. If undefined,
                            type will be guessed.
        -n, --name-only     Parse files as name only. Disable folder parsing,
                            extension parsing, and file content analysis.
        -c, --split-camel   Split camel case part of filename.
        -Y, --date-year-first
                            If short date is found, consider the first digits as
                            the year.
        -D, --date-day-first
                            If short date is found, consider the second digits as
                            the day.
        -E, --episode-prefer-number
                            Guess "serie.213.avi" as the episodeNumber 213.
                            Without this option, it will be guessed as season 2,
                            episodeNumber 13
        -L ALLOWED_LANGUAGES, --allowed-languages=ALLOWED_LANGUAGES
                            List of allowed languages. Separate languages codes
                            with ";"
        -C ALLOWED_COUNTRIES, --allowed-countries=ALLOWED_COUNTRIES
                            List of allowed countries. Separate country codes with
                            ";"
        -S EXPECTED_SERIES, --expected-series=EXPECTED_SERIES
                            List of expected series to parse. Separate series
                            names with ";"
        -T EXPECTED_TITLE, --expected-title=EXPECTED_TITLE
                            List of expected titles to parse. Separate title names
                            with ";"
        -G EXPECTED_GROUP, --expected-group=EXPECTED_GROUP
                            List of expected groups to parse. Separate group names
                            with ";"
        --disabled-transformers=DISABLED_TRANSFORMERS
                            List of transformers to disable. Separate transformers
                            names with ";"

      Output:
        -v, --verbose       Display debug output
        -a, --advanced      Display advanced information for filename guesses, as
                            json output
        -y, --yaml          Display information for filename guesses as yaml
                            output (like unit-test)
        -f INPUT_FILE, --input-file=INPUT_FILE
                            Read filenames from an input file.
        -d, --demo          Run a few builtin tests instead of analyzing a file

      Information:
        -p, --properties    Display properties that can be guessed.
        -V, --values        Display property values that can be guessed.
        -s, --transformers  Display transformers that can be used.

      guessit.io:
        -b, --bug           Submit a wrong detection to the guessit.io service

      Other features:
        -i INFO, --info=INFO
                            The desired information type: filename, video,
                            hash_mpc or a hash from python's hashlib module, such
                            as hash_md5, hash_sha1, ...; or a list of any of them,
                            comma-separated
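The effect of the -E / --episode-prefer-number option described in the help text can be sketched like this. The function `guess_numbering` and its regular expression are assumptions made for illustration; they only show the two readings of a bare three-digit number, not how GuessIt decides internally.

```python
import re


def guess_numbering(name, episode_prefer_number=False):
    """Interpret a standalone three-digit number in a filename."""
    match = re.search(r"(?<!\d)(\d{3})(?!\d)", name)
    if match is None:
        return {}
    digits = match.group(1)
    if episode_prefer_number:
        # With -E: the whole number is the episode.
        return {"episodeNumber": int(digits)}
    # Default: first digit is the season, the remaining two are the episode.
    return {"season": int(digits[0]), "episodeNumber": int(digits[1:])}


print(guess_numbering("serie.213.avi"))
# {'season': 2, 'episodeNumber': 13}
print(guess_numbering("serie.213.avi", episode_prefer_number=True))
# {'episodeNumber': 213}
```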

It can also be used as a Python module:

>>> from guessit import guess_file_info
>>> guess_file_info('Treme.1x03.Right.Place,.Wrong.Time.HDTV.XviD-NoTV.avi')
{u'mimetype': 'video/x-msvideo', u'episodeNumber': 3, u'videoCodec': u'XviD', u'container': u'avi', u'format': u'HDTV', u'series': u'Treme', u'title': u'Right Place, Wrong Time', u'releaseGroup': u'NoTV', u'season': 1, u'type': u'episode'}
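Once you have such a result, you can read it like a dictionary. The sketch below does not call GuessIt: it starts from a plain dict copied from the output above (with the Python 2 u'' prefixes dropped) and builds a display label from it. The `describe` function and the label layout are hypothetical.

```python
# Copied from the example output above; a real result comes from guess_file_info().
guess = {
    "mimetype": "video/x-msvideo",
    "episodeNumber": 3,
    "videoCodec": "XviD",
    "container": "avi",
    "format": "HDTV",
    "series": "Treme",
    "title": "Right Place, Wrong Time",
    "releaseGroup": "NoTV",
    "season": 1,
    "type": "episode",
}


def describe(guess):
    """Build a short human-readable label from a guess dictionary."""
    if guess.get("type") == "episode":
        label = "%s - %dx%02d" % (guess["series"], guess["season"], guess["episodeNumber"])
        # The episode title is optional, so only append it when present.
        if "title" in guess:
            label += " - " + guess["title"]
        return label
    return guess.get("title", "unknown")


print(describe(guess))  # Treme - 1x03 - Right Place, Wrong Time
```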

User Guide

This part of the documentation, which is mostly prose, shows how to use GuessIt both from the command line and as a Python module in your own projects.

Web Service API

The guessit.io server also provides a free web service that lets you perform filename detection even if you don’t have Python installed (e.g. from an Android app or NodeJS).

You can look at the documentation for the web API here: http://api.guessit.io

Support

The project website for GuessIt is hosted at ReadTheDocs. There you will also find the User guide and Developer documentation.

This project is hosted on GitHub: https://github.com/wackou/guessit

Please report issues and/or feature requests via the bug tracker.

You can also report issues using the command-line tool:

$ guessit --bug "filename.that.fails.avi"

Contribute

GuessIt is under active development, and contributions are more than welcome!

  1. Check for open issues or open a fresh issue to start a discussion around a feature idea or a bug. There is a Contributor Friendly tag for issues that should be ideal for people who are not very familiar with the codebase yet.
  2. Fork the repository on GitHub to start making your changes to the master branch (or branch off of it).
  3. Write a test which shows that the bug was fixed or that the feature works as expected.
  4. Send a pull request and bug the maintainer until it gets merged and published. :)

License

GuessIt is licensed under the LGPLv3 license.

Donate

If you like GuessIt, please consider making a donation in bitcoins or in litecoins.