Eric Bergman-Terrell's Blog

Python Script to Audit MediaMonkey Transcoding
August 15, 2019

I have a large collection of audio CDs: 1,587, to be exact. I rip them to .FLAC format using Exact Audio Copy, then use MediaMonkey to convert them to .MP3 format so that they'll fit on my phone. I use my EBT Music Player to play them on the phone.

With such a large number of CDs and tracks, I wondered if every single track was making it all the way to my phone. I noticed that EBT Music Player was reporting 20 fewer tracks than MediaMonkey.

Unfortunately, I had no idea which of the more than 20,000 tracks were not being transcoded to .MP3 format for the phone. Fortunately, it was easy to write a Python script, using the mutagen module, to identify the albums with missing tracks.

The script scans all the original .flac and .mp3 files on my PC, extracts the album name from each file's tags, and counts the tracks for each album. Then it scans the transcoded .mp3 files the same way and compares the results. Any album found in the original files that either doesn't exist among the transcoded files or has a different track count is reported. Albums are matched by name alone, so two different albums with the same title are counted together. I'm running the script using Python version 3.7, and I used PyCharm to write and debug it. If you use this script to audit your files, you should only need to change the folder paths in the main() function at the bottom.
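The comparison step can be tried on its own, without any audio files. The following is a minimal sketch on made-up data: compare_album_counts is a hypothetical helper that returns its findings instead of printing them, and the album names and counts are invented for illustration.

```python
def compare_album_counts(src_counts, dest_counts):
    """Compare two {album: track_count} dicts.

    Returns (missing, mismatched):
      missing    - albums in src_counts that are absent from dest_counts
      mismatched - (album, src_count, dest_count) tuples where the counts differ
    """
    missing = []
    mismatched = []

    for album, src_count in src_counts.items():
        if album not in dest_counts:
            missing.append(album)
        elif dest_counts[album] != src_count:
            mismatched.append((album, src_count, dest_counts[album]))

    return missing, mismatched


# Made-up album names and counts, for illustration only.
src = {'Album A': 12, 'Album B': 10, 'Album C': 8}
dest = {'Album A': 12, 'Album B': 9}

missing, mismatched = compare_album_counts(src, dest)
print(missing)      # ['Album C']
print(mismatched)   # [('Album B', 10, 9)]
```

Returning the results as lists makes the logic easy to check by hand before pointing the real script at a large music library.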

TranscodingAudit.py:

import os
import fnmatch
import mutagen


ALBUM_KEY = 'ALBUM'
WILDCARD_KEY = 'WILDCARD'


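def extract_metadata(root, extraction_configs):
    """Walk root and return a dict mapping each album name to its track count."""
    counts = dict()

    for folder, dirs, files in os.walk(root):
        for extraction_config in extraction_configs:
            # Only files matching this config's wildcard (e.g. '*.mp3') are examined.
            for media_file in fnmatch.filter(files, extraction_config[WILDCARD_KEY]):
                media_file_path = os.path.join(folder, media_file)
                metadata = mutagen.File(media_file_path)

                try:
                    # The album tag name depends on the format: 'ALBUM' for FLAC, 'TALB' for ID3.
                    album = metadata.tags[extraction_config[ALBUM_KEY]][0].strip()
                except (AttributeError, KeyError, IndexError, TypeError):
                    # Unrecognized file or no album tag: report it instead of crashing.
                    print('NO ALBUM TAG: {}'.format(media_file_path))
                    continue

                counts[album] = counts.get(album, 0) + 1

    return counts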


def audit(src_counts, dest_counts):
    """Report source albums that are missing from, or differ in track count in, the destination."""
    matches = 0
    mismatches = 0
    not_found = 0

    for album in src_counts:
        if album in dest_counts:
            src_count = src_counts[album]
            dest_count = dest_counts[album]

            if src_count == dest_count:
                matches += 1
            else:
                mismatches += 1
                print('ALBUM {} INCORRECT COUNT Source Count: {} Dest Count: {}'.format(album, src_count, dest_count))
        else:
            not_found += 1
            print('ALBUM NOT FOUND IN DESTINATION: {}'.format(album))

    # The three counters are disjoint, so they add up to the total.
    print('Total Albums: {} Matches: {} Mis-Matches: {} Not Found: {}'
          .format(len(src_counts), matches, mismatches, not_found))


def main():
    mp3_config = dict()
    mp3_config[ALBUM_KEY] = 'TALB'
    mp3_config[WILDCARD_KEY] = '*.mp3'

    flac_config = dict()
    flac_config[ALBUM_KEY] = 'ALBUM'
    flac_config[WILDCARD_KEY] = '*.flac'

    src_counts = extract_metadata('D:\\Media\\Music', [flac_config, mp3_config])
    dest_counts = extract_metadata('H:\\Sync\\MP3s', [mp3_config])

    audit(src_counts, dest_counts)


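# Run the audit only when the file is executed as a script, not when it is imported.
if __name__ == '__main__':
    main()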
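Instead of editing the folder paths in main(), the two folders could be supplied on the command line. This is only a sketch using argparse from the standard library: parse_folders is a hypothetical helper that is not part of the script above, and the argument names are assumptions.

```python
import argparse


def parse_folders(argv=None):
    """Parse the source and destination folders from the command line.

    argv defaults to None, in which case argparse reads sys.argv[1:].
    """
    parser = argparse.ArgumentParser(description='Audit transcoded music files.')
    parser.add_argument('source', help='folder holding the original .flac and .mp3 files')
    parser.add_argument('destination', help='folder holding the transcoded .mp3 files')
    args = parser.parse_args(argv)
    return args.source, args.destination


# Passing a list simulates the command line; these are the paths used in main().
source, destination = parse_folders(['D:\\Media\\Music', 'H:\\Sync\\MP3s'])
print(source)
print(destination)
```

The two returned values could then be passed to extract_metadata in place of the hard-coded paths.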

The script solved the mystery. The missing 20 tracks were from two CDs for which I had forgotten to create a playlist.

Keywords: MediaMonkey, Python, EBT Music Player, EAC, Exact Audio Copy, FLAC, MP3, Polka, PyCharm, Frankie Yankovic, mutagen, transcoding
