I have a large collection of audio CDs: 1,587, to be exact. I rip them to .FLAC format using Exact Audio Copy. I use MediaMonkey to convert them to .MP3 format so that they'll fit on my phone, and I use my EBT Music Player to play them on the phone.
With such a large number of CDs and tracks, I wondered if every single track was making it all the way to my phone. I noticed that EBT Music Player was reporting 20 fewer tracks than MediaMonkey.
Unfortunately, I had no idea which of the more than 20,000 tracks were not being transcoded to .MP3 format for the phone. Fortunately, it was easy to write a Python script, using the mutagen module, to identify the albums with missing tracks.
The script scans all the original .flac and .mp3 files on my PC, extracts the album name from each file's tags, and counts the tracks for each album. Then it scans the transcoded .mp3 files and compares the results. Any album found among the original files that either doesn't exist among the transcoded files or has a different track count is reported. I'm running the script with Python 3.7, and I used PyCharm to write and debug it. If you use this script to audit your own files, you should only need to change the folder paths in the main() function at the bottom.
TranscodingAudit.py:
import fnmatch
import os

import mutagen

ALBUM_KEY = 'ALBUM'
WILDCARD_KEY = 'WILDCARD'


def extract_metadata(root, extraction_configs):
    """Walk root and return a dict mapping album name to track count."""
    counts = dict()
    for folder, dirs, files in os.walk(root):
        for extraction_config in extraction_configs:
            matching_files = fnmatch.filter(files, extraction_config[WILDCARD_KEY])
            for media_file in matching_files:
                media_file_path = os.path.join(folder, media_file)
                metadata = mutagen.File(media_file_path)
                # mutagen.File() returns None for files it cannot identify, and
                # a file may have no tags or no album tag. Report and skip those
                # files instead of crashing.
                try:
                    album = (metadata.tags[extraction_config[ALBUM_KEY]][0]).strip()
                except (AttributeError, TypeError, KeyError, IndexError):
                    print('NO ALBUM TAG: {}'.format(media_file_path))
                    continue
                counts[album] = counts.get(album, 0) + 1
    return counts


def audit(src_counts, dest_counts):
    """Report source albums that are missing or miscounted in the destination."""
    matches = 0
    mismatches = 0
    not_found = 0
    for album in src_counts:
        if album in dest_counts:
            src_count = src_counts[album]
            dest_count = dest_counts[album]
            if src_count == dest_count:
                matches += 1
            else:
                mismatches += 1
                print('ALBUM {} INCORRECT COUNT Source Count: {} Dest Count: {}'
                      .format(album, src_count, dest_count))
        else:
            not_found += 1
            print('ALBUM NOT FOUND IN DESTINATION: {}'.format(album))
    # Mismatches are counted separately so they do not include "not found" albums.
    print('Total Albums: {} Matches: {} Mis-Matches: {} Not Found: {}'
          .format(len(src_counts), matches, mismatches, not_found))


def main():
    # Album tag name and filename pattern for each file type.
    mp3_config = dict()
    mp3_config[ALBUM_KEY] = 'TALB'
    mp3_config[WILDCARD_KEY] = '*.mp3'
    flac_config = dict()
    flac_config[ALBUM_KEY] = 'ALBUM'
    flac_config[WILDCARD_KEY] = '*.flac'
    # Change these two folder paths to audit your own files.
    src_counts = extract_metadata('D:\\Media\\Music', [flac_config, mp3_config])
    dest_counts = extract_metadata('H:\\Sync\\MP3s', [mp3_config])
    audit(src_counts, dest_counts)


if __name__ == '__main__':
    main()
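To try the comparison step without touching any audio files, here is a minimal sketch that works on plain dictionaries of album counts. It is not part of TranscodingAudit.py: the function name compare_counts and the album names are made up for illustration, and it returns its findings instead of printing them.

```python
def compare_counts(src_counts, dest_counts):
    """Compare album track counts (hypothetical helper, not in the script).

    Returns a tuple (missing, mismatched):
      missing    - sorted list of albums in src_counts but not in dest_counts
      mismatched - dict of album -> (source count, destination count)
    """
    missing = sorted(album for album in src_counts if album not in dest_counts)
    mismatched = {
        album: (count, dest_counts[album])
        for album, count in src_counts.items()
        if album in dest_counts and dest_counts[album] != count
    }
    return missing, mismatched


# Made-up album names and counts, for illustration only.
source = {'Album A': 12, 'Album B': 9, 'Album C': 11}
destination = {'Album A': 12, 'Album B': 7}

missing, mismatched = compare_counts(source, destination)
print(missing)      # ['Album C']
print(mismatched)   # {'Album B': (9, 7)}
```

Returning the results rather than printing them makes it easy to check the logic on a few hand-written counts before running the full scan over thousands of files.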
The script solved the mystery. The missing 20 tracks were from two CDs for which I had forgotten to create a playlist.