Detecting duplicate MP3 files with different bitrates and/or different ID3 tags?

2022-01-10 00:00:00 python file duplicates mp3 id3

Problem description

How can I detect (preferably with Python) duplicate MP3 files that may be encoded at different bitrates (but are the same song) and whose ID3 tags may be incorrect?

I know I can compute an MD5 checksum of the file contents, but that won't work across different bitrates. I also don't know whether ID3 tags affect the resulting MD5 checksum. Should I re-encode the MP3 files that have a different bitrate and then compute the checksum? What do you recommend?


Solution

This is exactly the problem that people at the old AudioScrobbler, and now at MusicBrainz, have been working on for a long time. For now, the Python project that can help with your task is Picard, which tags audio files (not only MPEG-1 Layer 3 files) with a GUID (actually, several of them); from then on, matching the tags is quite simple.
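
Once files carry such an identifier, the matching step can be sketched as a simple grouping. This is only an illustration: the function name `group_duplicates` and the `ids_by_path` mapping are hypothetical, and the sketch assumes the identifiers have already been read from the tags with whatever tag library you use.

```python
from collections import defaultdict


def group_duplicates(ids_by_path):
    """Group file paths that share the same identifier.

    ids_by_path: hypothetical mapping of file path -> identifier string
    (or None when the file has no identifier tag).
    Returns a list of sorted path lists, one per identifier that
    occurs on more than one file.
    """
    groups = defaultdict(list)
    for path, rec_id in ids_by_path.items():
        if rec_id:  # skip untagged files; they cannot be matched this way
            groups[rec_id].append(path)
    # Keep only identifiers that appear on at least two files.
    return [sorted(paths) for paths in groups.values() if len(paths) > 1]
```

Files without an identifier are skipped rather than treated as duplicates of one another.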

If you would rather do it as a project of your own, libofa might be of help.
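
On the checksum part of the question: ID3 tags are stored inside the file, so a checksum of the whole file does change when the tags change. A minimal sketch of hashing only the bytes between the tags follows. The helper names `strip_id3` and `audio_md5` are made up for illustration, and the code assumes the common layout of an ID3v2 tag at the start of the file and an ID3v1 tag in the last 128 bytes; other metadata (for example APE tags) is not handled. It does nothing for files encoded at different bitrates, which still need the fingerprinting approach above.

```python
import hashlib


def strip_id3(data: bytes) -> bytes:
    """Return data without a leading ID3v2 tag or a trailing ID3v1 tag."""
    start, end = 0, len(data)
    # ID3v2 header: "ID3", 2 version bytes, 1 flags byte, 4 synchsafe size bytes.
    if len(data) >= 10 and data[:3] == b"ID3":
        size = 0
        for b in data[6:10]:
            size = (size << 7) | (b & 0x7F)  # 7 useful bits per byte
        start = 10 + size
        if data[5] & 0x10:  # ID3v2.4 footer flag adds another 10 bytes
            start += 10
        start = min(start, end)  # guard against a truncated file
    # ID3v1: fixed 128-byte block at the end, starting with "TAG".
    if end - start >= 128 and data[end - 128:end - 125] == b"TAG":
        end -= 128
    return data[start:end]


def audio_md5(data: bytes) -> str:
    """MD5 of the file contents with ID3v1/ID3v2 tags removed."""
    return hashlib.md5(strip_id3(data)).hexdigest()
```

Two copies of the same encoded file that differ only in their ID3 tags should then produce the same digest, e.g. `audio_md5(open(path, "rb").read())` for each file.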
