Advice needed for best way to organize my media and remove duplicates

obi
Posts: 8
Joined: 16 Jan 2015, 06:19

Post by obi »

I have a lot of media spread over various HDDs, and a lot of it is duplicated. I want to consolidate it all, rename it properly, and get metadata using FileBot.

For the metadata/info/subtitles, my plan is to use the AMC script, which looks like the most powerful solution out there.

But before that, I want to remove duplicates. This will involve:

For movies:
- finding duplicate versions of a movie and choosing the one with the highest bitrate and the most audio channels, giving preference to HEVC/x265 over H.264, for instance
- I don't know if there is a way to detect extended editions/director's cuts; they usually have a longer runtime
- movies split into multiple parts (cd1, cd2) should be treated as a single entity
- special features/extras, if there are any, should be preserved
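
To make the movie criteria above concrete, here is a minimal sketch of how the ranking could look in Python. Everything in it is an assumption for illustration: the Version record, the CODEC_RANK table, the ordering (codec first, then bitrate, then audio channels) and the 5% runtime tolerance are all hypothetical choices, not something FileBot or plex_dupefinder does.

```python
from dataclasses import dataclass

# Hypothetical preference table: a higher number wins. Unknown codecs rank 0.
CODEC_RANK = {"hevc": 2, "h264": 1}


@dataclass(frozen=True)
class Version:
    path: str
    codec: str       # e.g. "hevc" or "h264"
    bitrate: int     # overall bit rate in bits per second
    channels: int    # audio channel count
    runtime: float   # duration in seconds


def quality_key(v: Version) -> tuple:
    # Assumed ordering: codec preference, then bitrate, then audio channels.
    return (CODEC_RANK.get(v.codec, 0), v.bitrate, v.channels)


def pick_best(versions: list[Version]) -> Version:
    """Return the version that ranks highest under quality_key."""
    return max(versions, key=quality_key)


def longer_cuts(versions: list[Version], tolerance: float = 0.05) -> list[Version]:
    """Versions noticeably longer than the shortest one (possible extended cuts)."""
    shortest = min(v.runtime for v in versions)
    return [v for v in versions if v.runtime > shortest * (1 + tolerance)]
```

Anything returned by longer_cuts would be worth checking by hand before deleting, since a longer runtime only hints at a different edition.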

For TV shows:
- same as above, but some folders may contain only some seasons/episodes, so that needs to be accounted for
- then pick the version of each episode/season with the highest quality, etc.
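
For the TV case, one way to handle folders that only hold some episodes is to compare per episode rather than per folder. The sketch below assumes all paths belong to a single show and that file names carry an SxxEyy pattern; the regex and the function names are hypothetical.

```python
import re
from collections import defaultdict

# Assumed naming pattern: S01E02 / s01e02 somewhere in the path.
EPISODE = re.compile(r"[Ss](\d{1,2})[Ee](\d{1,3})")


def group_by_episode(paths):
    """Map (season, episode) to every path that claims to be that episode."""
    groups = defaultdict(list)
    for path in paths:
        match = EPISODE.search(path)
        if match:
            key = (int(match.group(1)), int(match.group(2)))
            groups[key].append(path)
    return dict(groups)


def duplicates(groups):
    """Keep only the episodes that exist in more than one copy."""
    return {key: paths for key, paths in groups.items() if len(paths) > 1}
```

Episodes that exist only once simply drop out of duplicates(), so a folder with a partial season never causes the only copy of an episode to be flagged.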

I thought this would need to be done by writing my own script, but it's a daunting task and I don't know if I'm up for it. Then I thought I should use existing tools that already do a lot of the work, such as FileBot.

I also found this tool, https://github.com/l3uddz/plex_dupefinder, which seems quite promising. But I don't know if it will handle TV seasons. Has anyone used it?

What I want to avoid is querying online sources like TVDB and TMDB multiple times, which is what would happen if I add my media sources to Plex, for example. I'd like to make things faster and also be a good citizen and avoid load on these sites. All I really want to do is identify the movie/TV show.

At this point I am thinking of 2 options:

1. Add all the media to Plex. I see no way to stop it from fetching cover art, metadata, etc. Then use plex_dupefinder to delete duplicates after checking each one manually.

2. Use FileBot's -rename feature with a simple format like {plex}, which will hopefully just rename all files/folders. Does this also generate NFO files, and do they contain the bitrate/codec info you would get from running ffprobe? After this I will have to examine duplicates manually, and possibly write a script to read the bitrate from the NFO files, etc. I think this ends up doing most of the work that plex_dupefinder does, except that I will have more control. But it needs to be written and tested, which is the real blocker.
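
As a rough idea of what such a script could look like, here is a sketch that reads the values straight from ffprobe rather than from NFO files (whether the NFO files contain them is exactly the open question above). It assumes ffprobe is on the PATH; probe() and summarize() are hypothetical helper names, and the field names are the ones ffprobe emits in its JSON output.

```python
import json
import subprocess


def summarize(info: dict) -> dict:
    """Reduce parsed ffprobe JSON to the few fields used for comparing copies."""
    streams = info.get("streams", [])
    video = next((s for s in streams if s.get("codec_type") == "video"), {})
    audio = [s for s in streams if s.get("codec_type") == "audio"]
    fmt = info.get("format", {})
    return {
        "codec": video.get("codec_name"),
        # ffprobe reports bit_rate and duration as strings; missing values become 0.
        "bitrate": int(fmt.get("bit_rate", 0)),
        "channels": max((s.get("channels", 0) for s in audio), default=0),
        "duration": float(fmt.get("duration", 0.0)),
    }


def probe(path: str) -> dict:
    """Run ffprobe on one file and summarize its JSON output."""
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-print_format", "json",
         "-show_format", "-show_streams", path],
        capture_output=True, text=True, check=True,
    )
    return summarize(json.loads(result.stdout))
```

Keeping the parsing in summarize() separate from the subprocess call means the comparison logic can be tested on saved JSON without touching the actual media files.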

I'm sure this is not an isolated problem; there are lots of people with much bigger media collections. What is the best way to do this?
rednoah
The Source
Posts: 24009
Joined: 16 Nov 2011, 08:59
Location: Taipei

Re: Advice needed for best way to organize my media and remove duplicates

Post by rednoah »

obi wrote: 27 Jun 2024, 21:10 For the metadata/info/subtitles, my plan is to use the AMC script, which looks like the most powerful solution out there.
You'll definitely want to use the FileBot Desktop application, especially if it's a one-off task and you're using FileBot for the first time. There's nothing you cannot do in the FileBot Desktop application just as well or better.



obi wrote: 27 Jun 2024, 21:10 But before that, I wish to remove duplicates.
The Find Duplicate Movie or Episode Files script requires you to organize files first, and then delete duplicates (based on xattr metadata) in a second step.



obi wrote: 27 Jun 2024, 21:10 add all the media to Plex. I see no way to avoid it getting cover art, metadata etc
You indeed want to just let Plex independently do its thing. You do want to follow the How do I organize files for Plex? guide though (with a custom format that preserves video quality, edition, etc. to allow for duplicates / versions in the initial step), because the {tmdb-12345} ID marker that {plex.id} adds to the file path is extremely important for reliability (ID lookup instead of search) and speed (as search is skipped entirely).
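
Once the marker is in the file path, copies of the same title can also be grouped locally with no online lookup at all. A minimal sketch, assuming the marker appears literally as {tmdb-<digits>} somewhere in each path; the function name is hypothetical, and this is not how the Find Duplicate script works (that one reads xattr metadata).

```python
import re
from collections import defaultdict

# Matches the {tmdb-12345} style marker anywhere in a path.
TMDB_MARKER = re.compile(r"\{tmdb-(\d+)\}")


def group_by_tmdb_id(paths):
    """Map each TMDB ID to the paths carrying it; unmarked paths go under None."""
    groups = defaultdict(list)
    for path in paths:
        match = TMDB_MARKER.search(path)
        key = int(match.group(1)) if match else None
        groups[key].append(path)
    return dict(groups)
```

Any ID that maps to more than one path is a candidate set of duplicates or versions to review.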

:idea: As for artwork, Plex will want to do its own thing, so I'd recommend letting it do so for the sake of simplicity. FileBot can generate NFO files and artwork files for you, but if you want Plex to actually use those files, then that's a topic for the Plex forums.

:idea: General Advice: you'll want to run tests on a small, representative set of files first. You indeed want to use hardlinks so that your old structure stays untouched as you build a new one, without using extra disk space, and with the ability to just delete the new structure and start anew at any time.
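
To illustrate why hardlinks make this safe, here is a small Python sketch that mirrors a folder tree using hardlinks only. The function name is hypothetical and this is not what FileBot runs internally. Note that hardlinks cannot cross filesystems, so source and destination must be on the same drive/volume.

```python
import os
from pathlib import Path


def mirror_with_hardlinks(src_root, dst_root):
    """Recreate the folder tree of src_root under dst_root, hardlinking every file.

    Returns the number of links created. Existing targets are left alone.
    """
    src_root, dst_root = Path(src_root), Path(dst_root)
    linked = 0
    for dirpath, _dirnames, filenames in os.walk(src_root):
        rel = Path(dirpath).relative_to(src_root)
        (dst_root / rel).mkdir(parents=True, exist_ok=True)
        for name in filenames:
            target = dst_root / rel / name
            if not target.exists():
                # Both names now point at the same data; no extra space is used.
                os.link(Path(dirpath) / name, target)
                linked += 1
    return linked
```

Deleting the new tree only removes the extra names; the data stays reachable through the original paths until those are deleted too.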