Find Duplicate Movie or Episode Files

Running FileBot from the console, Groovy scripting, shell scripts, etc
rednoah
The Source
Posts: 22923
Joined: 16 Nov 2011, 08:59
Location: Taipei

Find Duplicate Movie or Episode Files

Post by rednoah »

Assuming that all files have been processed with FileBot, xattr will contain lots of useful metadata that makes finding duplicate movies or episodes trivial.

e.g. list duplicates

Code: Select all

filebot -script fn:duplicates /path/to/files
OUTPUT:

Code: Select all

[*] Avatar (2009)
[+] 1. Avatar.1080p.mp4
[-] 2. Avatar.720p.mp4

e.g. print the metadata object for all video files and count occurrences:

Code: Select all

filebot -mediainfo -r /path/to/files --format "{xattr}" --filter "file.video" | sort | uniq -c | sort -n
OUTPUT:

Code: Select all

      2 Avatar (2009)
:arrow: Script Repository / Find Duplicate Movie or Episode Files
:idea: Please read the FAQ and How to Request Help.
CooLTanG
Posts: 21
Joined: 26 Jul 2016, 09:26

Re: Find Duplicate Movie or Episode Files

Post by CooLTanG »

This technically only finds exact duplicates, right? Not repacks of the same episode, and not the same movie in different formats? I guess the sort and uniq commands don't work on Windows.

Post by rednoah »

It'll find Movie/Episode duplicates.

e.g.

Code: Select all

filebot -mediainfo -r . --format "{xattr}" --filter "file.video"

Code: Select all

Alias - 1x01 - Truth Be Told
Avatar (2009)
Avatar (2009)
uniq -c counts how often each unique line occurs:

Code: Select all

   1 Alias - 1x01 - Truth Be Told
   2 Avatar (2009)
There are 2 files with the same xattr metadata object, so we've found a duplicate.
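
As a minimal sketch of the counting step, the pipeline below narrows the uniq -c output down to entries that occur more than once. The sample lines stand in for the {xattr} listing (they are copied from the example in this post, not produced by running FileBot), and the awk filter is an addition that is not part of the original command.

```shell
#!/bin/sh
# Sample lines standing in for the output of:
#   filebot -mediainfo -r . --format "{xattr}" --filter "file.video"
listing='Alias - 1x01 - Truth Be Told
Avatar (2009)
Avatar (2009)'

# sort puts identical lines next to each other, uniq -c prefixes each
# distinct line with its count, and awk keeps only counts greater than 1.
duplicates=$(printf '%s\n' "$listing" | sort | uniq -c | awk '$1 > 1')

printf '%s\n' "$duplicates"
```

Only the Avatar (2009) line survives the filter, because it is the only entry with a count above 1.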


The fn:duplicates script will do pretty much exactly that:
https://github.com/filebot/scripts/blob ... tes.groovy

Post by CooLTanG »

Alright, thanks. I'll expand on the duplicates script.

Post by CooLTanG »

I hate to drag up my old post again, but I was wondering why it fails on the root of the drive?

i.e.:
-r e:\movies\ (works perfectly)
-r e:\ (fails to produce any results)

(doing a recursive search (-r))
(Oh, and e:/ still fails, in case the backslash is the problem.)

Post by rednoah »

:idea: The duplicate script will ignore files that are not xattr tagged.

Look at the xattr for the files on each drive: viewtopic.php?f=4&t=5#p5394

Post by CooLTanG »

Yeah, I understand that, but shouldn't it still recurse from E:\ into E:\movies, which contains the xattr data for each of the movies? I'm assuming the folders themselves don't contain any xattr data.

This produces nothing:
filebot -script fn:duplicates -r e:/

And this works perfectly:
filebot -script fn:duplicates -r e:/movies/

Post by rednoah »

It could be a Windows issue, a CMD issue, or a permission issue. It could be anything, but it's nothing FileBot can do anything about.

Use simple commands like this to see what FileBot sees:

Code: Select all

filebot -mediainfo /path --format "{f}"

Post by CooLTanG »

filebot -mediainfo e:\ --format "{f}"

Shows all the files in the main folder

filebot -mediainfo -r e:\ --format "{f}"

Shows nothing. Is there something wrong with the recursive flag?

Post by rednoah »

Try the latest revision and see if the behaviour is the same:
viewtopic.php?f=7&t=1609

Are there any notable differences between the drive where -r works and the one where it does not? Is movies a hidden folder?

Post by CooLTanG »

With the latest version, the -mediainfo flag seems to work, even on the root of the drive.

However, the duplicates script now does not work at all, neither on the root nor on the Movies folder.

i.e.:
filebot -script fn:duplicates -r --action test "F:\"
or
filebot -script fn:duplicates -r --action test "F:\Movies\"

Post by rednoah »

Output?

Code: Select all

filebot -script fn:xattr "F:"
Output?

Code: Select all

filebot -script fn:duplicates "F:"

Post by CooLTanG »

Those both work. I guess it wasn't the -r flag: it was --action test, and it doesn't like either "F:/" or "F:\", only "F:"

Both commands also recurse with or without the -r flag.

Thanks for getting it all worked out. I'm going to fix that in my duplicates program so that it runs again.

Post by rednoah »

--action test disables xattr, history, etc., so you can't use it in combination with the duplicates / xattr scripts that rely on these features.
sabinder62
Posts: 19
Joined: 24 Oct 2016, 08:56


Post by sabinder62 »

Hi, I have just started using the fn:duplicates script as I usually end up with a 720p and 1080p version of my TV episodes. These episodes are renamed using amc with the following command:

Code: Select all

filebot -script fn:amc --output "/mnt/Files/Transmission/Processed" --action copy --conflict override -non-strict "/mnt/Files/Transmission/Complete" --log-file amc.log --def excludeList=amc.txt clean=y xbmc=localhost unsorted=n artwork=n subtitles=en deleteAfterExtract=y storeReport=y "seriesFormat=/mnt/TV/{n}/Season {s}/{n} - {sxe} - {t} ({vf})" "movieFormat=/mnt/Movies/{n} ({y})/{n} ({y}) ({vf})"
As you can see, the resulting episodes end up with the quality (via {vf}) appended to the end, e.g.:

/mnt/TV/Firefly/Season 1/Firefly - 1x01 - Unchartered (720p).mp4
/mnt/TV/Firefly/Season 1/Firefly - 1x01 - Unchartered (1080p).mkv

I was hoping to use the duplicates script to remove the 720p version from my TV folders a couple of times a week. However, whenever I run it on a folder containing duplicates, the script always seems to delete the 1080p version (or whichever comes last) rather than keeping the higher quality. I understand it works off xattr metadata, but when I run the command below, I don't see video quality mentioned in the output, so I am unsure how to apply a filter to get this right:

Code: Select all

filebot -script fn:xattr /mnt/TV/Modern\ Family/
OUTPUT:

Code: Select all

/mnt/TV/Modern Family/Season 8/Modern Family - 8x03 - Blindsided (1080p).mkv
	net.filebot.filename: modern.family.s08e03.blindsided.1080p.web.dl.6ch.hevc.x265.rmteam.mkv
	net.filebot.metadata: {"@type":"net.filebot.web.Episode","seriesName":"Modern Family","season":8,"episode":3,"title":"Blindsided","absolute":169,"special":null,"airdate":{"year":2016,"month":10,"day":5},"seriesInfo":{"database":"TheTVDB","order":"Airdate","language":"en","id":95011,"name":"Modern Family","aliasNames":["Moderni perhe","Współczesna rodzina","Modern Család","Американская семейка","משפחה מודרנית","モダン・ファミリー","Família Moderna","摩登家庭","Taková moderní rodinka","Moderna obitelj","모던 패밀리"],"certification":"TV-PG","startDate":{"year":2009,"month":9,"day":23},"genres":["Comedy"],"network":"ABC (US)","rating":8.7,"ratingCount":298,"runtime":25,"status":"Continuing"}}
/mnt/TV/Modern Family/Season 8/Modern Family - 8x03 - Blindsided (720p).mkv
	net.filebot.filename: Modern.Family.S08E03.720p.HDTV.x264-FLEET[eztv].mkv
	net.filebot.metadata: {"@type":"net.filebot.web.Episode","seriesName":"Modern Family","season":8,"episode":3,"title":"Blindsided","absolute":169,"special":null,"airdate":{"year":2016,"month":10,"day":5},"seriesInfo":{"database":"TheTVDB","order":"Airdate","language":"en","id":95011,"name":"Modern Family","aliasNames":["Moderni perhe","Współczesna rodzina","Modern Család","Американская семейка","משפחה מודרנית","モダン・ファミリー","Família Moderna","摩登家庭","Taková moderní rodinka","Moderna obitelj","모던 패밀리"],"certification":"TV-PG","startDate":{"year":2009,"month":9,"day":23},"genres":["Comedy"],"network":"ABC (US)","rating":8.7,"ratingCount":298,"runtime":25,"status":"Continuing"}}
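
To illustrate why both files land in the same group, here is a rough sketch that derives a grouping key from the net.filebot.metadata value. The JSON strings are abbreviated copies of the output above. The json_field and episode_key helpers and the sed pattern are hypothetical, written only for this example, and are not how the duplicates script itself reads the metadata.

```shell
#!/bin/sh
# Abbreviated copies of the net.filebot.metadata values shown above.
# They are identical for both files: nothing in them describes video quality.
meta_1080p='{"@type":"net.filebot.web.Episode","seriesName":"Modern Family","season":8,"episode":3,"title":"Blindsided"}'
meta_720p='{"@type":"net.filebot.web.Episode","seriesName":"Modern Family","season":8,"episode":3,"title":"Blindsided"}'

# Hypothetical helper: extract one top-level field from a flat JSON string.
# Good enough for this sample; use a real JSON parser for anything serious.
json_field() {
    printf '%s\n' "$1" | sed -n "s/.*\"$2\":\"\{0,1\}\([^\",}]*\).*/\1/p"
}

# Hypothetical helper: build a key in the "Series - SxEE" style.
episode_key() {
    printf '%s - %sx%02d\n' \
        "$(json_field "$1" seriesName)" \
        "$(json_field "$1" season)" \
        "$(json_field "$1" episode)"
}

key_a=$(episode_key "$meta_1080p")
key_b=$(episode_key "$meta_720p")

if [ "$key_a" = "$key_b" ]; then
    echo "same group: $key_a"
fi
```

Since the key is built only from series, season and episode, the 720p and 1080p files are indistinguishable at this stage. Choosing between them has to rely on something other than the stored metadata.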
This is the command I am running for the duplicates script, with its resulting output:

Code: Select all

filebot -script fn:duplicates /mnt/TV/Modern\ Family/
OUTPUT:

Code: Select all

[*] Modern Family - 8x03 - Blindsided
[+] 1. /mnt/TV/Modern Family/Season 8/Modern Family - 8x03 - Blindsided (720p).mkv
[-] 2. /mnt/TV/Modern Family/Season 8/Modern Family - 8x03 - Blindsided (1080p).mkv
As you can see, it's picking the 1080p for deletion, and I can confirm this is what it does when I add "--action delete" to the command. Any chance you can help me out?

Cheers

Post by rednoah »

Please try with 3 to 4 files that have the same Episode group. Maybe it's just sorting the wrong way around. :D

Post by sabinder62 »

rednoah wrote: Please try with 3 to 4 files that have the same Episode group. Maybe it's just sorting the wrong way around. :D
Sorry, do you mean files of the same episode or of the same TV show? I never have more than two per episode, 720p and 1080p, as those are the only qualities that download. Here is the full output of my earlier Modern Family duplicate check:

Code: Select all

[*] Modern Family - 8x02 - A Stereotypical Day
[+] 1. /mnt/TV/Modern Family/Season 8/Modern Family - 8x02 - A Stereotypical Day (720p).mkv
[-] 2. /mnt/TV/Modern Family/Season 8/Modern Family - 8x02 - A Stereotypical Day (1080p).mkv
[*] Modern Family - 8x03 - Blindsided
[+] 1. /mnt/TV/Modern Family/Season 8/Modern Family - 8x03 - Blindsided (720p).mkv
[-] 2. /mnt/TV/Modern Family/Season 8/Modern Family - 8x03 - Blindsided (1080p).mkv
[*] Modern Family - 8x04 - Weathering Heights
[+] 1. /mnt/TV/Modern Family/Season 8/Modern Family - 8x04 - Weathering Heights (720p).mkv
[-] 2. /mnt/TV/Modern Family/Season 8/Modern Family - 8x04 - Weathering Heights (1080p).mkv
Done ヾ(@⌒ー⌒@)ノ

Post by rednoah »

Try with dev:duplicates and see if that works.

Post by sabinder62 »

rednoah wrote: Try with dev:duplicates and see if that works.

Booya! You are a f***ing legend, as per usual! Thanks very much for working on that. Just out of curiosity, what does the script check for, file size and resolution or something else?

Latest Result:

Code: Select all

[*] Modern Family - 8x02 - A Stereotypical Day
[+] 1. /mnt/TV/Modern Family/Season 8/Modern Family - 8x02 - A Stereotypical Day (1080p).mkv
[-] 2. /mnt/TV/Modern Family/Season 8/Modern Family - 8x02 - A Stereotypical Day (720p).mkv
[*] Modern Family - 8x03 - Blindsided
[+] 1. /mnt/TV/Modern Family/Season 8/Modern Family - 8x03 - Blindsided (1080p).mkv
[-] 2. /mnt/TV/Modern Family/Season 8/Modern Family - 8x03 - Blindsided (720p).mkv
[*] Modern Family - 8x04 - Weathering Heights
[+] 1. /mnt/TV/Modern Family/Season 8/Modern Family - 8x04 - Weathering Heights (1080p).mkv
[-] 2. /mnt/TV/Modern Family/Season 8/Modern Family - 8x04 - Weathering Heights (720p).mkv
Done ヾ(@⌒ー⌒@)ノ

Post by rednoah »

Comparator chain as follows:
1. REPACK/PROPER (1 for match, 0 for no match)
2. Resolution (width * height or 0 if MediaInfo fails)
3. FileSize
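
The chain above can be sketched as a multi-key sort. This is only an illustration of the ordering, not the script's actual code: the tab-separated columns (REPACK/PROPER flag, width * height, file size in bytes, file name) and all values are made up for the example.

```shell
#!/bin/sh
# Hypothetical candidates for one episode, tab-separated columns:
#   repack_flag  resolution (width * height)  size_bytes  file_name
candidates=$(printf '%s\t%s\t%s\t%s\n' \
    0 921600 700000000 'Episode (720p).mkv' \
    0 2073600 1500000000 'Episode (1080p).mkv' \
    1 921600 750000000 'Episode (720p.PROPER).mkv')

tab=$(printf '\t')

# Compare by column 1, then 2, then 3, each numerically and in descending
# order, so the best candidate ends up on the first line.
ranked=$(printf '%s\n' "$candidates" | sort -t "$tab" -k1,1nr -k2,2nr -k3,3nr)

best=$(printf '%s\n' "$ranked" | head -n 1 | cut -f 4)

# Mark the first entry as kept and the rest as duplicates.
printf '%s\n' "$ranked" | awk -F '\t' '{ printf "[%s] %d. %s\n", (NR == 1 ? "+" : "-"), NR, $4 }'
```

If the chain is applied in the order listed, a REPACK/PROPER match is decided before resolution, so in this made-up data the 720p PROPER file ranks ahead of the 1080p file. File size only matters when the first two comparisons tie.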

Post by sabinder62 »

Sounds like the perfect way to check.

So I hate to be that guy but it stopped working over the weekend and is now throwing this error:

Code: Select all

No such property: DESCENDING_ORDER for class: net.filebot.media.VideoQuality
groovy.lang.MissingPropertyException: No such property: DESCENDING_ORDER for class: net.filebot.media.VideoQuality
	at Script1$_run_closure3.doCall(Script1.groovy:11)
	at Script1.run(Script1.groovy:7)
	at net.filebot.cli.ScriptShell.evaluate(ScriptShell.java:62)
	at net.filebot.cli.ScriptShell.runScript(ScriptShell.java:72)
	at net.filebot.cli.ArgumentProcessor.runScript(ArgumentProcessor.java:113)
	at net.filebot.cli.ArgumentProcessor.run(ArgumentProcessor.java:28)
	at net.filebot.Main.main(Main.java:124)
Failure (°_°)
I thought it might be because you had deployed to the release branch, so I tried:

Code: Select all

filebot -script fn:duplicates /mnt/TV
But that still seems to be behaving as always, deleting the lower quality one instead.

Post by rednoah »

Looks like some of the latest changes aren't backwards compatible with the 4.7.2 release.

Unfortunately, for that particular script you'll need the latest build now: viewtopic.php?f=7&t=1609

Post by sabinder62 »

rednoah wrote: Looks like some of the latest changes aren't backwards compatible with the 4.7.2 release.

Unfortunately, for that particular script you'll need the latest build now: viewtopic.php?f=7&t=1609
Yup, can confirm it's working in 4.7.3 b5. Thanks again!
hevalito
Posts: 4
Joined: 08 Dec 2016, 08:20


Post by hevalito »

Hey guys, sorry, I'm a FileBot newb. I just found this thread and I would like to "activate" this behaviour on my Synology NAS too, since I'm getting duplicate titles from different groups in my feed. I'd like FileBot to always take the "best" option and remove the others.
Let's say I already have "Movie X 1080p" and another "Movie X 1080p" appears in the downloads folder. How can I make sure it identifies the new one as a dupe and ideally removes it?
I'm using FileBot Node and run a Scheduler every 30 mins.

Thanks a lot for your help!

Post by rednoah »

Depending on the format you're using, --conflict auto could be a solution that allows you to skip/override files if the new file is better.

The duplicates script is another option to get rid of duplicates if these duplicates have been tagged by FileBot already and you have multiple files that are the same movie/episode but with different file paths. You'll need basic command-line skills to test and automate calls for the duplicates script.
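
A minimal sketch of such an automated call, assuming a POSIX shell is available on the NAS. The dedupe_dir function, the FILEBOT variable and the example paths are hypothetical; only -script fn:duplicates and --action delete come from this thread. Run it in list mode first and check the [+] / [-] listing before letting it delete anything.

```shell
#!/bin/sh
# Hypothetical wrapper around the duplicates script for a scheduled job.
# Usage: dedupe_dir DIR [delete]
#   FILEBOT may be set to override the command that is run (default: filebot).
dedupe_dir() {
    dir=$1
    mode=${2:-list}
    if [ "$mode" = "delete" ]; then
        # Remove the files that the script marks as duplicates.
        "${FILEBOT:-filebot}" -script fn:duplicates --action delete "$dir"
    else
        # Default: only list the duplicate groups, nothing is removed.
        "${FILEBOT:-filebot}" -script fn:duplicates "$dir"
    fi
}

# Example calls (the paths are assumptions):
#   dedupe_dir /volume1/Media/Movies
#   dedupe_dir /volume1/Media/Movies delete
```

Keeping deletion behind an explicit second argument means a scheduled run cannot remove files unless it was set up to do so on purpose.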