Find Duplicate Movie or Episode Files

Posted: 26 Jul 2016, 12:58
by rednoah
Assuming that all files have been processed with FileBot, xattr will contain lots of useful metadata that makes finding duplicate movies or episodes trivial.

e.g. list duplicates

Code: Select all

filebot -script fn:duplicates /path/to/files
OUTPUT:

Code: Select all

[*] Avatar (2009)
[+] 1. Avatar.1080p.mp4
[-] 2. Avatar.720p.mp4

e.g. print the metadata object for all video files and count occurrences:

Code: Select all

filebot -mediainfo -r /path/to/files --format "{xattr}" --filter "file.video" | sort | uniq -c | sort -n
OUTPUT:

Code: Select all

      2 Avatar (2009)

Re: Find Duplicate Movie or Episode Files

Posted: 28 Jul 2016, 07:32
by CooLTanG
This technically only finds exact duplicates, right? Not repacks of the same episode, nor the same movie in different formats? I guess sort and uniq don't work on Windows.

Re: Find Duplicate Movie or Episode Files

Posted: 28 Jul 2016, 08:24
by rednoah
It'll find Movie/Episode duplicates, i.e. files tagged as the same movie or episode, whatever the format or release.

e.g.

Code: Select all

filebot -mediainfo -r . --format "{xattr}" --filter "file.video"

Code: Select all

Alias - 1x01 - Truth Be Told
Avatar (2009)
Avatar (2009)
sort | uniq -c counts how many times each line occurs:

Code: Select all

   1 Alias - 1x01 - Truth Be Told
   2 Avatar (2009)
There are 2 files with the same xattr metadata object, so we've found a duplicate.
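
If only the duplicated entries matter and not the counts, uniq -d can take the place of uniq -c. A minimal sketch, assuming a POSIX shell; printf stands in for the filebot -mediainfo output shown above, and the function name list_duplicates is made up for illustration:

```shell
# list_duplicates: read one metadata label per line on stdin and
# print each label that occurs more than once (each printed once).
list_duplicates() {
  sort | uniq -d
}

# printf stands in for:
#   filebot -mediainfo -r . --format "{xattr}" --filter "file.video"
printf '%s\n' \
  'Alias - 1x01 - Truth Be Told' \
  'Avatar (2009)' \
  'Avatar (2009)' \
  | list_duplicates
# prints: Avatar (2009)
```

In real use the filebot command would be piped straight into list_duplicates in place of printf.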


The fn:duplicates script will do pretty much exactly that:
https://github.com/filebot/scripts/blob ... tes.groovy

Re: Find Duplicate Movie or Episode Files

Posted: 28 Jul 2016, 08:41
by CooLTanG
alright thanks, I'll expand from the duplicates script.

Re: Find Duplicate Movie or Episode Files

Posted: 16 Oct 2016, 09:19
by CooLTanG
hate to drag up my old post again, but I was wondering why it fails on the root of the drive?

aka:
-r e:\movies\ (works perfectly)
-r e:\ (fails to produce any results)

(doing a recursive search with -r)
(oh, and e:/ still fails, in case the backslash is the problem)

Re: Find Duplicate Movie or Episode Files

Posted: 16 Oct 2016, 10:24
by rednoah
:idea: The duplicate script will ignore files that are not xattr tagged.

Look at the xattr for the files on each drive: viewtopic.php?f=4&t=5#p5394

Re: Find Duplicate Movie or Episode Files

Posted: 16 Oct 2016, 21:24
by CooLTanG
Yeah, I understand that, but shouldn't it still recurse from E:\ into E:\movies, which contains the files with the xattr data for each movie? I'm assuming the folders themselves don't carry any xattr data.


Cuz this produces nothing:
filebot -script fn:duplicates -r e:/

And this works perfectly:
filebot -script fn:duplicates -r e:/movies/

Re: Find Duplicate Movie or Episode Files

Posted: 16 Oct 2016, 21:42
by rednoah
Windows issues. CMD issues. Permission issues. It could be anything, but it's nothing FileBot can do anything about.

Use simple commands like this to see what FileBot sees:

Code: Select all

filebot -mediainfo /path --format "{f}"

Re: Find Duplicate Movie or Episode Files

Posted: 16 Oct 2016, 21:56
by CooLTanG
filebot -mediainfo e:\ --format "{f}"

Shows all the files in the main folder

filebot -mediainfo -r e:\ --format "{f}"

Shows nothing, is there something wrong with the recursive flag?

Re: Find Duplicate Movie or Episode Files

Posted: 17 Oct 2016, 06:37
by rednoah
Try the latest revision and see if the behaviour is the same:
viewtopic.php?f=7&t=1609

Are there any notable differences between the drive where -r works and the one where it does not? Is movies a hidden folder?

Re: Find Duplicate Movie or Episode Files

Posted: 17 Oct 2016, 09:15
by CooLTanG
With the latest version, the -mediainfo command seems to work, even on the root of the drive.

However, the duplicates script does not work at all now, neither on the root nor on the Movies folder.

i.e.:
filebot -script fn:duplicates -r --action test "F:\"
or
filebot -script fn:duplicates -r --action test "F:\Movies\"

Re: Find Duplicate Movie or Episode Files

Posted: 17 Oct 2016, 09:45
by rednoah
Output?

Code: Select all

filebot -script fn:xattr "F:"
Output?

Code: Select all

filebot -script fn:duplicates "F:"

Re: Find Duplicate Movie or Episode Files

Posted: 17 Oct 2016, 20:19
by CooLTanG
Those both work. I guess it wasn't the -r flag after all; it was --action test. It also doesn't like "F:/" or "F:\", only "F:".

Both also recurse with or without the -r flag.

Thanks for getting it all worked out, I'm going to fix that up in my duplicates program for it to run again.

Re: Find Duplicate Movie or Episode Files

Posted: 17 Oct 2016, 20:40
by rednoah
--action test disables xattr, history, etc., so you can't use it in combination with the duplicates / xattr scripts, which rely on these features.

Re: Find Duplicate Movie or Episode Files

Posted: 24 Oct 2016, 09:42
by sabinder62
Hi, I have just started using the fn:duplicates script as I usually end up with a 720p and 1080p version of my TV episodes. These episodes are renamed using amc with the following command:

Code: Select all

filebot -script fn:amc --output "/mnt/Files/Transmission/Processed" --action copy --conflict override -non-strict "/mnt/Files/Transmission/Complete" --log-file amc.log --def excludeList=amc.txt clean=y xbmc=localhost unsorted=n artwork=n subtitles=en deleteAfterExtract=y storeReport=y "seriesFormat=/mnt/TV/{n}/Season {s}/{n} - {sxe} - {t} ({vf})" "movieFormat=/mnt/Movies/{n} ({y})/{n} ({y}) ({vf})"
As you can see, the resulting episodes end up with the quality (via {vf}) appended to the end, e.g.:

/mnt/TV/Firefly/Season 1/Firefly - 1x01 - Unchartered (720p).mp4
/mnt/TV/Firefly/Season 1/Firefly - 1x01 - Unchartered (1080p).mkv

I was hoping to run the duplicates script a couple of times a week to remove the 720p version from my TV folders. However, whenever I run it on a folder containing duplicates, the script always seems to pick the 1080p version for deletion, or whichever comes last, rather than keeping the higher quality. I understand it works off xattr, but when I run the command below, I don't see video quality mentioned in the output, so I'm unsure how to apply a filter to get this right:

Code: Select all

filebot -script fn:xattr /mnt/TV/Modern\ Family/
OUTPUT:

Code: Select all

/mnt/TV/Modern Family/Season 8/Modern Family - 8x03 - Blindsided (1080p).mkv
	net.filebot.filename: modern.family.s08e03.blindsided.1080p.web.dl.6ch.hevc.x265.rmteam.mkv
	net.filebot.metadata: {"@type":"net.filebot.web.Episode","seriesName":"Modern Family","season":8,"episode":3,"title":"Blindsided","absolute":169,"special":null,"airdate":{"year":2016,"month":10,"day":5},"seriesInfo":{"database":"TheTVDB","order":"Airdate","language":"en","id":95011,"name":"Modern Family","aliasNames":["Moderni perhe","Współczesna rodzina","Modern Család","Американская семейка","משפחה מודרנית","モダン・ファミリー","Família Moderna","摩登家庭","Taková moderní rodinka","Moderna obitelj","모던 패밀리"],"certification":"TV-PG","startDate":{"year":2009,"month":9,"day":23},"genres":["Comedy"],"network":"ABC (US)","rating":8.7,"ratingCount":298,"runtime":25,"status":"Continuing"}}
/mnt/TV/Modern Family/Season 8/Modern Family - 8x03 - Blindsided (720p).mkv
	net.filebot.filename: Modern.Family.S08E03.720p.HDTV.x264-FLEET[eztv].mkv
	net.filebot.metadata: {"@type":"net.filebot.web.Episode","seriesName":"Modern Family","season":8,"episode":3,"title":"Blindsided","absolute":169,"special":null,"airdate":{"year":2016,"month":10,"day":5},"seriesInfo":{"database":"TheTVDB","order":"Airdate","language":"en","id":95011,"name":"Modern Family","aliasNames":["Moderni perhe","Współczesna rodzina","Modern Család","Американская семейка","משפחה מודרנית","モダン・ファミリー","Família Moderna","摩登家庭","Taková moderní rodinka","Moderna obitelj","모던 패밀리"],"certification":"TV-PG","startDate":{"year":2009,"month":9,"day":23},"genres":["Comedy"],"network":"ABC (US)","rating":8.7,"ratingCount":298,"runtime":25,"status":"Continuing"}}
This is the command I am running for the duplicates script, with its resulting output:

Code: Select all

filebot -script fn:duplicates /mnt/TV/Modern\ Family/
OUTPUT:

Code: Select all

[*] Modern Family - 8x03 - Blindsided
[+] 1. /mnt/TV/Modern Family/Season 8/Modern Family - 8x03 - Blindsided (720p).mkv
[-] 2. /mnt/TV/Modern Family/Season 8/Modern Family - 8x03 - Blindsided (1080p).mkv
As you can see, it's picking the 1080p for deletion, and I can confirm this is what it does when I add "--action delete" to the command. Any chance you can help me out?

Cheers

Re: Find Duplicate Movie or Episode Files

Posted: 24 Oct 2016, 10:11
by rednoah
Please try with 3 to 4 files that have the same Episode group. Maybe it's just sorting the wrong way around. :D

Re: Find Duplicate Movie or Episode Files

Posted: 24 Oct 2016, 10:31
by sabinder62
rednoah wrote:Please try with 3 to 4 files that have the same Episode group. Maybe it's just sorting the wrong way around. :D
Sorry, do you mean files of the same episode or of the same TV show? I never have more than two per episode, 720p and 1080p, as those are the only qualities that download. Here is the full output of my earlier Modern Family duplicate check:

Code: Select all

[*] Modern Family - 8x02 - A Stereotypical Day
[+] 1. /mnt/TV/Modern Family/Season 8/Modern Family - 8x02 - A Stereotypical Day (720p).mkv
[-] 2. /mnt/TV/Modern Family/Season 8/Modern Family - 8x02 - A Stereotypical Day (1080p).mkv
[*] Modern Family - 8x03 - Blindsided
[+] 1. /mnt/TV/Modern Family/Season 8/Modern Family - 8x03 - Blindsided (720p).mkv
[-] 2. /mnt/TV/Modern Family/Season 8/Modern Family - 8x03 - Blindsided (1080p).mkv
[*] Modern Family - 8x04 - Weathering Heights
[+] 1. /mnt/TV/Modern Family/Season 8/Modern Family - 8x04 - Weathering Heights (720p).mkv
[-] 2. /mnt/TV/Modern Family/Season 8/Modern Family - 8x04 - Weathering Heights (1080p).mkv
Done ヾ(@⌒ー⌒@)ノ

Re: Find Duplicate Movie or Episode Files

Posted: 24 Oct 2016, 15:43
by rednoah
Try with dev:duplicates and see if that works.

Re: Find Duplicate Movie or Episode Files

Posted: 24 Oct 2016, 15:52
by sabinder62
rednoah wrote:Try with dev:duplicates and see if that works.

Booya! You are a f***ing legend, as per usual! Thanks very much for working on that. Just out of curiosity, what does the script check for, file size and resolution or something else?

Latest Result:

Code: Select all

[*] Modern Family - 8x02 - A Stereotypical Day
[+] 1. /mnt/TV/Modern Family/Season 8/Modern Family - 8x02 - A Stereotypical Day (1080p).mkv
[-] 2. /mnt/TV/Modern Family/Season 8/Modern Family - 8x02 - A Stereotypical Day (720p).mkv
[*] Modern Family - 8x03 - Blindsided
[+] 1. /mnt/TV/Modern Family/Season 8/Modern Family - 8x03 - Blindsided (1080p).mkv
[-] 2. /mnt/TV/Modern Family/Season 8/Modern Family - 8x03 - Blindsided (720p).mkv
[*] Modern Family - 8x04 - Weathering Heights
[+] 1. /mnt/TV/Modern Family/Season 8/Modern Family - 8x04 - Weathering Heights (1080p).mkv
[-] 2. /mnt/TV/Modern Family/Season 8/Modern Family - 8x04 - Weathering Heights (720p).mkv
Done ヾ(@⌒ー⌒@)ノ

Re: Find Duplicate Movie or Episode Files

Posted: 24 Oct 2016, 16:10
by rednoah
Comparator chain as follows:
1. REPACK/PROPER (1 for match, 0 for no match)
2. Resolution (width * height or 0 if MediaInfo fails)
3. FileSize
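
To make the ordering concrete, here is a rough shell sketch of the same three-step comparison. It is not the script's actual Groovy code, just an illustration: the tab-separated record layout, the function name rank_candidates and the sample values are all assumptions made up for this example.

```shell
# rank_candidates: read tab-separated records on stdin
#   <repack 0|1> TAB <width*height> TAB <size in bytes> TAB <path>
# and print the paths best-first: REPACK/PROPER, then resolution, then file size.
rank_candidates() {
  sort -t "$(printf '\t')" -k1,1nr -k2,2nr -k3,3nr | cut -f4
}

# Hypothetical sample values; real ones would come from the file name,
# MediaInfo and the file system.
printf '%s\t%s\t%s\t%s\n' \
  0 921600  500000000 'Modern Family - 8x03 - Blindsided (720p).mkv' \
  0 2073600 900000000 'Modern Family - 8x03 - Blindsided (1080p).mkv' \
  | rank_candidates
```

The first path printed corresponds to the [+] entry (keep), and every later one to a [-] entry. Since neither sample is a REPACK/PROPER, the 1080p file (1920 * 1080 = 2073600 pixels) ranks ahead of the 720p file (1280 * 720 = 921600 pixels).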

Re: Find Duplicate Movie or Episode Files

Posted: 31 Oct 2016, 14:07
by sabinder62
Sounds like the perfect way to check.

So I hate to be that guy but it stopped working over the weekend and is now throwing this error:

Code: Select all

No such property: DESCENDING_ORDER for class: net.filebot.media.VideoQuality
groovy.lang.MissingPropertyException: No such property: DESCENDING_ORDER for class: net.filebot.media.VideoQuality
	at Script1$_run_closure3.doCall(Script1.groovy:11)
	at Script1.run(Script1.groovy:7)
	at net.filebot.cli.ScriptShell.evaluate(ScriptShell.java:62)
	at net.filebot.cli.ScriptShell.runScript(ScriptShell.java:72)
	at net.filebot.cli.ArgumentProcessor.runScript(ArgumentProcessor.java:113)
	at net.filebot.cli.ArgumentProcessor.run(ArgumentProcessor.java:28)
	at net.filebot.Main.main(Main.java:124)
Failure (°_°)
I thought it might be because you had deployed it to the release branch, so I tried:

Code: Select all

filebot -script fn:duplicates /mnt/TV
But that still seems to behave as before, deleting the higher-quality one instead.

Re: Find Duplicate Movie or Episode Files

Posted: 31 Oct 2016, 14:31
by rednoah
Looks like some of the latest changes aren't backwards compatible with the 4.7.2 release.

Unfortunately, for that particular script you'll need the latest build now: viewtopic.php?f=7&t=1609

Re: Find Duplicate Movie or Episode Files

Posted: 31 Oct 2016, 14:42
by sabinder62
rednoah wrote:Looks like some of the latest changes aren't backwards compatible with the 4.7.2 release.

Unfortunately, for that particular script you'll need the latest build now: viewtopic.php?f=7&t=1609
Yup, can confirm it's working in 4.7.3 b5. Thanks again!

Re: Find Duplicate Movie or Episode Files

Posted: 08 Dec 2016, 08:23
by hevalito
Hey guys, sorry, I'm a FileBot newb. I just found this thread and would like to "activate" this behaviour on my Synology NAS too, since I'm getting duplicate titles from different groups in my feed. I'd like FileBot to always take the "best" option and remove the others.
Let's say I already have "Movie X 1080p" and another "Movie X 1080p" appears in the downloads folder. How can I make sure it identifies the new one as a dupe and ideally removes it?
I'm using FileBot Node and run a Scheduler every 30 mins.

Thanks a lot for your help!

Re: Find Duplicate Movie or Episode Files

Posted: 08 Dec 2016, 15:31
by rednoah
Depending on the format you're using, --conflict auto could be a solution that allows you to skip/override files if the new file is better.

The duplicates script is another option to get rid of duplicates if these duplicates have been tagged by FileBot already and you have multiple files that are the same movie/episode but with different file paths. You'll need basic command-line skills to test and automate calls for the duplicates script.
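
As a starting point for testing and automating, a small wrapper could look like the sketch below. It only uses the calls already shown in this thread (filebot -script fn:duplicates, with or without --action delete). The function name dedupe, the FILEBOT variable and the example path are made up for illustration, and it assumes the files have already been xattr tagged by FileBot.

```shell
# FILEBOT can be overridden, e.g. FILEBOT=echo to preview the command line
# without running anything.
FILEBOT="${FILEBOT:-filebot}"

# dedupe DIR [delete]
#   Without "delete": only lists the duplicate groups ([+] keep, [-] candidate).
#   With "delete":    adds --action delete, as mentioned earlier in this thread.
dedupe() {
  dir="$1"
  mode="${2:-list}"
  if [ "$mode" = "delete" ]; then
    "$FILEBOT" -script fn:duplicates --action delete "$dir"
  else
    "$FILEBOT" -script fn:duplicates "$dir"
  fi
}

# Example (hypothetical path):
#   dedupe /mnt/TV          # review the list first
#   dedupe /mnt/TV delete   # then remove the [-] entries
```

Running the list mode first and checking the [+] / [-] output before scheduling the delete call is the safer order.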