AMC - Make FileBot not output potential duplicate files

Posted: 18 May 2016, 18:52
by qvazzler
FileBot is great.
Unfortunately Flexget, which I'm using, has no built-in support for season pack downloading.

I work around this by applying really ugly hacks to Flexget.
It works, but sometimes a season pack gets downloaded even though episodes from that season have already been downloaded.

I modified the naming standard to include release group, because it's easier to find subtitles that way.

Code:

A Show - S01E01 - Pilot - COOLGROUP.mkv

The problem is that if a season pack is downloaded, it usually has a different release group. The output folder ends up with:

Code:

A Show - S01E01 - Pilot - COOLGROUP.mkv
A Show - S01E01 - Pilot - OTHERCOOLGROUP.mkv

I don't think I could get away with removing the release group. It would be too hard to find the correct subtitles when FileBot fails to get them (really new releases etc.).

Would it somehow be possible to let FileBot look at what's already in the output directory and skip a file if that episode already exists there?
I know this is a long shot, but I'm still checking.

Flexget does not have physical access to the directories, so it wouldn't be able to even attempt to check for potential duplicates.

If this fails, I will probably turn to Kodi to let it find and erase duplicates somehow.

Re: AMC - Make FileBot not output potential duplicate files

Posted: 18 May 2016, 19:26
by rednoah
1. FileBot stores the original filename as xattr, so you can always check it later. Also, if you use FileBot to fetch subtitles, it will take the original filename into account.

2. The amc script does not deal with "semantic" duplicates. FileBot stores metadata as xattr, so it would be trivial to find duplicates that way, and deal with them. Some scripting required.

Re: AMC - Make FileBot not output potential duplicate files

Posted: 18 May 2016, 22:16
by qvazzler
Can I keep using your amc that you continuously update or will I have to make my own offline version?

Re: AMC - Make FileBot not output potential duplicate files

Posted: 19 May 2016, 05:38
by rednoah
Did I suggest making any changes to the amc script? (Hint: I didn't) If not, then you don't need a local fork. :D

I'm saying that keeping the group in the filename doesn't make a difference. FileBot will find the same subtitles either way; it just helps you confirm that the subtitles FileBot selects are in fact the ones that match your files best. :lol:

Re: AMC - Make FileBot not output potential duplicate files

Posted: 19 May 2016, 16:25
by qvazzler
rednoah wrote: FileBot will find the same subtitles either way
I still need the ability to download subtitles manually in Kodi. FileBot is without a doubt great at getting subtitles, but sometimes I need Chinese ones, and sometimes the episodes are quite new. Sometimes it also doesn't work, but that is on me, not on FileBot.

What I have an express need for is for FileBot to not output any file if there is already one present for the same show, season and episode.

I basically do not want the below to happen.

Code:

A Show - S01E01 - Pilot - COOLGROUP.mkv
A Show - S01E01 - Pilot - OTHERCOOLGROUP.mkv

Since I cannot do anything about the origin of the issue (FlexGet not parsing season packs with regular downloads), it would be great if FileBot could somehow detect that there's already "A Show - S01E01 - Pilot" no matter what the group name is, and simply not output the duplicate to that folder (with the different group).
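
The check described above can be sketched outside FileBot as a small pre-filter. This is not an amc script feature; it is a minimal standalone Python sketch that assumes the naming scheme from this thread (`Show - S01E01 - Title - GROUP.ext`). The names `episode_key` and `already_present`, and the list of video extensions, are made up for illustration.

```python
import re
from pathlib import Path

# Matches "<show> - S01E01" at the start of a name.
# Assumption: files follow the naming scheme shown in this thread.
EPISODE_RE = re.compile(
    r"^(?P<show>.+?) - S(?P<season>\d+)E(?P<episode>\d+)\b", re.IGNORECASE
)

# Assumption: extend this set to whatever containers you actually use.
VIDEO_EXTENSIONS = {".mkv", ".mp4", ".avi"}


def episode_key(filename):
    """Return (show, season, episode) for a file name, or None if it doesn't match."""
    match = EPISODE_RE.match(Path(filename).stem)
    if match is None:
        return None
    return (match["show"].strip().lower(), int(match["season"]), int(match["episode"]))


def already_present(candidate_name, output_dir):
    """True if output_dir (searched recursively) holds a video for the same episode."""
    key = episode_key(candidate_name)
    if key is None:
        return False  # unrecognized naming: don't block the file
    for path in Path(output_dir).rglob("*"):
        if not path.is_file() or path.suffix.lower() not in VIDEO_EXTENSIONS:
            continue
        if episode_key(path.name) == key:
            return True
    return False
```

The release group never enters the key, so `COOLGROUP` and `OTHERCOOLGROUP` copies of the same episode compare as equal. Something like this would have to run on the machine that can see the output folder, before the new file is handed to FileBot.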

Re: AMC - Make FileBot not output potential duplicate files

Posted: 19 May 2016, 17:45
by rednoah
1.
qvazzler wrote: What I have an express need for is for FileBot to not output any file if there is already one present for the same show, season and episode.
This does come up once in a while in one form or another. It's the same "issue" if just the extension is different. However, the amc script doesn't facilitate anything like that at this point. I'd recommend writing a custom script that detects and deletes duplicates every once in a while.

e.g. list files per unique movie:

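Code:

// group video files by the metadata FileBot stored for them (xattr)
args.getFiles{ it.isVideo() }.groupBy{ it.metadata }.each{ m, fs ->
	// m is the shared metadata, fs the list of files that carry it
	println "$m => $fs.name"
}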

2. This is a good tutorial for automating subtitles in addition to the amc script:
viewtopic.php?f=13&t=3599

If you want subtitles, then it's highly recommended that you use the suball script and run it on a schedule once a day in addition to the amc script.

Re: AMC - Make FileBot not output potential duplicate files

Posted: 20 May 2016, 07:36
by rednoah
I've added a script for that:
https://github.com/filebot/scripts/blob ... tes.groovy

Try this:

Code:

filebot -script dev:duplicates /path/to/media

And if it looks good you can add --action delete to delete the lowest-order duplicates.

Re: AMC - Make FileBot not output potential duplicate files

Posted: 20 May 2016, 18:01
by qvazzler
Awesome, will try when I get home! Is it recursive?

As in, can I point it at Disk1/TV Shows and expect it to check inside ShowName/Season ##?

Re: AMC - Make FileBot not output potential duplicate files

Posted: 20 May 2016, 18:40
by rednoah
qvazzler wrote: Awesome, will try when I get home! Is it recursive?
Untested, but I think so. It doesn't care if your "duplicates" are in the same folder or not. It'll only work on files that have been xattr tagged by FileBot though.

Re: AMC - Make FileBot not output potential duplicate files

Posted: 21 May 2016, 04:57
by Achandab
Script not found? Error below:

/var$ filebot -script dev:duplicates /volume1/media/movies
FileNotFoundException: https://raw.githubusercontent.com/fileb ... tes.groovy
java.io.FileNotFoundException: https://raw.githubusercontent.com/fileb ... tes.groovy
at net.filebot.web.WebRequest.fetch(WebRequest.java:123)
at net.filebot.web.WebRequest.fetchIfModified(WebRequest.java:101)
at net.filebot.web.CachedResource.fetchData(CachedResource.java:28)
at net.filebot.web.CachedResource.fetchData(CachedResource.java:11)
at net.filebot.web.AbstractCachedResource.fetch(AbstractCachedResource.java:137)
at net.filebot.web.AbstractCachedResource.get(AbstractCachedResource.java:82)
at net.filebot.cli.ArgumentProcessor$DefaultScriptProvider.fetchScript(ArgumentProcessor.java:210)
at net.filebot.cli.ScriptShell.runScript(ScriptShell.java:82)
at net.filebot.cli.ArgumentProcessor.process(ArgumentProcessor.java:116)
at net.filebot.Main.main(Main.java:169)
Failure (°_°)

Re: AMC - Make FileBot not output potential duplicate files

Posted: 21 May 2016, 05:44
by rednoah
Using the latest version of FileBot at all times is most highly recommended. ;)

Re: AMC - Make FileBot not output potential duplicate files

Posted: 21 May 2016, 14:14
by Achandab
Using 4.6.1 on a Synology NAS. No update available?

Do I need to install the update manually?

Re: AMC - Make FileBot not output potential duplicate files

Posted: 21 May 2016, 14:25
by Achandab
Okay, just realized you have a new repository. Updating now. Thanks!

Re: AMC - Make FileBot not output potential duplicate files

Posted: 21 May 2016, 15:08
by Achandab
Okay, it says done after a few seconds, but nothing is actually deleted.

Re: AMC - Make FileBot not output potential duplicate files

Posted: 21 May 2016, 22:10
by rednoah
Command? Output?

Re: AMC - Make FileBot not output potential duplicate files

Posted: 22 May 2016, 01:57
by Achandab
admin@AhmedsMedia:/var$ filebot -script dev:duplicates --action delete /ahmedsmedia/Media/Movies/
Done ヾ(@⌒ー⌒@)ノ

Re: AMC - Make FileBot not output potential duplicate files

Posted: 22 May 2016, 04:31
by rednoah
I guess none of the files have xattr then. The script only works on files that were renamed with FileBot and had xattr metadata set at that time. Of course, that won't have happened if the filesystem doesn't support xattr in the first place. Check what FileBot reports:

Code:

filebot -script fn:sysinfo | grep "Extended Attributes"
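
To look at a single file rather than the whole system, its extended attributes can be listed directly. A minimal Python sketch, Linux-only since `os.listxattr` isn't available on every platform. The prefix `user.net.filebot.` is an assumption about how FileBot names its xattr keys on Linux, so verify it against a file you know FileBot renamed. `filebot_xattr_names` is a made-up helper name.

```python
import os

# Assumption: FileBot stores its xattr keys under this prefix on Linux.
FILEBOT_PREFIX = "user.net.filebot."


def filebot_xattr_names(path, lister=None):
    """Return the xattr names on `path` that start with FILEBOT_PREFIX.

    `lister` defaults to os.listxattr (Linux only); pass a stand-in
    callable to try the function on other platforms.
    """
    if lister is None:
        lister = getattr(os, "listxattr", None)
        if lister is None:
            raise NotImplementedError("os.listxattr is not available on this platform")
    try:
        names = lister(path)
    except OSError:
        # e.g. the filesystem doesn't support xattr, or the file is missing
        return []
    return [name for name in names if name.startswith(FILEBOT_PREFIX)]
```

An empty list for a file in the media folder would suggest the xattr metadata was never written, which would fit a run that reports done without deleting anything.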

Re: AMC - Make FileBot not output potential duplicate files

Posted: 22 May 2016, 08:07
by Achandab
Oh I see, maybe they are being renamed by CouchPotato then...

Re: AMC - Make FileBot not output potential duplicate files

Posted: 22 May 2016, 08:13
by rednoah
You could easily modify the script and have it group-by filename-without-release-group or something like that. But that's not a use case I'll want to maintain. :lol:
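
A rough sketch of that idea, written as standalone Python rather than as a change to the FileBot duplicates script. It assumes the release group is the last ` - `-separated part of the name, as in the naming scheme earlier in this thread; names that carry no release group at all would lose their last part instead, so treat the result as a list of candidates. `strip_release_group` and `find_duplicates` are hypothetical helper names, and the extension set is an assumption.

```python
from collections import defaultdict
from pathlib import Path

# Assumption: extend this set to whatever containers you actually use.
VIDEO_EXTENSIONS = {".mkv", ".mp4", ".avi"}


def strip_release_group(filename):
    """Drop the extension and the trailing ' - GROUP' part, lower-cased for comparison."""
    stem = Path(filename).stem
    base, sep, _group = stem.rpartition(" - ")
    # If there is no ' - ' separator, keep the whole stem.
    return (base if sep else stem).lower()


def find_duplicates(root):
    """Map each group-less name to its files, keeping only names with two or more files."""
    groups = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() in VIDEO_EXTENSIONS:
            groups[strip_release_group(path.name)].append(path)
    return {name: files for name, files in groups.items() if len(files) > 1}
```

This only reports candidates; deciding which copy to keep (and deleting the rest) is left out on purpose, since a filename-based match is much less reliable than the xattr metadata the FileBot script relies on.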

Re: AMC - Make FileBot not output potential duplicate files

Posted: 22 May 2016, 11:51
by Achandab
Haha, I'll just use an application and run it every now and then. No big deal.

Re: AMC - Make FileBot not output potential duplicate files

Posted: 19 Mar 2024, 18:02
by rednoah